Wandering Thoughts archives

2013-01-22

Disaster recovery preparation is not the same as a DR plan

I think that the terminology in the general area of disaster recovery often gets confused, partly because a lot of people (myself included) sort of have their toes near the water without actually being specialists. Because some of my views about disaster recovery plans turn on what I consider a DR plan to involve, I want to write down how I use the terms.

(I've sort of mentioned them in passing a couple of times recently but I feel like making things explicit.)

So I'll start with my view of the terminology:

  • DR procedures are explicit, more or less step by step documentation that you could follow to bring your systems back up after a disaster. The archetype of DR procedures is big binders written up by big companies, where an unfamiliar person is supposed to be able to follow all of the steps to restore service. Sometimes the companies stage exercises to test this (often comedy ensues).

  • real DR plans are the high level versions of DR procedures, omitting all of the voluminous details and command lines in favour of general descriptions. A real DR plan is still a specific plan, but it needs to be carried out by you or another experienced local sysadmin who can fill in all of the blanks. As a specific plan, a real DR plan still details things like what will be restored or brought up where. You know how many servers (physical or virtual) you will use, located where, bought or rented with what money, connected to the network how, and so on.

  • I don't have a good name for the next step: call it an abstract or aspirational disaster recovery plan. In an abstract DR plan, you consider and document issues like what services are crucial (and what their dependencies are), options for how you might bring up partial services, how many machines you might need and where you might put them, and so on. However, you do not have specifics; you are just thinking ahead to what you'll probably need if a disaster strikes.

    This probably shades into (non-detailed) business continuity planning.

  • disaster recovery preparations are the general steps that you take to try to make sure you can recover from a disaster. Offsite backups, offsite copies of crucial systems or information, paper documentation of your systems, and writeups of what you would want to restore first in order to bootstrap your environment from the ground up are all disaster recovery preparations.
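
As an illustration of the 'what would you restore first' writeup, here is a minimal sketch in Python that turns a table of service dependencies into a bring-up order. Everything in it is an assumption made for the example: the service names and their dependencies are invented, and a real environment would have its own (probably messier) list. It uses the standard library's graphlib module, which needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical services for illustration only; each one maps to the
# services that must already be up before it can be restored.
DEPENDS_ON = {
    "dns": [],
    "dhcp": ["dns"],
    "fileserver": ["dns"],
    "auth": ["dns", "fileserver"],
    "mail": ["dns", "auth", "fileserver"],
    "web": ["dns", "auth", "fileserver"],
}


def restore_order(depends_on):
    """Return all services ordered so that dependencies come first."""
    # TopologicalSorter treats each dict value as the node's predecessors
    # and raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(depends_on).static_order())


def needed_for(targets, depends_on):
    """Return the targets plus everything they transitively depend on."""
    needed = set()
    stack = list(targets)
    while stack:
        service = stack.pop()
        if service not in needed:
            needed.add(service)
            stack.extend(depends_on[service])
    return needed


def partial_restore_order(targets, depends_on):
    """Return a bring-up order for only the crucial target services."""
    needed = needed_for(targets, depends_on)
    subset = {service: depends_on[service] for service in needed}
    return restore_order(subset)


if __name__ == "__main__":
    print("Full bring-up order:", restore_order(DEPENDS_ON))
    print("To get mail back:", partial_restore_order(["mail"], DEPENDS_ON))
```

The partial version is the more interesting one for this sort of writeup, since it answers 'what is the least I have to bring up to get this one crucial service back'. A circular dependency will make the sort fail, which is itself worth knowing about before a disaster rather than during one.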

My feeling is that some degree of DR preparation is easy and relatively general; you don't have to consider very many scenarios in order to set up things like offsite backups. I have mixed feelings about abstract DR plans, part of which boils down to 'documentation needs to be tested' (which is obviously hard for an abstract plan). Actual real and concrete DR plans have a lot of requirements that I think make them hard for many organizations; among other things, they need resources allocated in advance for the effort to be meaningful.

(Note that organizations that take their DR plans and procedures seriously do periodically run tests of them; you really have to, in order to be sure that the plans will work when they're really needed. If your organization 'takes DR seriously' but has never done or budgeted for such a test, you know how seriously it really takes this.)

I have a grumpy reaction to people who go on about how everyone should have DR plans or at least seriously consider DR issues because I think their efforts are driving people away from simple disaster recovery preparations. If you phrase things so that you bundle DR preparation into DR plans and then your description of what's involved in DR planning convinces people that it is too big (and too expensive) for their environment, you are not doing them any favours.

(Of course all of the consultants and DR firms and so on make almost all of their money from actual relatively concrete DR plans, not from simple DR preparation, so they have very little motivation to separate the two things and advise people to start with simple steps before going all the way to expensive DR activities. But I am starting to rant here.)

Sidebar: the cynical reason for DR plans to exist

Put simply, DR plans are a blame deflection method for when disaster strikes and things explode. If you ordered your underlings to prepare a DR plan and the DR plan fails, you can generally deflect the resulting blame onto your underlings. In the meantime you can assure the auditors (and your management, if any) that you've considered the issue and you have a plan, honest.

(As always and as before, an organization's actual priorities are shown by what it does, not by what it says that it wants.)

sysadmin/DisasterRecoveryPrepAndPlans written at 00:25:16

