Backups versus redundancy

May 25, 2009

By now, everyone knows that redundancy, for example RAIDed disks, is not the same thing as backups (or at least I hope they do). But I think it's worthwhile to talk about why, and thus the fundamental difference between the two.

(Note that there are a lot of ways of getting redundancy, far beyond mere RAID. If you are a large operation like Google or a bank, you're interested in redundancy across systems and even data centers.)

The two are quite similar in that both protect you against physical hardware failures, which are usually the major failure mode that people worry about. Yes, you can argue that redundancy commonly provides less protection against major disasters like fires, but this is not intrinsic to redundancy, just to how it's usually set up (and how backups are best set up); it's certainly possible to do things like have your RAID mirrors physically remote, or for that matter to put your backups on a shelf in the machine room. (Or in your office in the same building.)

The crucial difference is that backups provide history and redundancy does not. This means that while redundancy protects you against hardware failures, it does not help you against mistakes. To recover from a mistake, you need the ability to reach back in time to before the mistake was made; i.e., you need history.

(History, or the lack of it, is thus the dividing line between whether you have a backup system or merely a system for (potentially delayed) redundancy.)
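As a toy illustration of that dividing line, here is a minimal Python sketch. The `Mirror` and `BackupHistory` classes are made up for this example and do not correspond to any real RAID or backup tool; the only assumption is that a 'mistake' is modeled as an ordinary delete that the mirror faithfully applies to both copies.

```python
import copy


class Mirror:
    """Redundancy: two copies kept identical; every change hits both."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value):
        self.primary[key] = value
        self.replica[key] = value

    def delete(self, key):
        # A mistaken delete is replicated just as faithfully as a good write.
        self.primary.pop(key, None)
        self.replica.pop(key, None)

    def fail_primary(self):
        # Simulated hardware failure: the surviving replica takes over.
        self.primary = dict(self.replica)


class BackupHistory:
    """Backups: independent point-in-time copies, i.e. history."""

    def __init__(self):
        self.snapshots = []  # list of (label, data) pairs, oldest first

    def take(self, label, data):
        self.snapshots.append((label, copy.deepcopy(data)))

    def restore(self, label):
        # Return the most recent snapshot taken under this label.
        for name, data in reversed(self.snapshots):
            if name == label:
                return copy.deepcopy(data)
        raise KeyError(label)


store = Mirror()
history = BackupHistory()

store.write("users", ["alice", "bob"])
history.take("monday", store.primary)

store.fail_primary()                     # hardware failure: redundancy covers it
survived_failure = "users" in store.primary

store.delete("users")                    # mistake: both copies lose the data
survived_mistake = "users" in store.replica

recovered = history.restore("monday")    # only history can reach back in time
```

In this sketch the data survives the simulated hardware failure but not the mistaken delete; only the snapshot taken before the mistake still has it.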

So, do you actually need history? That's a serious question. There are a lot of systems where the answer is no for various reasons; you might, for instance, already have the history in some other form. Consider a web server where everything is deployed from a version control system; backing up the web server instead of just giving it however much redundancy it needs is unnecessary overhead.

(The tricky issue to worry about here is that corruption is a form of 'mistake'. But you may already be taking backups of the version control system itself.)

Comments on this page:

From at 2009-05-26 02:30:23:


But there are still plenty of people who do not know this. You can read almost daily about incidents that have happened even in relatively large data centers because of the false premises involving RAID and backups.

[irony] Sometimes I think the world would be a better place without the whole invention of Redundant Array of Inexpensive Disks (Patterson et al. 1988). [/irony]

- j.

From at 2009-05-26 11:03:53:

I had a client once tell me he didn't need backups for his mysql database because he had set up replication. He was so proud of its elaborateness, claimed that the data sync was always within a few seconds even though the replicated db was hundreds of miles from the master.

"A few seconds?", I say. "Absolutely." "How long do you think it would take a 'drop table users' to replicate?" -silence-

He has since implemented a sensible backup solution.

Last modified: Mon May 25 22:58:15 2009