When you don't want RAID-5

September 6, 2007

Here's a paradox that we only realized recently: in some situations, using RAID-5 can be less reliable overall than using no RAID at all.

This comes about because RAID-5 preserves your data through a single drive failure but loses all of it if there is ever a double drive failure.

Our specific case is a disk-based incremental backup system. Right now a day's backups take up about a third of a disk, and we have it set up so that each day's backups go to a different disk (eventually cycling around). The older a backup is, the less useful it is. If we lose the disk with yesterday's incrementals we will still have several previous days, so we are not too badly off even after the worst single disk failure (and losing older disks is less damaging). If we lose two disks we are much better off than with RAID-5, since we still have all the remaining backups and thus can (at worst) get back to three days ago.

And of course, not using RAID-5 gets us three more days of online incrementals, since no disk's worth of space goes to parity.
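
To make the comparison concrete, here is a minimal sketch that works through the failure cases for both layouts. The disk count (4) is an assumption, since the setup above does not say how many disks are involved; the three days per disk comes from a day's backups taking about a third of a disk. All function names are made up for illustration.

```python
from itertools import combinations

DISKS = 4          # assumption: the post does not say how many disks there are
DAYS_PER_DISK = 3  # from "a day's backups take up about a third of a disk"


def independent_layout(disks=DISKS, days_per_disk=DAYS_PER_DISK):
    """Map backup age in days (1 = yesterday) to the disk that holds it."""
    total_days = disks * days_per_disk
    # Each day's backups go to the next disk in turn, cycling around.
    return {age: (age - 1) % disks for age in range(1, total_days + 1)}


def surviving_independent(failed, disks=DISKS, days_per_disk=DAYS_PER_DISK):
    """Ages that survive when each day's backup lives on one plain disk."""
    layout = independent_layout(disks, days_per_disk)
    return sorted(age for age, disk in layout.items() if disk not in failed)


def surviving_raid5(failed, disks=DISKS, days_per_disk=DAYS_PER_DISK):
    """Ages that survive on one RAID-5 array built from the same disks."""
    usable_days = (disks - 1) * days_per_disk  # one disk's worth goes to parity
    return list(range(1, usable_days + 1)) if len(failed) <= 1 else []


def worst_case(survivors, failures, disks=DISKS):
    """Oldest 'newest surviving backup' over every set of failed disks.

    Returns None if some failure set leaves no backups at all.
    """
    worst = 0
    for failed in combinations(range(disks), failures):
        ages = survivors(set(failed))
        if not ages:
            return None
        worst = max(worst, min(ages))
    return worst


if __name__ == "__main__":
    for failures in (1, 2):
        print(failures, "failed:",
              "independent ->", worst_case(surviving_independent, failures),
              "| RAID-5 ->", worst_case(surviving_raid5, failures))
```

With these assumed numbers, the worst double failure on independent disks still leaves the backup from three days ago, while the same failure on RAID-5 leaves nothing. The independent layout also holds three more days than the RAID-5 one.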

(This is not our only backup system; we do less frequent backups to tape. These have the full backups that serve as the baseline for the incrementals.)

What makes this situation work is that losing some of the data is not fatal while losing all of it would be fairly alarming, combined with the fact that we can fit each 'unit' of data on a single disk.


Last modified: Thu Sep 6 23:13:14 2007