The challenges of shared spares in RAID arrays

July 14, 2010

It's getting popular these days for RAID implementations to support what I've heard called 'shared spares': spare disks that are shared between multiple RAID arrays, so that they can be used by any array that happens to need them. This is an attractive idea because it gives you better protection against moderate problems than you could get with dedicated spares. (If you have large problems you run out of spares, of course.)

The problem with shared spares is that they are pretty much intrinsically hard to do well in the general case, once you get beyond simple configurations and start working at larger scales. I'll use our fileservers as an example.

Our fileservers have 'RAID arrays' (ZFS pools) of varying sizes that are made up of some number of mirror pairs from two different iSCSI backends per fileserver. Suppose a disk fails in some pool; clearly, if possible we want to replace that disk with another disk from the same iSCSI backend so that we maintain cross-backend redundancy.
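As a rough illustration of the topology constraint, here is a minimal Python sketch of choosing a spare for a failed disk. The `Disk` class, the `backend` field, and the `pick_spare` function are all hypothetical names invented for this example; they are not part of ZFS or of any real RAID implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Disk:
    """Hypothetical description of a disk: its name and its iSCSI backend."""
    name: str
    backend: str


def pick_spare(failed: Disk, spares: list[Disk]) -> Optional[Disk]:
    """Return the first spare on the same backend as the failed disk.

    Returns None if no such spare exists, because using a spare from the
    other backend would give up cross-backend redundancy.
    """
    for spare in spares:
        if spare.backend == failed.backend:
            return spare
    return None


failed = Disk("disk7", backend="backend-a")
spares = [Disk("spare1", backend="backend-b"), Disk("spare2", backend="backend-a")]
chosen = pick_spare(failed, spares)
```

Whether to fall back to a spare on the other backend when `pick_spare` returns None is itself a site policy decision, which this sketch deliberately leaves out.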

Suppose that several disks fail at once, in a situation where we have too few suitable spares to restore all affected pools to full redundancy. In this situation we want as many pools as possible restored to full redundancy, as fast as possible; we'd rather have two smaller pools be fully redundant than one much larger pool be two-thirds redundant (two out of three mirrors restored to full operation).
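One way to express this particular preference is to heal the pools that need the fewest spares first. The sketch below assumes, purely for simplicity, that all spares are interchangeable (it ignores the backend constraint); under that assumption, smallest-first restores the largest number of pools to full redundancy. The function name and the pool names are made up for the example.

```python
def pools_to_heal(needed: dict[str, int], spares_available: int) -> list[str]:
    """Pick the pools to restore fully, given a limited number of spares.

    `needed` maps a pool name to how many spares it needs to become fully
    redundant again. Pools are considered from the smallest need upward
    (ties broken by name), and a pool is only chosen if it can be healed
    completely with the spares that remain.
    """
    healed = []
    for pool, count in sorted(needed.items(), key=lambda kv: (kv[1], kv[0])):
        if count <= spares_available:
            healed.append(pool)
            spares_available -= count
    return healed


# Three failed mirrors in one large pool, one each in two small pools,
# and only two spares to go around.
needed = {"big": 3, "small1": 1, "small2": 1}
choice = pools_to_heal(needed, spares_available=2)
```

With these inputs, both small pools are chosen and the large pool is left waiting. A different site might reasonably want the opposite, which is exactly the kind of policy question that a fixed algorithm cannot settle.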

Large setups are like this: their disks don't have a flat topology, and they have policy issues surrounding what should be done in situations with limited resources or what should be prioritized first. I'm sure that you can support all of this in a general RAID shared spares system if you try hard enough, but you're going to have a very complex configuration system; it'll practically be a programming language.

(In theory, selecting the right spare disk just needs a sufficiently smart general algorithm that knows enough, or is told enough, about the real disk topology. But policy issues of what gets priority can't be sorted out that way.)

Sadly, large systems with lots of RAID arrays are also exactly the situation where you want shared spares. From this I conclude that your shared spares system should be modular, so that sites have a place to plug in different and more sophisticated methods of selecting what disk to use and what RAID arrays to heal first (or at all).
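To make 'modular' a bit more concrete, here is a minimal Python sketch of what such plug-in points could look like. Everything in it is hypothetical: `assign_spares`, the two callback types, and the example policies are assumptions made for illustration, not the interface of any existing shared spares implementation.

```python
from typing import Callable, Optional, Sequence

# In this sketch a disk is just a (name, backend) tuple.
Disk = tuple[str, str]
# Site-supplied policy: which spare (if any) should replace this failed disk?
SpareSelector = Callable[[Disk, Sequence[Disk]], Optional[Disk]]
# Site-supplied policy: which pools to heal, and in what order?
HealOrder = Callable[[dict[str, list[Disk]]], list[str]]


def assign_spares(
    failures: dict[str, list[Disk]],
    spares: Sequence[Disk],
    select_spare: SpareSelector,
    heal_order: HealOrder,
) -> dict[Disk, Disk]:
    """Map failed disks to spares, deferring every decision to the policies."""
    remaining = list(spares)
    plan: dict[Disk, Disk] = {}
    for pool in heal_order(failures):  # pools the policy omits are not healed
        for failed in failures[pool]:
            spare = select_spare(failed, remaining)
            if spare is not None:
                remaining.remove(spare)
                plan[failed] = spare
    return plan


# Example site policies: same-backend spares only, fewest failures first.
def same_backend(failed: Disk, spares: Sequence[Disk]) -> Optional[Disk]:
    return next((s for s in spares if s[1] == failed[1]), None)


def fewest_failures_first(failures: dict[str, list[Disk]]) -> list[str]:
    return sorted(failures, key=lambda pool: (len(failures[pool]), pool))


failures = {"big": [("d1", "A"), ("d2", "B"), ("d3", "A")], "small": [("d4", "B")]}
plan = assign_spares(failures, [("s1", "A"), ("s2", "B")],
                     same_backend, fewest_failures_first)
```

The core loop knows nothing about backends or priorities; a site with a different topology or different priorities swaps in its own two functions rather than fighting a configuration language.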


Last modified: Wed Jul 14 00:31:32 2010