Practical RAID-1 read balancing

February 28, 2006

If you're doing performance analysis of a RAID-1 setup, one of the interesting questions is 'which drive gets read from when?'

(Measuring how balanced the current IO load is can be useful, but it doesn't tell you how a changed load will affect the balance.)

Since seeks are the expensive thing on modern drives, you want your RAID-1 system to deal in requests, not in individual blocks, and to send a run of sequential reads to the same disk. If your IO isn't strictly sequential, ideally the system keeps track of the last known head position of each disk (which is influenced by writes as well as by reads) and issues each read to the disk that scores best on some combination of closest head position and lightest load.
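To make the 'closest and least loaded' idea concrete, here is a minimal sketch in Python. It is not how any particular RAID-1 implementation works; the Mirror class, the cost function, and the two weights are all made up for illustration, and a real system would have to tune how it trades seek distance against queue depth.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    """One side of a RAID-1 mirror (hypothetical model)."""
    name: str
    head_pos: int = 0   # last known head position, in blocks
    queued: int = 0     # requests currently outstanding on this disk


def pick_disk(mirrors, start, seek_weight=1.0, load_weight=1000.0):
    """Pick the mirror with the lowest combined seek-distance and load cost.

    The weights are arbitrary illustrative values, not measured ones.
    """
    def cost(m):
        return seek_weight * abs(m.head_pos - start) + load_weight * m.queued
    return min(mirrors, key=cost)


def issue_read(mirrors, start, length):
    """Send a whole read request to one disk and update its state."""
    disk = pick_disk(mirrors, start)
    disk.queued += 1
    disk.head_pos = start + length   # the head ends up just past the request
    return disk


def complete(disk):
    """Mark one outstanding request on this disk as finished."""
    disk.queued -= 1


def note_write(mirrors, start, length):
    """A write goes to every mirror, so it moves every head."""
    for m in mirrors:
        m.head_pos = start + length
```

With this model, a sequential run of reads keeps landing on the same disk because that disk's head is already in position, while a read far away from it goes to the other disk, and a deep enough queue on one disk pushes reads to its partner.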

But all of this is theory. What you really need to do is measure, because you can never be sure just what a RAID-1 system is doing. Sometimes it can really surprise you.

For example, I once dealt with a reasonably fancy hardware RAID-10 controller that chose the disk for each read statically, not based on load or anything else. The controller divided the RAID-10 array up into slices (64K at the time); reads from odd slices went to the first disk in a mirror pair and reads from even slices went to the second. In extreme cases, half the disks in the array could be completely idle while the other half were melting down. (Presumably that didn't happen too often.)

We only found this out by running a test program against an otherwise idle test array and watching the front panel disk lights, as part of trying to sort out how the array distributed IO over the drives in general. Much to our surprise, in one test half the disks went active and the other half didn't. All I can say is thank goodness for front panel lights.
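The static odd/even rule is easy to model, and a model shows how a test program can light up only half the disks. This is a sketch under assumptions: I'm guessing that slices are numbered from zero by byte offset into the array, and the function names are mine, not anything from the controller.

```python
SLICE_SIZE = 64 * 1024  # 64K slices, as on the controller described above


def static_mirror_side(offset, slice_size=SLICE_SIZE):
    """Return 0 for the first disk of a mirror pair, 1 for the second.

    Assumes zero-based slice numbering by byte offset (a guess):
    odd slices go to the first disk, even slices to the second.
    """
    slice_no = offset // slice_size
    return 0 if slice_no % 2 == 1 else 1


def side_counts(offsets):
    """Count how many reads land on each side of the mirror."""
    counts = [0, 0]
    for off in offsets:
        counts[static_mirror_side(off)] += 1
    return counts


# Sequential reads in 64K steps alternate between the two sides.
sequential = range(0, 100 * SLICE_SIZE, SLICE_SIZE)
# Reads at a 128K stride only ever touch even slices.
strided = range(0, 100 * 2 * SLICE_SIZE, 2 * SLICE_SIZE)

print(side_counts(sequential))  # [50, 50]
print(side_counts(strided))     # [0, 100]
```

Any access pattern whose stride is a multiple of twice the slice size behaves like the second case: every read goes to the same side of each mirror and the other side sits idle.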

Sidebar: RAID-10 versus RAID-01

Because I always get them confused: RAID-10 is striped mirrors, RAID-0 on top of RAID-1. RAID-01 is the reverse, mirrored stripes (RAID-1 on top of RAID-0), and is much less resilient against failures of two or more disks: RAID-01 dies unless all failed drives are in the same stripe, whereas RAID-10 survives unless both sides of a single mirror die.
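The difference can be checked by brute force over every possible set of failed disks. This is only a sketch; the disk numbering (mirror pairs are disks 2i and 2i+1 for RAID-10, and the two stripes are the first and second half of the disks for RAID-01) is an arbitrary layout I picked for illustration.

```python
from itertools import combinations


def raid10_survives(failed, n):
    """RAID-10 with n mirror pairs: dies only if both disks of one pair fail."""
    failed = set(failed)
    return not any({2 * i, 2 * i + 1} <= failed for i in range(n))


def raid01_survives(failed, n):
    """RAID-01 with two n-disk stripes: needs one stripe with no failures,
    so all failed disks must be in the same stripe."""
    failed = set(failed)
    stripe_a = set(range(n))
    stripe_b = set(range(n, 2 * n))
    return failed <= stripe_a or failed <= stripe_b


def survival_counts(survives, n, k):
    """Return (surviving cases, total cases) for k failures among 2n disks."""
    combos = list(combinations(range(2 * n), k))
    return sum(survives(c, n) for c in combos), len(combos)


# Six disks, two simultaneous failures.
print(survival_counts(raid10_survives, 3, 2))  # (12, 15)
print(survival_counts(raid01_survives, 3, 2))  # (6, 15)
```

With six disks and two failures, RAID-10 survives 12 of the 15 possible cases and RAID-01 survives only 6.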
