How ZFS helps out with the big RAID-5 problem

July 13, 2008

It's time for me to say something nice about ZFS for a change, because ZFS can make the big RAID-5 problem significantly less of a problem for many people. ZFS offers two significant advantages:

  • Because it knows which parts of the array actually contain live data, it doesn't need to read all of every disk when it reconstructs the array. Less data read means less chance of an unrecoverable read error.

    (How much of an improvement this is depends on how full your pools are; if you routinely run with very full pools, you are reading most of your disks anyway.)

  • ZFS has mechanisms for identifying and tracking damaged files, so even if you hit an unrecoverable read error (UER) you will not lose the entire pool, just the affected file(s). Since ZFS defaults to making multiple copies of filesystem metadata (even in raidz pools), you may not even lose anything if you are lucky enough to have the UER hit a directory or the like, instead of an actual file.
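
To put rough numbers on the first point, here is a minimal Python sketch of how the chance of hitting at least one unrecoverable read error shrinks as the amount of data read goes down. The numbers are assumptions of mine, not anything ZFS reports: I use the commonly quoted spec-sheet rate of one bad bit per 10^14 bits read, treat bit errors as independent, and assume the amount read during reconstruction scales directly with how full the pool is. The function names are made up for illustration.

```python
import math

TB = 10**12  # decimal terabyte, as disk vendors count it


def p_read_error(bytes_read, bit_error_rate=1e-14):
    """Chance of at least one unrecoverable read error in bytes_read bytes.

    Assumes independent bit errors at a fixed rate (an illustrative
    spec-sheet style figure, not a measured one).
    """
    bits = bytes_read * 8
    # 1 - (1 - rate)**bits, computed via log1p/expm1 to avoid rounding loss
    return -math.expm1(bits * math.log1p(-bit_error_rate))


def rebuild_read_bytes(surviving_disks, disk_bytes, fill_fraction=1.0):
    """Bytes that must be read from the surviving disks to rebuild one disk.

    fill_fraction=1.0 models a traditional RAID-5 rebuild that reads
    everything; smaller values model reading only the live data.
    """
    return surviving_disks * disk_bytes * fill_fraction


# Hypothetical array: six 1 TB disks, one failed, five left to read.
for fill in (1.0, 0.9, 0.5, 0.25):
    to_read = rebuild_read_bytes(5, 1 * TB, fill)
    print(f"{fill:4.0%} of disks read: {p_read_error(to_read):.1%} chance of a UER")
```

Under these assumptions, reading all of the five surviving disks gives roughly a one in three chance of a UER, while a half-full pool cuts that to a bit under one in five. Real drives and real pools will not match these figures exactly; the point is only the shape of the curve.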

(One reason that many RAID-5 implementations give up and declare the entire array dead if they hit a UER during array reconstruction is that they have no mechanisms for recording that part of the array is damaged; either they pretend that the array is entirely healthy or they kill it entirely, and they opt for the latter for 'safety'. As the chance for a UER during reconstruction rises, this may change.)

I think that the ZFS people would still strongly suggest that you limit your raidz pool sizes, use raidz2, or both, but at least ZFS gives you better odds if you have to run with raidz instead of raidz2.

(As an aside, it is worth noting that this is one place where RAID-6 is clearly better than RAID-5 plus a hot spare for the same number of disks, as covered in the last entry.)


Comments on this page:

From 131.251.5.84 at 2008-07-16 05:22:48:

One other big win is when resilvering your RAIDZ, ZFS only needs to recopy your data, not the entire partition. Saves ages if you have big disks.

 - rasputnik : http://number9.hellooperator.net/

Last modified: Sun Jul 13 23:29:44 2008