What I know about how ZFS actually handles spare disks
Like many other RAID-oid systems, ZFS has a notion of spare disks;
you can add one or more spare disks to a pool, and ZFS will use them
as necessary in order to maintain pool redundancy in the face of disk
problems. For details, you can see the zpool manpage.
Well, sort of. Actually, how ZFS handles spare disks is significantly
different from how normal RAID systems handle them, and the pleasantly
bland and normal description of spares in the
zpool manpage elides a
significant number of important things. The following is what I have
been able to gather about the situation from various sources (since Sun
doesn't seem to actually document it).
In a traditional RAID system with spares, spare handling is part of
the main RAID code in the kernel, with spares activated automatically
when needed. In Solaris this is not the case; the only thing that the
kernel ZFS code does is keep track of the list of spares and some state
information about them. Activating a spare is handled by user-level
code, which issues the equivalent of 'zpool replace <pool> <old-dev> <spare-dev>'
through a library call. Specifically, activating ZFS
spares is the job of the zfs-retire agent of fmd, the Solaris fault
manager daemon.
(Once zfs-retire activates the spare, the ZFS kernel code handles the
rest of the process, including marking the spare in use and setting up
the special 'this device is replaced with a spare' vdev. This means
that you can duplicate a spare activation by doing a 'zpool replace' by
hand if you ever want to.)
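To make the manual route concrete, here is a minimal Python sketch that only builds the argument list for the 'zpool replace <pool> <old-dev> <spare-dev>' command form mentioned above. The function name and the example pool and device names are hypothetical, made up for illustration; this is not how zfs-retire itself does it, since it goes through a library call rather than running the command.

```python
def spare_replace_command(pool, old_dev, spare_dev):
    """Build the argv for 'zpool replace <pool> <old-dev> <spare-dev>'.

    This only constructs the command; it does not run anything.
    """
    fields = (("pool", pool), ("old_dev", old_dev), ("spare_dev", spare_dev))
    for name, value in fields:
        # Refuse empty or whitespace-only names rather than build a bad command.
        if not value or not value.strip():
            raise ValueError(f"{name} must be a non-empty string")
    return ["zpool", "replace", pool, old_dev, spare_dev]


# Hypothetical pool and device names, for illustration only.
argv = spare_replace_command("tank", "c1t2d0", "c1t9d0")
print(" ".join(argv))
```

On a machine that actually has such a pool, the resulting list could be handed to something like subprocess.run(argv, check=True); the sketch stops short of that on purpose so that it is safe to run anywhere.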
In theory, using fmd for this is equivalent to doing it all in
the kernel. In practice, your ZFS spare handling is at the mercy of
everything working right and it doesn't always do so. For one prominent
example, it is up to the zfs-retire module to decide what should cause
it to activate a spare, and it has not always done so for everything
that degrades a ZFS vdev.
My primary sources for all of this are this Eric Schrock entry and the archives of the zfs-discuss mailing list. Examination of the OpenSolaris codebase has also been useful (although if you are tempted to do this, beware; it does not necessarily correspond with Solaris 10).
Sidebar: what is required for spare activation
In order for a spare to be activated, a great many moving parts of your system have to all be working right. I feel like writing them down (at least the ones that I can think of):
fmd has to be running
fmd has to be getting (and generating) relevant events, which may require various fmd modules to be working correctly
the zfs-retire agent has to be working, and to have subscribed to those events
zfs-retire has to decide that the event is one that should cause it to activate a spare.
zfs-retire has to be able to query the kernel (I think) to get the problem pool's configuration in order to find out what spares are available. (This can fail.)
zfs-retire has to be able to issue the necessary 'replace disk' system call.
A further side note on events: in an ideal world, there would be a 'ZFS vdev <X> has been degraded because of device <Y>' event that zfs-retire would listen for. If you think that Solaris lives in this world, I have bad news for you.
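One way to read the sidebar's list is as a chain where every link has to hold before a spare gets activated. The following Python sketch models just that idea, under the assumption that the steps can be treated as simple yes/no checks; the step names paraphrase the list and are labels invented for this sketch, not identifiers used by fmd or zfs-retire.

```python
# Step names paraphrase the sidebar's list; they are labels for this sketch,
# not identifiers used by fmd or zfs-retire.
ACTIVATION_STEPS = [
    "fmd is running",
    "fmd is getting and generating relevant events",
    "zfs-retire is working and subscribed to those events",
    "zfs-retire decides the event should activate a spare",
    "zfs-retire can get the pool configuration to find spares",
    "zfs-retire can issue the replace request",
]


def first_failed_step(status):
    """Return the first step that is not satisfied, or None if all are.

    `status` maps step names to booleans; a missing step counts as failed.
    """
    for step in ACTIVATION_STEPS:
        if not status.get(step, False):
            return step
    return None


def spare_activates(status):
    """A spare is activated only if every step in the chain succeeds."""
    return first_failed_step(status) is None


# Example: everything works except the decision to activate a spare.
status = {step: True for step in ACTIVATION_STEPS}
status["zfs-retire decides the event should activate a spare"] = False
print(spare_activates(status))
print(first_failed_step(status))
```

The point of the model is only that a single failed step anywhere in the chain means no spare is activated, which is why the user-level approach is more fragile in practice than it sounds in theory.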
Written on 10 September 2009.