What I know about how ZFS actually handles spare disks
Like many other RAID-oid systems, ZFS has a notion of spare disks;
you can add one or more spare disks to a pool, and ZFS will use them
as necessary in order to maintain pool redundancy in the face of disk
problems. For details, you can see the zpool
manpage.
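As a concrete illustration (the pool and device names here are invented), you add a spare with 'zpool add', after which it shows up in a separate 'spares' section of 'zpool status' and is in theory pulled into service automatically when a disk fails:

    # add c2t3d0 to the pool 'tank' as a hot spare
    zpool add tank spare c2t3d0
    # the spare should now be listed under a 'spares' heading
    zpool status tank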
Well, sort of. Actually, how ZFS handles spare disks is significantly
different from how normal RAID systems handle them, and the pleasantly
bland and normal description of spares in the zpool
manpage elides a
significant number of important things. The following is what I have
been able to gather about the situation from various sources (since Sun
doesn't seem to actually document it).
In a traditional RAID system with spares, spare handling is part of
the main RAID code in the kernel, with spares activated automatically
when needed. In Solaris this is not the case; the only thing that the
kernel ZFS code does is keep track of the list of spares and some state
information about them. Activating a spare is handled by user-level
code, which issues the equivalent of 'zpool replace <pool> <old-dev> <spare-dev>'
through a library call. Specifically, activating ZFS
spares is the job of the zfs-retire agent of fmd, the Solaris fault
manager daemon.
(Once zfs-retire activates the spare, the ZFS kernel code handles the
rest of the process, including marking the spare in use and setting up
the special 'this device is replaced with a spare' vdev. This means
that you can duplicate a spare activation by doing a 'zpool replace' by
hand if you ever want to.)
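For example (again with invented names), if c1t0d0 in pool 'tank' has died and c2t3d0 is one of its configured spares, the by-hand version is:

    # pull the spare in by hand; this is the same operation that
    # zfs-retire performs through its library call
    zpool replace tank c1t0d0 c2t3d0
    # later, detaching the dead disk should make the spare's replacement
    # permanent, while detaching the spare instead returns it to the
    # list of available spares
    zpool detach tank c1t0d0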
In theory, using fmd for this is equivalent to doing it all in
the kernel. In practice, your ZFS spare handling is at the mercy of
everything working right, and everything doesn't always work right. For one prominent
example, it is up to the zfs-retire module to decide what should cause
it to activate a spare, and it has not always done so for everything
that degrades a ZFS vdev.
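You can at least see whether zfs-retire is loaded into fmd and whether it is receiving any events at all; fmstat reports per-module event counts (what it cannot tell you is whether zfs-retire will decide that those events merit activating a spare):

    # per-module statistics for fmd; look for the zfs-diagnosis and
    # zfs-retire rows and whether their event counts ever move
    fmstat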
My primary sources for all of this are this Eric Schrock entry and the archives of the zfs-discuss mailing list. Examination of the OpenSolaris codebase has also been useful (although if you are tempted to do this, beware; it does not necessarily correspond with Solaris 10).
Sidebar: what is required for spare activation
In order for a spare to be activated, a great many moving parts of your system have to all be working right. I feel like writing them down (at least the ones that I can think of):
- fmd has to be running
- fmd has to be getting (and generating) relevant events, which may require various fmd modules to be working correctly
- the zfs-retire agent has to be working, and to have subscribed to those events
- zfs-retire has to decide that the event is one that should cause it to activate a spare
- zfs-retire has to be able to query the kernel (I think) to get the problem pool's configuration in order to find out what spares are available (this can fail)
- zfs-retire has to be able to issue the necessary 'replace disk' system call
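Here is a rough sketch of checking the externally visible pieces of this (the pool name is an example, and nothing here can tell you whether zfs-retire will actually act on a particular event):

    #!/bin/sh
    # is fmd itself up?
    svcs svc:/system/fmd:default
    # are the ZFS diagnosis and retire agents loaded into fmd?
    fmadm config | egrep 'zfs-(diagnosis|retire)'
    # does the pool have spares at all, and are they still available?
    zpool status tank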
A further side note on events: in an ideal world, there would be a 'ZFS vdev <X> has been degraded because of device <Y>' event that zfs-retire would listen for. If you think that Solaris lives in this world, I have bad news for you.
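What you can do is look at what error reports fmd has actually received and guess at what zfs-retire might react to; as far as I know, the ZFS-related ones have ereport.fs.zfs.* class names:

    # class names of recently received error reports
    fmdump -e | tail
    # the same, with the full details of each report
    fmdump -eV | less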