Things I do and don't know about how ZFS brings pools up during boot
If you import a ZFS pool explicitly, through 'zpool import', the
user-mode side of the process normally searches through all of the
available disks in order to find the component devices of the pool.
Because it does this explicit search, it will find pool devices
even if they've been shuffled around in a way that causes them to
be renamed, or even (I think) drastically transformed, for example
by being dd'd to a new disk. This is pretty much what you'd expect,
since ZFS can't really read what the pool thinks its configuration
is until it assembles the pool. When it imports such a pool, I
believe that ZFS rewrites the information kept about where to
find each device so that it's correct for the current
state of your system.
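
As a rough illustration of why this explicit search copes with renamed devices, here is a toy Python model (it is not ZFS code). The `Disk` class, the `scan_for_pool` function, and all device names and GUIDs are invented for the example. The only ideas taken from the text above are that every available disk is examined and that the recorded device locations are replaced with wherever each pool member was actually found.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disk:
    """A hypothetical disk as the import-time search might see it."""
    path: str                   # current device name on this system
    label_pool: Optional[str]   # pool name found in the on-disk label, if any
    label_guid: Optional[int]   # device GUID found in the on-disk label, if any


def scan_for_pool(disks, pool_name):
    """Look at every disk and return {device GUID: current path} for members of pool_name."""
    found = {}
    for disk in disks:
        if disk.label_pool == pool_name:
            found[disk.label_guid] = disk.path
    return found


# Paths the pool remembered from before the disks were shuffled (made up).
old_paths = {101: "c2t0d0", 102: "c2t1d0"}

# What is attached now: the two 'tank' disks have new names.
disks = [
    Disk("c5t0d0", "tank", 101),
    Disk("c7t3d0", "tank", 102),
    Disk("c1t0d0", "rpool", 201),
    Disk("c2t0d0", None, None),   # an unrelated disk now sits at an old path
]

# The search ignores old_paths entirely and rebuilds the mapping from the labels.
new_paths = scan_for_pool(disks, "tank")
print(new_paths)
```

The stale entries in `old_paths` play no part in the search, which is why shuffling disks around does not bother it in this model.
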
This is not what happens when the system boots. To the best of
my knowledge, for non-root pools the ZFS kernel module directly
reads /etc/zfs/zpool.cache during module initialization and converts
it into a series of in-memory pool configurations, which are all in
an unactivated state.
At some point, magic things attempt to activate some or all of these
pools, which causes the kernel to attempt to open all of the devices
listed as part of the pool configuration and verify that they are
indeed part of the pool. The process of opening devices only uses
the names and other identifying information for the devices that are
recorded in the pool configuration; however, one such identifier is
a 'devid', which
for many devices is basically the model and serial number of the
disk. So I believe that under at least some circumstances the kernel
will still be able to find disks that have been shuffled around,
because it will basically seek out that model plus serial number
wherever it's (now) connected to the system.
(See vdev_disk_open in vdev_disk.c for the gory details,
but you also need to understand Illumos devids. The various device
information available for disks in a pool can be seen with 'zdb
-C <pool>'.)
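
To make the contrast with an explicit import concrete, here is a toy Python model of opening a device using only what the cached configuration records. It is a sketch under assumptions, not a rendering of vdev_disk_open: the `CachedVdev` class, the `open_cached_vdev` function, the devid strings, and in particular the order of the steps (recorded path first, then a lookup by devid) are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedVdev:
    """What a hypothetical cached pool configuration remembers about one disk."""
    path: str    # device name recorded when the configuration was written
    devid: str   # recorded devid, roughly model plus serial number


def open_cached_vdev(vdev, attached):
    """Return the current path of the disk, or None if it cannot be found.

    'attached' maps each current device path to the devid of the disk there.
    """
    # Assumed step 1: try the recorded path, but only accept it if the disk
    # found there reports the devid we expect.
    if attached.get(vdev.path) == vdev.devid:
        return vdev.path
    # Assumed step 2: the disk may have moved, so look for its devid elsewhere.
    for path, devid in attached.items():
        if devid == vdev.devid:
            return path
    # No further searching: this model never reads labels on other disks.
    return None


attached = {
    "c2t0d0": "MODEL-X/SERIAL-A",   # still where the cache says it is
    "c9t4d0": "MODEL-X/SERIAL-B",   # moved from its recorded path
}

in_place = open_cached_vdev(CachedVdev("c2t0d0", "MODEL-X/SERIAL-A"), attached)
moved = open_cached_vdev(CachedVdev("c2t1d0", "MODEL-X/SERIAL-B"), attached)
absent = open_cached_vdev(CachedVdev("c2t2d0", "MODEL-X/SERIAL-C"), attached)
print(in_place, moved, absent)
```

In this model a moved disk is still found as long as it reports the same devid, while a disk whose devid is not visible anywhere is simply reported as missing.
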
To the best of my knowledge, this in-kernel activation makes no
attempt to hunt around on other disks to complete the pool's
configuration the way that 'zpool import' will. In theory, assuming
that finding disks by their devid works, this shouldn't matter most
(or basically all) of the time; if a disk is there at all, it
should be reporting its model and serial number and I think the
kernel will find it. But I don't know for sure. I also don't know
how the kernel acts if some disks take a while to show up, for
example iSCSI disks.
(I suspect that the kernel only makes one attempt at pool activation and doesn't retry things if more devices show up later. But this entire area is pretty opaque to me.)
These days you also have your root filesystems on a ZFS pool, the
root pool. There are definitely some special code paths that seem
to be invoked during boot for a ZFS root pool, but I don't have
enough knowledge of the Illumos boot time environment to understand
how they work and how they're different from the process of loading
and starting non-root pools. I used to hear that root pools were
more fragile if devices moved around and you might have to boot
from alternate media in order to explicitly 'zpool import' and
'zpool export' the root pool in order to reset its device names,
but that may be only folklore and superstition at this point.