ZFS's problem with boot time magic

March 5, 2014

One of the problems with ZFS (on Solaris et al) is that in practice it involves quite a bit of magic. This magic is great when it works but is terrible when something goes wrong, because it leaves you with very little to work with to diagnose and fix your problems. Most of this magic revolves around the most problematic times in the life of ZFS: system shutdown and startup.

I've written before about boot time ZFS pool activation, so let's talk about how it would work in a non-magical environment. There are essentially two boot time jobs: activating pools and then possibly mounting filesystems from the pools. Clearly these should be driven by distinct commands, one command to activate all non-active pools listed in /etc/zfs/zpool.cache (if possible) and then maybe one command to mount all unmounted ZFS filesystems. You don't really need the second command if pool activation also mounts filesystems the same way 'zpool import' does, but maybe you don't want all of that happening during (early) boot and would rather defer both mounting and sharing until later.
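(To make the split concrete, here is a toy Python model of the two boot time jobs as separate, explicit steps. It is only a sketch of the idea and does not talk to ZFS at all; the Pool class, the activate_pools() and mount_filesystems() functions, and the pool names are all made up for illustration and correspond to no real ZFS command or API.)

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Hypothetical stand-in for one pool listed in the pool cache file."""
    name: str
    active: bool = False
    filesystems: list = field(default_factory=list)
    mounted: set = field(default_factory=set)


def activate_pools(cached_pools):
    """Step 1: activate every cached pool that is not already active.

    Deliberately mounts nothing; returns the names of the pools it activated.
    """
    activated = []
    for pool in cached_pools:
        if not pool.active:
            pool.active = True
            activated.append(pool.name)
    return activated


def mount_filesystems(pools):
    """Step 2: mount every unmounted filesystem of every active pool.

    Inactive pools are skipped; returns the filesystems it mounted.
    """
    newly_mounted = []
    for pool in pools:
        if not pool.active:
            continue
        for fs in pool.filesystems:
            if fs not in pool.mounted:
                pool.mounted.add(fs)
                newly_mounted.append(fs)
    return newly_mounted


if __name__ == "__main__":
    # Made-up pools: a root pool that is already active and one data pool.
    pools = [
        Pool("rpool", active=True, filesystems=["rpool/export"]),
        Pool("tank", filesystems=["tank/home", "tank/data"]),
    ]
    print("activated:", activate_pools(pools))
    # Anything could happen here (or nothing) before mounting is requested.
    print("mounted:", mount_filesystems(pools))
```

(The point of the model is that the second step only happens when something explicitly asks for it, so a boot script could run the first step early and postpone the second, and both steps report what they did.)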

ZFS on Solaris doesn't work this way. There is no pool activation command; pools just magically activate. And as I've found out, pools also apparently magically mount all of their filesystems during activation. While there is a 'zfs mount -a' command that is run during early boot (via /lib/svc/method/fs-local), it doesn't actually do what most people innocently think it does.

(What it seems to do in practice is mount additional ZFS filesystems from the root pool, if there is a root pool. Possibly it also mounts other ZFS filesystems that depend on additional root pool ZFS filesystems.)

I don't know where the magic for all of this lives. Perhaps it lives in the kernel. Perhaps it lives in some user level component that's run asynchronously on boot (much like how Linux's udev handles devices appearing). What I do know is that there is magic and this magic is currently causing me a major amount of heartburn.

Magic is a bad idea. Magic makes systems less manageable (and kernel magic is especially bad because it's completely inaccessible). Unix systems have historically got a significant amount of their power by more or less eschewing magic in favour of things like exposing the mechanics of the boot process. I find it sad to have ZFS be a regression on all of this.

(There are also regressions in the user level commands. For example, as far as I can see there is no good way to import a pool without also mounting and sharing its filesystems. These are actually three separate operations at the system level, but the code for 'zpool import' bundles them all together and provides no options to control this.)

Last modified: Wed Mar 5 00:22:15 2014