Wandering Thoughts archives

2010-03-24

One reason why 'zpool status' can hang

The 'zpool status' command is infamous for stalling and hanging exactly when you need it the most, namely when something is going wrong with your system. I've recently run down one reason why it does this.

The culprit is my old friend, ZFS GUIDs. Information about disks involved in ZFS pools includes both their GUIDs and their theoretical device paths. When 'zpool status' prints out the user-friendly shortened device names, it doesn't just take the theoretical path and trim most things off to get the device name; as I've alluded to before, it decides to be hyper-correct and check that the device named in the configuration really is the right device.

In theory this is simple, as Solaris has some system calls for doing pretty much all of the work. In practice these system calls require you to open the disk device that you want to check, and under some circumstances this open() will stall for significant amounts of time (several minutes, for example). An iSCSI target that isn't responding is one such circumstance.
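If you want to find out whether a particular device is the one with the slow open() without hanging your shell, one option is to do the open() in a throwaway thread and give up waiting after a timeout. This is only a minimal sketch: probe_open() and its opener parameter are names made up for illustration, and the stuck thread is abandoned rather than cancelled, so the underlying open() may still be sitting in the kernel afterwards.

```python
import os
import threading


def probe_open(path, timeout=5.0, opener=os.open):
    """Report whether opening `path` finishes within `timeout` seconds.

    Returns "ok", "error: ...", or "stalled". `opener` is a hypothetical
    hook so the open call can be swapped out when experimenting.
    """
    result = {}

    def worker():
        try:
            fd = opener(path, os.O_RDONLY)
        except OSError as exc:
            result["status"] = "error: " + str(exc)
            return
        os.close(fd)
        result["status"] = "ok"

    # A daemon thread will not keep the interpreter alive if it never returns.
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    if t.is_alive():
        # The open() has not returned yet; we stop waiting but cannot cancel it.
        return "stalled"
    return result.get("status", "error: unknown")
```

You would call this with the device path from the pool configuration, and you need to be able to open the device in the first place for the result to mean anything.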

If you've ever seen this happen, you might wonder why 'zpool status' hangs completely, before printing any pool configuration information at all, instead of getting to the point where it starts to print device names for affected devices. The answer is that 'zpool status' is helpfully extra-clever; before it prints out any of the pool configuration, it pauses to work out how wide it has to make the device name column so that everything will line up nicely. This requires working out the friendly name of every device, which requires that hyper-correct checking of the configuration, which stalls the entire process if any disk is very slow to open().

(For extra fun, this 'calculate the needed width' step also looks at the spare disks (if any), so a single bad spare disk, one that's not even in use, can cause 'zpool status' to stall on you.)
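The ordering problem is easy to see in a toy model. The sketch below is not the zpool code. show_config(), resolve() and the device paths are all invented for illustration. It just shows that with a 'measure everything, then print' layout, every name (spares included) gets resolved before the first line of output appears, so one slow resolve delays everything.

```python
def show_config(devices, spares, resolve_name, emit):
    """Two-pass layout: resolve every name first, then print aligned rows."""
    everything = list(devices) + list(spares)
    # Pass 1: work out all friendly names (in the real tool, this is where
    # the device gets opened) so the column width is known up front.
    names = [resolve_name(path) for path in everything]
    width = max([len("NAME")] + [len(n) for n in names])
    # Pass 2: only now does anything get printed.
    emit(f"{'NAME':<{width}}  PATH")
    for name, path in zip(names, everything):
        emit(f"{name:<{width}}  {path}")


events = []


def resolve(path):
    # Stand-in for the real name lookup; records when it was called.
    events.append(("resolve", path))
    return path.rsplit("/", 1)[-1]


def emit(line):
    events.append(("print", line))


# Hypothetical device paths: one pool disk and one unused spare.
show_config(["/dev/dsk/c1t0d0s0"], ["/dev/dsk/c2t1d0s0"], resolve, emit)
```

Looking at events afterwards shows both "resolve" entries ahead of any "print" entry, which is the whole problem in miniature.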

The 'zpool iostat' command does the same extra-clever step of working out the maximum width of the device name column, so it will stall for the same reason. For bonus points, 'zpool iostat' does this every time it prints out a round of statistics. Yes, really. No wonder plain iostat is acres better if anything bad is going on.

By the way, this particular stall only happens if you have permission to open the device in the first place, i.e. it only happens if you are root. So if you suspect ZFS problems, especially if you want 'zpool iostat' results, run the commands as a non-root user.

(This is not the only way that 'zpool status' can stall; I've seen it stutter when it was trying to get the ZFS pool configuration from the kernel.)

Sidebar: where this is in the code

The zpool source code is in usr/src/cmd/zpool, and the whole width calculation is in zpool_main.c:max_width(). This calls zpool_vdev_name(), in lib/libzfs/common/libzfs_pool.c, which calls path_to_devid(), which is what actually open()s the device. This check is thoughtfully guarded to make sure that it doesn't open devices that ZFS has gotten around to declaring are actually bad; sadly, ZFS makes such declarations long after open()s of iSCSI target disks have started stalling for minutes at a time.
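As a rough model of why that guard doesn't help in practice, here is a sketch of the decision it makes. This is a simplification, not a translation of the C code: friendly_name(), the 'state' key and its values, and the verify_path callback are all hypothetical stand-ins.

```python
def friendly_name(vdev, verify_path):
    """Return the short device name for a vdev.

    `vdev` is a dict with hypothetical keys 'path' and 'state'.
    `verify_path` stands in for the check that opens the device; it takes
    the vdev and returns the path to use, and it may block for a long time.
    """
    path = vdev["path"]
    if vdev["state"] != "bad":
        # Not declared bad (yet), so the device gets opened and checked.
        # A disk that has only just started stalling still lands here.
        path = verify_path(vdev)
    # Trim the directory part to get the friendly name.
    return path.rsplit("/", 1)[-1]
```

The guard only skips devices that have already been marked bad, so a device that is merely slow right now goes down the blocking branch.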

solaris/WhyZpoolStatusHangs written at 01:51:58

