solaris/ZFSZpoolStalls written at 23:02:04
Dear ZFS: please stop having your commands stall
One of my serious irritations with ZFS is how various ZFS commands (or
at least sub-commands of
(Worse, it hung uninterruptibly; I could not stop it with ^C, use job
control to background it, make it abort with ^\, or even
One of the really unfortunate effects of this is that it hampers
my ability to do a lot of diagnostic work, because both
(It is possible that Solaris MPxIO is contributing to this, since our 'iSCSI' devices are actually the MPxIO versions, but as a sysadmin I don't care exactly why the ZFS commands stall, just that they do. The downside of Sun owning the entire stack is that they don't get to point fingers at anyone else.)
I believe that ZFS commands behave okay if the iSCSI machine is explicitly rejecting Solaris's connection attempts (or reporting that the target or the LUN doesn't exist or the like). What seems to be near-fatal is when the iSCSI target simply isn't responding. Unfortunately this is the most likely failure mode: switch failure, controller failure, controller rebooting, and so on.
(You also get the same issue if the iSCSI target is responding very, very slowly, as I found out when our theoretically jumbo frame capable gigabit switch decided to switch jumbo frames so slowly that it had a bandwidth measured in kilobytes per second.)
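The difference between a target that explicitly rejects connections and one that simply never answers can be checked from outside ZFS before running anything that might hang. What follows is only a minimal sketch in Python. The function name `probe_target` and the five-second timeout are made up for illustration, and port 3260 is the standard iSCSI port, which may not match a given setup. It tests plain TCP reachability only, does not attempt an iSCSI login, and does nothing useful for the slow-but-responding case.

```python
import socket

ISCSI_PORT = 3260  # standard iSCSI target port; an assumption about the setup


def probe_target(host, port=ISCSI_PORT, timeout=5.0):
    """Classify a target by how it handles a plain TCP connection attempt.

    Returns 'accepting' if the connection succeeds, 'rejecting' if it is
    actively refused, 'silent' if nothing answers within the timeout, and
    'unreachable' for any other network error.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "accepting"
    except ConnectionRefusedError:
        # The machine is up and explicitly saying no.
        return "rejecting"
    except socket.timeout:
        # No answer at all: the failure mode that seems to be near-fatal.
        return "silent"
    except OSError:
        # No route, name lookup failure, and similar errors.
        return "unreachable"
```

If this reports 'silent' for a backend, that is probably a sign to be careful about which ZFS commands get run by hand until the target is back.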