Some thoughts on us overlooking Illumos's syseventadm
In a comment on my praise of ZFS on Linux's ZFS event daemon, Joshua M. Clulow
noted that Illumos (and thus OmniOS) has an equivalent in
syseventadm, which dates back to
Solaris. I hadn't previously known about it, despite
having run Solaris fileservers and OmniOS
fileservers for the better part of a decade,
and that gives me some tangled feelings.
I definitely wish I'd known about syseventadm while we were still
using OmniOS (and even Solaris), because it would probably have
simplified our lives. Specifically, it probably would have simplified
the life of our spares handling system. At the least,
running it immediately when some sort of pool state change happened
would have sped up its reaction to devices failing (instead, it ran
every fifteen minutes or so from cron, creating a bit of time lag).
(On the whole it was probably good to be forced to make our spares system state based instead of event based. State based systems are easier to make robust in the face of various sorts of issues, like dropped events.)
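As a minimal sketch of what "state based" means here (this is not our actual spares system; the device names, the set of bad states, and the function name are all assumptions made up for illustration), the decision is computed purely from the current pool state. It doesn't matter whether the run was triggered by cron or by an event, and a dropped event is simply picked up on the next run:

```python
from typing import Dict, List, Tuple

# Illustrative assumption: which device states count as needing a spare.
BAD_STATES = {"FAULTED", "UNAVAIL", "REMOVED"}


def plan_replacements(
    pool_state: Dict[str, str], free_spares: List[str]
) -> List[Tuple[str, str]]:
    """Pair each currently failed device with a free spare.

    The result depends only on the state passed in, never on which
    event (if any) caused this run, so repeated or missed triggers
    cannot make it do the wrong thing.
    """
    failed = sorted(dev for dev, state in pool_state.items() if state in BAD_STATES)
    # zip() stops when we run out of spares; leftover failures wait for a later run.
    return list(zip(failed, free_spares))
```

An event handler and a cron job could then both call the same function; the event only makes the reaction faster, it isn't needed for correctness.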
At the same time, that we didn't realize
syseventadm existed is,
in my mind, a sign of problems in how Illumos is organized and
documented (which is something it largely inherited from Solaris).
syseventadm is not cross referenced in any of the
Fault Manager related manpages (fmdump, fmadm, and so on). The fault management
system is the obvious entry point for a sysadmin exploring this
area on Illumos (partly because it dumps out messages on you), so some sort of cross reference would
have led me to
syseventadm. Nor does it come up much in discussions
on the Internet, although if I'd asked specifically back in the
day I might have had someone mention it to me.
(It got mentioned in a Server Fault question, for example.)
A related issue is that in order to understand what you can do with
syseventadm, you have to read Illumos header files. This isn't even mentioned in
the syseventadm manpage, and the examples in the manpage are
all for custom events generated by things from a hypothetical third-party
vendor, MYCO, instead of actual system events. Without a lot
of context, there are not many clues that ZFS events show up in
syseventadm in the first place for you to write a handler for
them. It also seems clear that writing handlers is going to involve
a lot of experimentation or reading the source to determine what
data you get and how it's passed to you and so on.
(In general and speaking as a sysadmin, the documentation for syseventadm doesn't present itself as something that's for end sysadmins to use. If you have to read kernel headers to understand even part of what you can do, this is aimed at system programmers.)
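If I were doing that experimentation, I'd probably start with a handler that does nothing but record what it was run with. The following is a rough sketch of such a probe, not something I've run on Illumos; the log file name is made up, and the registration command in the comment assumes that ZFS events arrive under the class string `EC_zfs` and that `$subclass` is an available macro, both of which you would want to check against the manpage and the headers.

```python
#!/usr/bin/env python3
"""Probe handler: append whatever arguments we were run with to a log file.

Hypothetical registration (assumed class name and macro, verify before use):
    syseventadm add -c EC_zfs /path/to/this/script '$subclass'
    syseventadm restart
"""
import os
import sys
import tempfile
import time

# Made-up log location for illustration; pick somewhere sensible on a real system.
LOG_PATH = os.path.join(tempfile.gettempdir(), "sysevent-probe.log")


def format_record(argv, now=None):
    """Return one log line: a local timestamp followed by each argument's repr()."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return stamp + " " + " ".join(repr(arg) for arg in argv)


def main(argv):
    # Append rather than overwrite so that bursts of events are all kept.
    with open(LOG_PATH, "a") as logfile:
        logfile.write(format_record(argv) + "\n")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Using `repr()` on each argument makes empty strings and embedded whitespace visible, which is the sort of thing you need to see when the documentation doesn't tell you what gets passed.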
On the whole I'm not terribly surprised that we and apparently other people missed the existence and usefulness of syseventadm, even if clearly there was some knowledge of it in the Illumos community. That we did miss it while ZFS on Linux's equivalent practically shoved itself in our face is an example of practical field usability (or lack thereof) in action.
At this point interested parties are probably best off writing articles about how to do things with syseventadm (especially ZFS things), and perhaps putting it in Illumos ZFS FAQs. Changing the structure of the Illumos documentation or rewriting the manpages probably has too little chance of good returns for the time invested; for the most part, the system documentation for Illumos is what it is.