Some thoughts on us overlooking Illumos's syseventadm

July 24, 2020

In a comment on my praise of ZFS on Linux's ZFS event daemon, Joshua M. Clulow noted that Illumos (and thus OmniOS) has an equivalent in syseventadm, which dates back to Solaris. I hadn't previously known about syseventadm, despite having run Solaris fileservers and OmniOS fileservers for the better part of a decade, and that gives me some tangled feelings.

I definitely wish I'd known about syseventadm while we were still using OmniOS (and even Solaris), because it would probably have simplified our life. Specifically, it probably would have simplified the life of our spares handling system (2, 3). At the least, running immediately when some sort of pool state change happened would have sped up its reaction to devices failing (instead, it ran every fifteen minutes or so from cron, creating a bit of time lag).

(On the whole it was probably good to be forced to make our spares system be state based instead of event based. State based systems are easier to make robust in the face of various sorts of issues, like dropped events.)

At the same time, that we didn't realize syseventadm existed is, in my mind, a sign of problems in how Illumos is organized and documented (which is something it largely inherited from Solaris). For instance, syseventadm is not cross referenced in any of the Fault Manager related manpages ( fmd, fmdump, _fmadm, and so on). The fault management system is the obvious entry point for a sysadmin exploring this area on Illumos (partly because it dumps out messages on you), so some sort of cross reference would have led me to syseventadm. Nor does it come up much in discussions on the Internet, although if I'd asked specifically back in the days I might have had someone mention it to me.

(It got mentioned in this Serverfault question, for example.)

A related issue is that in order to understand what you can do with syseventadm, you have to read Illumos header files (cf). This isn't even mentioned in the syseventadm manpage, and the examples in the manpage are all for custom events generated by things from a hypothetical third party vendor MYCO instead of actual system events. Without a lot of context, there are not many clues that ZFS events show up in syseventadm in the first place for you to write a handler for them. It also seems clear that writing handlers is going to involve a lot of experimentation or reading the source to determine what data you get and how it's passed to you and so on.

(In general and speaking as a sysadmin, the documentation for syseventadm doesn't present itself as something that's for end sysadmins to use. If you have to read kernel headers to understand even part of what you can do, this is aimed at system programmers.)

On the whole I'm not terribly surprised that we and apparently other people missed the existence and usefulness of syseventadm, even if clearly there was some knowledge of it in the Illumos community. That we did miss it while ZFS on Linux's equivalent practically shoved itself in our face is an example of practical field usability (or lack thereof) in action.

At this point interested parties are probably best off writing articles about how to do things with syseventadm (especially ZFS things), and perhaps putting it in Illumos ZFS FAQs. Changing the structure of the Illumos documentation or rewriting the manpages probably has too little chance of good returns for the time invested; for the most part, the system documentation for Illumos is what it is.

Written on 24 July 2020.
« C's main() is one of the places where Unix's user and kernel APIs differ
My varied types of Firefox windows »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Jul 24 00:21:02 2020
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.