2014-03-31
I'm done with building tools around 'zpool status' output
Back when our fileserver environment was young, I built a number of
local tools and scripts that relied on 'zpool status' to get
information about pools, pool states, and so on. The problem with using
'zpool status' is of course that it is not an API, it's something
intended for presentation to users, and so as a result people feel free
to change its output from time to time. At the time using zpool's
output seemed like the best option despite this, or more exactly the
best (or easiest) of a bad lot of options.
Well, I'm done with that.
We're in the process of migrating to OmniOS. As I've had to touch
scripts and programs to update them for OmniOS's changes in the output
of 'zpool status', I've instead been migrating them away from using
zpool at all in favour of having them rely on a local ZFS status
reporting tool. This migration isn't complete (some tools haven't
needed changes yet and I'm letting them be), but it's already
simplified my life in various ways.
One of those ways is that now we control the tools. We can guarantee
stable output and we can make them output exactly what we want. We
can even make them output the same thing on both our current Solaris
machines and our new OmniOS machines so that higher level tooling is
insulated from what OS version it's running on. This is very handy and
not something that would be easy to do with 'zpool status'.
The other, more subtle way that this makes my life better is that I now
have much more confidence that things are not going to subtly break on
me. One problem with using zpool's output is that all sorts of things
can change about it and things that use it may not notice, especially
if the output starts omitting things to, for example, 'simplify' the
default output. Since our tools are abusing private APIs they may well
break (and may well break more than zpool's output), but when they
break we can make sure that it's a loud break. The result is much more
binary; if our tools work at all they're almost certainly accurate. A
script's interpretation of zpool's output is not necessarily so.
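To illustrate what a 'loud break' can look like on the consuming side, here is a minimal Python sketch. It assumes a hypothetical local status tool that prints one '<pool> <state>' pair per line; that format, the function name, and the exception class are all invented for this example and are not what our actual tool does. The point is only that anything unexpected raises an error instead of being silently skipped.

```python
class StatusFormatError(Exception):
    """Raised when the status output does not match the expected format."""


# Pool health states as reported by ZFS. The "<pool> <state>" line format
# itself is a hypothetical one, assumed purely for this illustration.
KNOWN_STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def parse_pool_states(text):
    """Parse '<pool> <state>' lines into a dict, failing loudly on surprises."""
    pools = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if len(fields) != 2:
            raise StatusFormatError(
                f"line {lineno}: expected 2 fields, got {len(fields)}: {line!r}"
            )
        name, state = fields
        if state not in KNOWN_STATES:
            raise StatusFormatError(f"line {lineno}: unknown state {state!r}")
        if name in pools:
            raise StatusFormatError(f"line {lineno}: duplicate pool {name!r}")
        pools[name] = state
    if not pools:
        # An empty report is treated as an error rather than as 'no pools'.
        raise StatusFormatError("no pools reported")
    return pools
```

With this approach a changed or truncated report stops the script with an exception rather than producing a plausible but wrong answer.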
(Omitting things by default is not theoretical. In between S10U8 and
OmniOS, 'zfs list' went from including snapshots by default to
excluding them by default. This broke some of our code that was parsing
'zfs list' output to identify snapshots, and in a subtle way; the
code just thought there weren't any when there were. This is of course
a completely fair change, since 'zfs list' is not an API and this
probably makes things better for ordinary users.)
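For code that must keep parsing 'zfs list', one way to be less exposed to this sort of default change is to ask explicitly for what you want. The following Python sketch is not the code that broke and not how our tools work now; the function names are made up for illustration. It uses the '-H' (no headers, scripting mode), '-t snapshot' and '-o name' options of 'zfs list', and it assumes these behave the same way on every OS version you care about, which you would want to verify.

```python
import subprocess


def snapshot_list_command():
    """Build a 'zfs list' command that explicitly requests snapshots only."""
    # -H: no header line; -t snapshot: snapshots only; -o name: names only.
    return ["zfs", "list", "-H", "-t", "snapshot", "-o", "name"]


def parse_snapshot_names(text):
    """Return snapshot names, failing loudly on anything that isn't one."""
    names = []
    for line in text.splitlines():
        # Snapshot names always have the form <dataset>@<snapshot>.
        if "@" not in line:
            raise ValueError(f"unexpected non-snapshot line: {line!r}")
        names.append(line)
    return names


def list_snapshots():
    """Run 'zfs list' and return snapshot names; raises if the command fails."""
    result = subprocess.run(
        snapshot_list_command(), check=True, capture_output=True, text=True
    )
    return parse_snapshot_names(result.stdout)
```

Because snapshots are requested explicitly, an empty result here should mean there really are no snapshots, not that the default listing quietly stopped showing them.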
I accept that rolling our own tools has some additional costs and has
some risks. But I'd rather own those costs and those risks explicitly
than have similar ones arise implicitly because I'm relying on a
necessarily imperfect understanding of zpool's output.
Actually, writing this entry has made me realize that it's only half of the story. The other half is going to take another entry.
Why I sometimes reject patches for my own software
I recently read Drew Crawford's 'Conduct unbecoming of a hacker' (via), which argues that you should basically always accept other people's patches for your software unless they are clearly broken. Lest we bristle at this, he gives the example of Firefox and illustrates how many patches it accepts. On the whole I sympathize with this view, and I've even had some pragmatic experience with it; a patch to mxiostat that I wasn't very enthusiastic about initially has actually become something I use routinely. But despite this there are certain sorts of patches I will reject basically out of hand. Put simply, they're patches that I think will make the program worse for me, no matter how much they might help the author of the patch (or other people).
This is selfish behavior on my part, but so far all of my public software is things that I'm ultimately developing for myself first. It's nice if other people use my programs too but I don't expect any of them to get popular enough that other people's usage is going to be my major motivation for maintaining and developing them. So my priorities come first and the furthest I'm willing to go is that I'll accept patches that don't get in the way of my usage.
(Drew Crawford's article has sort of convinced me that I should be more liberal about accepting patches in general; he makes a convincing case for 'accept now, bikeshed later'. So far this is mostly a theoretical issue for my stuff.)
By the way, this would obviously be different if I was developing things with the explicit goal of having them used by other people. In that case I should (and hopefully would) suck it up and put the patch in unless I had strong indications that it would make the program worse for a bunch of people instead of just me. Maybe someday I'll write something like that, but so far it's not the case.