I'm angry that ZFS still doesn't have an API

April 2, 2014

Yesterday I wrote a calm, rational explanation for why I'm not building tools around 'zpool status' any more and said that it ended up being only half of the story. The other half is that I am genuinely angry that ZFS still does not have any semblance of an API, so angry that I've decided to stop cooperating with ZFS's non-API and make my own.

(It's not the hot anger of swearing, it's the slow anger of a blister that keeps reminding you about its existence with every step you take.)

For at least the past six years it has been blindingly obvious that ZFS should have an API so that people could build additional tools and solutions on top of it. For all that is sane, stock ZFS doesn't even have an alerting solution for pool problems. You can't miss that unless you're blind, and say whatever you want about the ZFS developers, I'm sure that they're not blind. I am and have been completely agnostic about the exact format that this API could have taken, so long as it existed. Stable, documented, script-friendly output from ZFS tools? A documented C-level library API? XML information dumps because everyone loves XML? A web API? Whatever. I could have worked with any of them.

Instead we got nothing. We got nothing when ZFS was with Sun and despite some vague signs of care we continue to get exactly nothing now that ZFS is effectively with Illumos (and I'm pretty sure that Oracle hasn't fixed the situation either). At this point it is clear that the ZFS developers have different priorities and in an objective sense do not care about this issue.

(Regardless of what you say, what you actually care about is shown by what you work on.)

This situation has thoroughly gotten under my skin now that moving to OmniOS is rubbing my nose in it again. So now I'm through with tacitly cooperating with it by trying to wrestle and wrangle the ZFS commands to do what I want. Instead I feel like giving 'zpool status' and its friends a great big middle finger and then throwing them down a well. The only thing I want to use them for now is as a relatively authoritative source of truth if I suspect that something is wrong with what my own tools are showing me.

(I call 'zpool status' et al. 'relatively authoritative' because it and other similar commands leave things out and otherwise mangle what you are seeing, sometimes in ways that cause real problems.)

I will skip theories about why the ZFS developers did not develop an API (either in Sun or later), partly because I am in a bad mood after writing this and so am inclined to be extremely cynical.


Comments on this page:

By James (trs80) at 2014-04-02 09:56:30:

The clear answer is the API was developed for Fishworks (the Sun ZFS storage appliances) and never backported to Solaris, because why would Sun want to let people make their own storage servers when you could buy Sun's?

One sidenote is the Fishworks shell was written in JavaScript, well before node.js popularized server-side JS.

By csby54 at 2014-04-02 13:47:55:

There is/was at least some upstream movement in the right direction: http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/

By cks at 2014-04-02 16:02:35:

Unfortunately the upstream movement seems to have either stalled or not gone anywhere. That entry you linked is from 2012 and I haven't heard of anything meaningful since then.

The other problem is that the sort of libzfs work that the entry is talking about is not a useful API for outside things because it doesn't stabilize what are currently internal pool properties like vdev and resilver statistics, error counts and information, and so on. A stable way of talking to the kernel to get information (or do actions) is not truly helpful if the information itself is not stable (which it very much has not been).

By Peter at 2014-04-29 18:55:59:

"stock ZFS doesn't even have an alerting solution for pool problems" - maybe I misunderstand, but this article from 2010 seems to address your pain: http://www.c0t0d0s0.org/archives/7053-New-Solaris-features-Notifications-from-the-Fault-Management-Architecture.html

Written on 02 April 2014.

Last modified: Wed Apr 2 00:12:03 2014