2010-10-24
Why we built our own ZFS spares handling system
I mentioned recently that we've written our own system to handle ZFS spares. Before I describe it, I wanted to write up something about why we decided to go to the extreme measure of discarding all of ZFS's own spare handling and rolling our own.
First off, note that our environment is unusual. We have a lot of pools and a relatively complex SAN disk topology with at least three levels, as opposed to the more common environment of only a few pools and essentially undifferentiated disks. I expect that ZFS's current spares system works much better in the latter situation, especially if you don't have many spare disks.
Our issues with ZFS's current spare system include:
- it has outright bugs with shared spares, some of them fixed and others
not (we had our selfish pool, for example).
- because of how ZFS handles spares, we've seen
ZFS not activate spares in situations where we wanted them activated.
- ZFS has no concept of load limits on spares activations. This presents
us with an unenviable tradeoff: either we artificially limit the number
of spares we configure or we risk having our systems crushed under the
load of multiple simultaneous resilvers.
(We've seen this happen.)
- ZFS doesn't know how we want to handle the situation where there are
too few spares to replace all of the faulted disks; instead it will
just deploy spares essentially randomly. (This also combines with the
above issue, of course.)
- there's no way to tell ZFS about our multi-level disk topology, where there are definitely good and bad disks to replace a given faulted disk with.
Many of these are hard problems that involve local policy decisions, so I don't expect ZFS to solve them out of the box. Instead ZFS's current spares system deals with the common case; it just happens that the common case is not a good fit for our environment.
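As a rough illustration of the kind of local policy involved, here is a minimal sketch in Go. It is not our actual system. The Disk and Replacement types, the planReplacements function, the topology level names, and the rule of preferring the spare that shares the most topology with the faulted disk are all assumptions made up for the example. It shows two things: a cap on simultaneous resilvers, and a deliberate choice of which faulted disks get spares when there are not enough to go around (here the caller passes the faulted disks in priority order).

```go
package main

import "fmt"

// Disk is a hypothetical description of a disk. Path lists its position in
// the disk topology from the outermost level inward; the level names used
// in main are invented for this example.
type Disk struct {
	Name string
	Path []string
}

// Replacement pairs a faulted disk with the spare chosen for it.
type Replacement struct {
	Faulted, Spare string
}

// sharedPrefix counts how many leading topology levels two disks share.
func sharedPrefix(a, b Disk) int {
	n := 0
	for n < len(a.Path) && n < len(b.Path) && a.Path[n] == b.Path[n] {
		n++
	}
	return n
}

// planReplacements walks the faulted disks in the caller's priority order
// and assigns each one the unused spare that shares the most topology with
// it (an assumed policy), breaking ties by name. It stops as soon as the
// resilvers already active plus the planned ones reach maxResilvers, or
// when the spares run out.
func planReplacements(faulted, spares []Disk, active, maxResilvers int) []Replacement {
	var plan []Replacement
	used := make(map[string]bool)
	for _, f := range faulted {
		if active+len(plan) >= maxResilvers {
			break
		}
		best := -1
		for i, s := range spares {
			if used[s.Name] {
				continue
			}
			if best == -1 {
				best = i
				continue
			}
			cur, top := sharedPrefix(f, s), sharedPrefix(f, spares[best])
			if cur > top || (cur == top && s.Name < spares[best].Name) {
				best = i
			}
		}
		if best == -1 {
			break // no spares left
		}
		used[spares[best].Name] = true
		plan = append(plan, Replacement{f.Name, spares[best].Name})
	}
	return plan
}

func main() {
	faulted := []Disk{
		{"d1", []string{"fabricA", "backend1"}},
		{"d2", []string{"fabricB", "backend3"}},
		{"d3", []string{"fabricA", "backend2"}},
	}
	spares := []Disk{
		{"s1", []string{"fabricB", "backend3"}},
		{"s2", []string{"fabricA", "backend2"}},
		{"s3", []string{"fabricA", "backend1"}},
	}
	// One resilver is already running and the limit is three, so only two
	// of the three faulted disks get a spare on this pass.
	for _, r := range planReplacements(faulted, spares, 1, 3) {
		fmt.Printf("replace %s with %s\n", r.Faulted, r.Spare)
	}
}
```

With the sample data this prints "replace d1 with s3" and "replace d2 with s1"; d3 waits until a resilver finishes. A real policy might well want the opposite topology preference in some cases, which is exactly the sort of decision that has to be made locally.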
(I do fault ZFS for having no support for these sorts of local additions. I don't necessarily expect a nice modular plugin system, but it would be nice if ZFS had official interfaces for extracting information in ways that are useful for third party programs. But that's really another entry.)
Why I'm interested in Go
These days, my substantial programming takes place in one of two languages. I use Python if I'm dealing with something that doesn't have to run fast or use minimal memory, and I use C on the occasions when Python doesn't fit. I like C, but it is a very sparse and unforgiving language when compared to Python and this translates to slower and more annoying development most of the time; there are a lot of low-level details that I have to worry about when I just want to bang out some code that runs fast(er).
(This annoyance leads me to use Python for everything that it can even barely be made to fit.)
There are times when C is the right answer, but there are also a lot of times when what I want is in the middle between C and Python, problems where Python is too heavyweight and bare C is too low level. Part of why I'm interested in Go is that it seems to be the most promising candidate to fit in this niche on Unix. It promises relatively fast runtime and relatively little memory usage while still giving me garbage collection, convenient strings, hashes, and arrays, and a decent set of support modules. For me, this makes Go the attractive choice for writing various system-level programs like non-trivial network daemons.
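To make the "convenient strings, hashes, and arrays" part concrete, here is a small sketch. It is written against present-day Go rather than the pre-release Go of 2010, and the function names countWords and sortedKeys are just made up for the example. The equivalent in plain C would need a hand-rolled or borrowed hash table, explicit string splitting, and manual memory management.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// countWords tallies how often each whitespace-separated word appears,
// ignoring case. The map and the strings it holds are garbage collected.
func countWords(text string) map[string]int {
	counts := make(map[string]int)
	for _, w := range strings.Fields(strings.ToLower(text)) {
		counts[w]++
	}
	return counts
}

// sortedKeys returns the map's keys in sorted order, since Go's map
// iteration order is not defined.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	counts := countWords("The spare replaced the faulted disk")
	for _, w := range sortedKeys(counts) {
		fmt.Printf("%s %d\n", w, counts[w])
	}
}
```

Everything used here (maps, slices, and the fmt, sort, and strings packages) comes with the language and its standard library, so there is nothing to go hunting for.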
(There will be C libraries for all of the packages that Go comes with, but I'd have to go find them and that leads to the selection problem. And in general, syntax matters and Go has better syntax than C plus libraries.)
Another part of why I'm interested in Go is that it comes from a group of people (call them the Plan 9 crowd) who have created a whole bunch of interesting, good ideas that I've found attractive in the past. Sometimes the results are too purist for my tastes, but they've pretty much always been worth a look. And the way I look at languages is by actually using them for real work, so I have gone and dabbled in Go; ideally I would like to use Go to write a relatively substantial program that I'll actually use.
(It has to be a personal program, since I won't write production programs in obscure languages that my co-workers have never heard of and that don't (yet) come packaged on our Unix systems.)
Sidebar: why other candidates are out
Java is, right now, not a particularly great language to write Unix system programs in for at least two reasons. First, Java programs generally start slowly for the same reason that Python programs start slowly; they have to load and start the interpreter before they actually start the program. Second, my impression is that the JVM does not provide good access to Unix system facilities.
(These drawbacks apply to any number of interesting languages built on top of the JVM, and in general to any interpreter-based language. Slow startup is not an issue for long-running programs, but I don't write that many of them.)
The D language struck me as sort of interesting when I first heard about it but it doesn't seem to have cohered into a useful system (rather the reverse, in fact), and I'm not really interested in languages that aren't open source because there's very little chance that they will become popular on my platform of choice.
C++ has many of C's problems as far as language features go, just with somewhat nicer syntax once I find libraries and packages and so on that do what I want. And it doesn't have native garbage collection, which is one of the great programming speed accelerators.