The future of OmniOS here if we can't get 10G-T working on it

December 22, 2014

When I wrote about our long road to getting 10G in production on OmniOS after our problems with it, I mentioned in an aside that the pessimistic version of when we might get our new fileserver environment back to 10G was 'never' and that that would have depressing consequences. Today I've decided to talk about them.

From the start, one of my concerns with Illumos has been hardware support. A failure to get our OmniOS fileservers back to 10G-T would almost certainly be a failure of hardware support, where either the ixgbe driver didn't get updated or the update didn't work well enough. It would also be specifically a failure to support 10G. Both have significant impacts on the future.

We can, I think, survive this generation of fileservers without 10G, although it will hurt (partly because it makes 10G much less useful in other parts of our infrastructure and partly because we spent a bunch of money on 10G hardware). I don't think we can survive the next generation without 10G; in four years 10G-T will likely be much more pervasive and I'm certainly hoping that big SSDs will be cheap enough that they'll become our primary storage. SSDs over 1G networking is, well, not really all that attractive; once you have SSD data rates, you really want better than 1G.

That basically means the next generation of fileservers could not be OmniOS (maybe unless we do something really crazy); we would have to move to something we felt would give us 10G and the good hardware support we hadn't gotten from Illumos. The possibility of going to a non-Illumos system in four years obviously drains off some amount of interest in investing lots of time in OmniOS now, because there would be relatively little long term payoff from that time. The more we think OmniOS is not going to be used in the next generation, the more we'd switch to running OmniOS purely in maintenance mode.

To some extent all of this kicks into play even if we can move OmniOS back to 10G but just not very fast. If it takes a year or two for OmniOS to get an ixgbe update, sure, it's nice to be running 10G-T for the remainder of the production lifetime of these fileservers, but it's not a good omen for the next generation because we'd certainly like more timely hardware support than that.

(And on bad omens for general hardware support, well, our version of OmnioOS doesn't even seem to support the 1G Broadcom ports on our Dell R210 servers.)

Sidebar: I'm not sure if Illumos needs more development in general

I was going to say that lagging hardware support could also be a bad omen for the pace of Illumos development in general, but I'm actually not sure if Illumos actually needs general development (from our perspective). Right now I'm assuming that the great 'ZFS block pointer rewrite' feature will never happen, and I'm honestly not sure if there's much other improvements we'd really care very much about. DTrace, the NFS server, and the iSCSI initiator do seem to work fine, I no longer expect ZFS to get any sort of API, and I don't think ZFS is missing any features that we care about very much (and we haven't particularly tripped over any bugs).

(ZFS is also the most likely thing to get further development attention and bugfixes, because it's currently one of the few big killer features of Illumos for many people.)

Written on 22 December 2014.
« Why Go's big virtual size for 64-bit programs makes sense
The security effects of tearing down my GRE tunnel on IPSec failure »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Mon Dec 22 23:44:57 2014
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.