Why I'm not looking for any alternatives to iSCSI for us

October 15, 2013

Broad scale distributed storage systems such as Ceph are an in thing these days (at least in some quarters). A while back a commenter on this entry suggested looking at them as an alternative to our use of iSCSI and I've been mulling over my reaction since then. Let me put it simply: my reaction is strongly negative. The short reason why is that I see no compelling benefits and all of the alternatives appear to involve more complexity and magic.

Let's assume that the setuid issue can be dealt with somehow (this is a basic prerequisite). First off, it's worth noting that ZFS plus iSCSI plus backends involves completely commodity hardware and (with Illumos) completely open source software; moving to something like Ceph gives no benefits there.

Our current ZFS plus iSCSI environment has simple components where we understand and can predict (at some level) basically everything that is going on. The distribution of data over physical backends and physical disks is not completely predictable (ZFS pools smear data across all of their components in somewhat unpredictable ways) but it is relatively so, as is the performance of the resulting bits. This is a feature for us. We very much do not want a big black box where magic happens and people's data is distributed over, well, something, somewhere.

I do not want to say that Ceph or other distributed storage systems are going to be black boxes, because I suspect that they aren't and I certainly don't have the experience to say one way or another. But what I can say is that I don't see any way in which they're going to be simpler than our current environment. No matter how you slice it we need filesystems inside pools of storage (that are fixed size but expandable) where those storage pools are mapped to some mirrored disk space. ZFS pools on disks is about as direct an expression of this as you can get and we know that it works and that we can manage it easily. I just don't see how a distributed storage system can do this even better, not without introducing magic that we don't want.

(Given the risks of switching away from a known-to-work environment, it's not enough for a distributed storage system to be just as good as our current system. It must be better, and not just a little bit better; it should be substantially and visibly better.)

PS: I'm not saying that distributed storage systems have no use. I can certainly see situations where something like our ZFS plus iSCSI environment would become unmanageably complex and inflexible, for example. But we are not operating anywhere near that scale today or in the foreseeable future.

Sidebar: ease of use versus magic

It's possible to imagine a distributed storage system that makes our environment easier to manage at one level. You could have this cloud of storage, a storage pool management layer that ensured that everything in it was mirrored, and a set of storage pools or filesystem groups on top of this (with quota or other size limits). Storage would be automatically managed and migrated and all sorts of good things.

The problem is that this system is much more magical and less predictable than our current environment. For instance, we might generally have no idea which storage pools or filesystems are using any particular chunk of storage because the system handles storage distribution for us. We don't consider this a feature, partly because we definitely want the ability to engineer our system so that certain sources of IO load are fenced off from other sources.
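To make the predictability point concrete, here is a minimal sketch of the kind of question an explicit layout lets us answer. Everything in it is hypothetical: the pool names, backend names, and disk names are invented for illustration and are not our real configuration. It assumes only what is described above, namely that each pool is built from mirrored pairs of disks on known backends.

```python
from collections import defaultdict

# Hypothetical layout (names are made up for illustration):
# each pool is a list of mirrors, and each side of a mirror
# is a (backend, disk) pair.
POOLS = {
    "fs-staff": [(("backend1", "disk03"), ("backend2", "disk03"))],
    "fs-grads": [
        (("backend1", "disk04"), ("backend2", "disk04")),
        (("backend3", "disk01"), ("backend4", "disk01")),
    ],
    "fs-heavy-io": [(("backend5", "disk01"), ("backend6", "disk01"))],
}


def backends_of(pools, name):
    """Return the set of backends that a single pool touches."""
    return {backend for mirror in pools[name] for backend, _disk in mirror}


def pools_by_backend(pools):
    """Map each backend to the set of pools that have disks on it."""
    usage = defaultdict(set)
    for name in pools:
        for backend in backends_of(pools, name):
            usage[backend].add(name)
    return dict(usage)


def shares_backend(pools, a, b):
    """True if pools a and b have any backend in common."""
    return bool(backends_of(pools, a) & backends_of(pools, b))
```

With a layout like this, "which pools are on this backend?" is a dictionary lookup, and checking that a heavy IO pool is fenced off from everything else is a matter of calling shares_backend() against each other pool. In a system that places data for you, the equivalent table may not exist in any fixed form.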


Comments on this page:

By James (trs80) at 2013-10-15 09:11:22:

Not that I am recommending it, but some people consider ATA over Ethernet better than iSCSI, including with ZFS.

I have to agree that the devil you know is worth a lot, particularly when you want it to just work and can't justify the investment for somewhat marginal improvements.

By cks at 2013-10-15 12:54:25:

A long time ago I looked at AoE and wound up feeling that the protocol had fundamental problems (written up here). At the time it certainly also had practical performance and other issues with the available initiator and target; those have hopefully been fixed by now, but I haven't looked at things since. My feeling is that AoE is highly unlikely to be better than iSCSI in our environment and I'm not interested in something that is merely at parity.

Written on 15 October 2013.


Last modified: Tue Oct 15 00:20:12 2013