2013-07-26
Communication is work and a corollary
Hopefully it is not news to anyone that communication takes work (which is to say that it takes time, effort, and attention, all of which are limited resources). After all, Fred Brooks was pointing this out in The Mythical Man-Month as one of the reasons that adding people to a late project generally makes it later (because when you add people you increase the amount of communication and coordination that needs to be done, which decreases the amount of actual work that everyone has time for).
There is an important corollary to this, which I will phrase as follows:
Creating new communication channels is making work for everyone who will be part of them.
(Or more compactly, communication creates work.)
Every time you invent a new forum or private IRC channel or website or local mailing list and expect people to pay attention to it, you are sticking them with more work. Participating in the channel or even passively following it will take some of their time and attention.
The obvious consequence is that unless your people are boredly twiddling their thumbs for part of the day, using a new channel means that they are going to have to drop something else they are already doing. The less obvious consequence is that people are going to reject the new channel if it doesn't deliver actual value to them, value that is worth the time it's taking up.
(If you feel that your people are wasting some of their time or using it inefficiently, time that can be redeployed to following your new channel, you have a problem that is hopefully obvious when I phrase it that way.)
There is no exception to this because there is no way to have a new channel take up zero time. The only way out is to make your new channel improve people's overall efficiency somehow so that on net they have more time even after your new channel takes a bite out of it.
(This can certainly be done and there are plenty of examples, but it is tricky and easy to get wrong. Also, sometimes what the channel is achieving is important enough to make people devote the time to it anyways even at the expense of dropping or slowing down some other work.)
2013-07-14
Why single vendor solutions are a hard sell
Yesterday I wrote in passing that it would probably be very hard to persuade us to go for a single vendor solution for our fileserver needs instead of one built on open standards with replaceable components. Today I feel like justifying that casual aside, even though some people will consider it obvious.
The simple version is that a (single) vendor solution requires more trust. Everything is in the hands of the vendor, it is generally very hard to inspect and verify the solution from the outside, and you are completely at the mercy of the vendor's quality of implementation. With a system built on open standards with multiple sources of the components you only have to trust that the standards themselves are decent (which to some extent can actually be tested and verified) and that some collection of people will support the standards with decent implementations. If an individual component supplier is not good enough you can swap them out without destroying the entire plan.
Both parts of this are important. Genuinely open standards give you (and other people) some chance of evaluating whether the standard is actually fit for the purpose, can perform decently on stuff you can afford, and so on. They also tend to attract multiple implementations that people are willing to talk about, so you can find out that yes, people have gotten them to work and work well. Replaceable components mean both that you have options if one part isn't up to what you want and that the people involved are hopefully being pushed by competition to improve things.
This shouldn't be a dogmatic approach, and indeed our current system only partially follows it. ZFS is not an 'open standard' and at the time we adopted it there was effectively only one vendor. On the other hand it was replaceable from a larger scale view and replacing it with something else wouldn't have required us to throw out any other parts of the overall system.
(That you don't need to throw out much or anything else if you have to replace one bit is of course a good sign of a modular system design.)
2013-07-13
What we need in our fileservers (in the abstract)
A few weeks ago, a commentator on one of my fileserver entries asked if we'd considered using Ceph instead of our ZFS plus iSCSI setup. My initial reaction was strongly and more or less reflexively negative, but for various reasons I've been thinking about the general issues involved off and on since then (partly because the timing for it is singularly good, since we have to migrate our data anyways). One of the first steps of any sort of semi-objective evaluation of options is to come up with a list of what general, abstracted features we need in our fileservers.
This is my list so far:
- Full Unix permissions on the machines that people use; our users
would (rightfully) lynch us if we took away all of the things
that moving to a real distributed filesystem costs you. I think that this requires
NFS to the 'fileservers', whatever those are.
- Traditional Unix filesystem semantics (more or less) and
performance, again on the machines that people use. People are
going to run all sorts of general Unix programs against this file
service; they will not be happy if some of them suddenly don't
work or perform really badly because (for example) database-style
random write IO performs poorly.
We can't make any particular predictions about how people will use our fileservice. Some people will make light use of it. Some people will append to files over and over again. Some people will do intensive streaming IO. Some people will run databases on it, doing lots of random read and/or write IO. Some people will unpack and shuffle around huge trees of files (updating git repositories, unpacking tar files, or whatever). We can't tell them 'don't do that, our fileservers aren't built for it'.
- No single point of (server) hardware failure that can take down
storage. I think a good way to put this is that we should be able
to paper over any server dying even if we're working remotely and
can't touch the physical hardware (beyond forcing a hard power off).
Our current environment has this property; we can fail over
(virtual) fileservers if the physical server dies, and if an iSCSI
backend dies we can re-mirror its storage onto our hot spare
backend.
This implies that storage can't be tightly tied to 'fileservers' because a dead server would then require physical work to shift disks, connectors, or what have you to another server.
- Adding more storage and file service can't cost too much; we have
to always be able to buy them in relatively inexpensive units even
if we're full up at the moment. This implies that the individual
components can't be too expensive.
- It has to be possible to update and replace hardware without
the users noticing (possibly apart from minor downtime for some
changes). Migrating storage from one set of disks on one set of
machines to another set of disks on another set of machines should
not involve user-visible downtime. Nor should expanding the space
available to any particular set of users.
(The more changes and so on that can be done without users noticing, the better.)
- Two levels of space limitations and space reservations, however
that's accomplished. We need one level because for better or worse
our model of providing storage is to sell space to groups and
professors, which means that we need to be able to sell them X
amount of guaranteed space that they can use. We need a second
level within that in order to limit the size of 'backup entities'
and to reserve space for specific people within a group.
(Our current implementation uses ZFS pools and ZFS filesystems (with quotas and reservations) to provide the two levels; see the second sketch at the end of this entry.)
- We need to be able to allocate top-level space to people in units
smaller than 'one (replicated) physical disk'. Among other reasons,
physical disk sizes change over time.
(Today this is done by slicing physical disks into fixed-size logical chunks and exposing those chunks via iSCSI; see the first sketch at the end of this entry.)
- In short, flexible space management; the 'filesystems' within the
'pools' should not need to have space preallocated to them. The
fact that free space is shared and flows between the filesystems in
a pool has been a major win in our current environment, because it
means that people simply buy generic space and don't have to
carefully plan out where it goes and who gets it and then
re-balance it as needs and space usage change. We would be very
reluctant to give this up.
- Data integrity checksums and resilient handling of disk errors.
To summarize the issue: they need to be as good as ZFS's.
- Space allocation and IO patterns that we can understand and analyze.
We're not interested in shoveling a bunch of disks on storage servers
into a great big cloud and having theoretically generic fileservice
come out the other side; we need to be able to understand, control,
and monitor which 'pools' are putting their data where.
(And not all of our disks will be uniform. We'll likely put some specific storage on SSDs but most of it will be on good old inexpensive spinning rust.)
- In general we need to understand how the whole system works and why
it should perform well, survive explosions, and so on. 'And then
complex magic happens' makes us nervous and unhappy.
- Confidence that what we pick has a good chance of being around and working well in, say, ten years' time. No one can guarantee anything, but turning over an entire fileserver environment is very painful and we want to at least have good confidence that we won't have to do it any time soon.
This is all abstract (and hopefully high level) because I'm trying to be open-mindedly generic rather than viewing everything through the limiting goggles of our current fileserver solution and its model of how the storage world can be structured.
I'm reluctant to consider 'source available' or even 'open source' as a strict requirement, but at this point it might be very hard to persuade us that a closed alternative was enough better to overcome the substantial disadvantage of not having source code. Real source code makes it much easier to understand and inspect systems and we've repeatedly found this to be very important. Similar things apply for monolithic single vendor solutions as compared to solutions built on open standards with replaceable components.
(I'm assuming that everything today will deliver basic features like fault and problem monitoring and the ability to be driven from a Unix command line environment.)
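As a concrete aside on the chunk and failover notes above, here is a rough sketch of how the current mirroring arrangement can be expressed in ZFS terms. It is only an illustration, not our actual tooling: the pool, backend, and chunk device names are all made up, the chunks are really iSCSI LUNs exported by separate backend machines, and in real life you would run the commands instead of printing them. The zpool operations themselves ('zpool create' with mirror vdevs and 'zpool replace') are standard ZFS.

    #!/usr/bin/env python3
    # Sketch of building a pool out of fixed-size chunks that are mirrored
    # across two different iSCSI backends, so that losing one backend only
    # degrades the mirrors instead of taking storage down. All names are
    # hypothetical; this prints the zpool commands rather than running them.

    pool = "grouppool1"  # one pool per block of space a group has bought

    # Hypothetical chunk devices, as exported by two separate backends.
    chunks_backend_a = ["/dev/dsk/backendA-chunk0", "/dev/dsk/backendA-chunk1"]
    chunks_backend_b = ["/dev/dsk/backendB-chunk0", "/dev/dsk/backendB-chunk1"]

    # Each vdev is a mirror pair with one side on each backend.
    vdevs = []
    for a, b in zip(chunks_backend_a, chunks_backend_b):
        vdevs += ["mirror", a, b]
    print("zpool create", pool, " ".join(vdevs))

    # If backend B dies, re-mirror its side of every pair onto chunks from
    # the hot spare backend (again, hypothetical device names).
    spare_chunks = ["/dev/dsk/spare-chunk0", "/dev/dsk/spare-chunk1"]
    for dead, spare in zip(chunks_backend_b, spare_chunks):
        print("zpool replace", pool, dead, spare)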
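And here is a similarly rough sketch of the two levels of space control on top of such a pool. The pool's size is the first level, the guaranteed space a group bought; ZFS filesystems inside it are the second level, using the standard ZFS 'quota' and 'reservation' properties to cap backup entities and to guarantee space to specific people. The filesystem names and sizes are invented for illustration.

    #!/usr/bin/env python3
    # Sketch of the second level of space management inside a pool:
    # per-filesystem quotas cap how large a 'backup entity' can grow and
    # reservations guarantee space to specific people even if the rest of
    # the pool fills up. Names and sizes are hypothetical; this prints the
    # zfs commands rather than running them.

    pool = "grouppool1"

    filesystems = [
        ("homes",     {"quota": "400g"}),                        # cap this backup entity
        ("prof-data", {"quota": "1t"}),
        ("grad-jane", {"quota": "200g", "reservation": "50g"}),  # guarantee 50 GB
    ]

    for name, props in filesystems:
        opts = " ".join(f"-o {prop}={value}" for prop, value in props.items())
        print(f"zfs create {opts} {pool}/{name}")

The flexible space management in the list falls out of this arrangement more or less for free: filesystems only consume space as they use it, so any space in the pool that is neither used nor reserved flows to whichever filesystem happens to need it.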