When you have fileservers, they naturally become the center of the world

December 27, 2017

Every so often I spend a little bit of time thinking about how we might make some use of cloud computing, generally without coming up with anything meaningful, and then inevitably I wind up thinking about what makes it hard for us. So today I want to mention a little downside of having fileservers, which is that once you have fileservers they can easily become the center of your computing universe and then everything becomes tied to the fileservers.

To make this concrete, let's look at IMAP. When you build an IMAP server, you have to decide where people's IMAP folders will be stored. One option is storage that is dedicated to the IMAP server (or servers), such as locally attached disks or a dedicated little SAN. With a fileserver environment, another natural choice is to put the folders on the fileservers along with all your other data; this is especially attractive if you're already managing space there on a per-user or per-group basis, because then you don't have to allocate IMAP folder space to people or groups and can have it just come out of their existing space.

Now suppose you want to move your IMAP service into a cloud. If you opted to store the IMAP folders 'locally' to the IMAP servers, you can move the whole assemblage into the cloud in a fairly straightforward way. But if you chose to store IMAP folders on your existing fileservers, the actual data the IMAP server uses is entangled with the rest of the data on the fileservers (perhaps hopelessly so). You can't really move the service as a whole to the cloud, and moving the servers alone is probably a bad idea for all sorts of reasons.

(It's not just IMAP for us, of course; there are all sorts of services that are entangled with our fileservers because the data they use lives on the fileservers. Our web server is another obvious example.)
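
As a rough illustration of this entanglement, here is a minimal Python sketch that checks which services keep any of their data under a fileserver mount point and so can't be picked up and moved as a self-contained unit. Everything specific in it is a made-up assumption for illustration: the mount points, the service names, the data paths, and the helper functions are all hypothetical, not a description of our actual setup.

```python
from pathlib import PurePosixPath

# Hypothetical mount points served by the shared fileservers (assumption).
FILESERVER_MOUNTS = [PurePosixPath("/h"), PurePosixPath("/w")]

# Hypothetical services and where each one keeps its data (assumption).
SERVICE_DATA = {
    "imap-local": [PurePosixPath("/var/mail-store")],
    "imap-shared": [PurePosixPath("/h/users/mail")],
    "web": [PurePosixPath("/w/sites"), PurePosixPath("/var/log/httpd")],
}


def entangled_paths(paths, mounts):
    """Return the paths that are a fileserver mount or live under one."""
    return [
        p for p in paths
        if any(p == m or m in p.parents for m in mounts)
    ]


def movable_services(service_data, mounts):
    """Return the names of services with no data on the fileservers."""
    return sorted(
        name
        for name, paths in service_data.items()
        if not entangled_paths(paths, mounts)
    )


if __name__ == "__main__":
    for name, paths in sorted(SERVICE_DATA.items()):
        tied = entangled_paths(paths, FILESERVER_MOUNTS)
        status = "entangled via " + ", ".join(map(str, tied)) if tied else "self-contained"
        print(f"{name}: {status}")
```

In this toy inventory only the IMAP server with dedicated storage comes out as self-contained; the fileserver-backed IMAP server and the web server are both tied to the fileservers by a single data path each.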

At the same time, putting data on fileservers is not a bad thing; instead it's the completely natural thing. Holding and serving data is what they're there for and if we've done a competent job, they're quite good at that. Building, operating, backing up, monitoring, and managing space on a whole collection of little storage nodes is not the greatest idea in the world; it's redundant work and it adds all sorts of complications to everyone's life. And it's much easier for people if they can just get generic space that they can use for whatever they want, whether that be email messages, web data, home directories, data files for computations, or so on.

(In a sense, the entire reason you build general use fileservers is to make them the center of the computing universe. Well, at least in our somewhat unusual setup.)

Comments on this page:

I feel like "Cloud" is really just a server that someone else runs and is responsible for (possibly contractually with fangs). It's still a server which is likely located in someone else's data center.

In your example, it's possible to provide an IMAP server in the cloud, but as you say, enabling it to use the data stored on your existing file servers is … complicated.

Though arguably it's the IMAP server's responsibility to handle the data somehow, magically if it wants to. Thus, the new cloud-based IMAP server is responsible for the data, likely without connectivity to your file servers. This is certainly doable, albeit with some complications, like copying data from the old non-cloud IMAP server to the new cloud IMAP server. Fundamentally this is copying data from one (file) server to another.

I also have a problem with the idea that cloud usually implies a remote data center, which demands connectivity between your user base and said cloud server. This implies that users will likely lose connectivity if said link goes down. By comparison, users will likely still be able to access a local non-cloud server even if the Internet link goes down.

Written on 27 December 2017.


Last modified: Wed Dec 27 02:20:31 2017