Wandering Thoughts

2018-02-16

How I tend to label bad hardware

Every so often I wind up dealing with some piece of hardware that's bad, questionable, or apparently flaky. Hard disks are certainly the most common thing, but the most recent case was a 10G-T network card that didn't like coming up at 10G. For a long time I was sort of casual about how I handled these; generally I'd set them aside with at most a Post-it note or the like. As you might suspect, this didn't always work out so great.

These days I have mostly switched over to doing this better. We have a labelmaker (as everyone should), so any time I wind up with some piece of hardware I don't trust any more, I stick a label on it to mark it and say something about the issue. Labels that have to go on hardware can only be so big (unless I want to wrap the label all over whatever it is), so I don't try to put a full explanation; instead, my goal is to put enough information on the label so I can go find more information.

My current style of label looks broadly like this (and there's a flaw in this label):

volary 2018-02-12
no 10g problem

The three important elements are the name of the server the hardware came from (or was in when we ran into problems), the date, and some brief note about what the problem was. Given the date (and the machine) I can probably find more details in our email archives, and the remaining text hopefully jogs my memory and helps confirm that we've found the right thing in the archives.

As my co-workers gently pointed out, the specific extra text on this label is less than ideal. I knew what it meant, but my co-workers could reasonably read it as 'no problem with 10G' instead of the intended meaning of 'no 10g link', ie the card wouldn't run a port at 10G when connected to our 10G switches. My takeaway is that it's always worth re-reading a planned label and asking myself if it could be misread.

A corollary to labeling bad hardware is that I should also label good hardware that I just happen to have sitting around. That way I can know right away that it's good (and perhaps why it's sitting around). The actual work of making a label and putting it on might also cause me to recycle the hardware into our pool of stuff, instead of leaving it sitting somewhere on my desk.

(This assumes that we're not deliberately holding the disks or whatever back in case we turn out to need them in their current state. For example, sometimes we pull servers out of service but don't immediately erase their disks, since we might need to bring them back.)

Many years ago I wrote about labeling bad disks that you pull out of servers. As demonstrated here, this seems to be a lesson that I keep learning over and over again, and then backsliding on for various reasons (mostly that it's a bit of extra work to make labels and stick them on, and sometimes it irrationally feels wasteful).

PS: I did eventually re-learn the lesson to label the disks in your machines. All of the disks in my current office workstation are visibly labeled so I can tell which is which without having to pull them out to check the model and serial number.

LabelingBadHardware written at 00:52:35

2018-02-11

Sending emails to your inbox is a dangerous default

I tweeted:

One of the things I have to keep learning over and over again about email is that I should not let so many things bother me by showing up in my inbox. Even relatively low-volume things.

(I can filter or I can eliminate the email, depending on the situation.)

It starts innocently enough. You start getting some new sort of email (perhaps you sign up for it, maybe it's an existing service sending new email, or perhaps it's a new type of notification that you've been auto-included in). It's low volume and reasonably important or useful or at least interesting. But it's a drip. Often it ramps up over time, and in any case there are a lot of sources of such drips so collectively they add up.

In the process of planning an entry about dealing with this, I've come to the obvious realization that one important part here is that new email almost always defaults to going to your inbox. When it goes to your inbox, two things happen. First, it gets mixed up with everything else and you have to disentangle it any time you look at your inbox. Second, by default it interrupts you when it comes in. Sure, I may have some tricks for avoiding significant interruptions from new email, but it still partly interrupts me (I have to look at the subject at least), and unless I'm very busy there's always the temptation to read it right now just so that I can throw it away (or file it away).

(Avoiding that interruption in the first place is not an option for two reasons. First, part of my job as a sysadmin is to be interrupted by sufficiently important issues. Second, I genuinely want to read some email right away; it's important or I'm expecting it or I'm looking forward to it.)

It's certainly possible to move email so it doesn't wind up in my inbox, but as long as the default is for email to go to my inbox, stuff is going to keep creeping in. It's inevitable because people follow the path of least resistance; when it takes more work to filter things out (and requires a sample email and some guesses as to what to match on and so on), we don't always do that extra work.

(And that's the right tradeoff, too, at least some of the time. One email a year or even a month probably is not worth the time to set up a filter for. Maybe not even one email a week, depending.)
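
As an illustration, shunting one of these drips aside is only a few lines in something like procmail (a made-up sketch; the sender address and folder name are whatever applies to your situation):

# file a low-volume automated notification into its own
# folder instead of letting it land in the inbox
:0:
* ^From:.*notifications@example\.com
notifications

The cost is exactly the extra work mentioned above: finding a sample email and making some guesses about what to match on.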

If email defaulted to not coming to my inbox and had to be filtered in, my email life would be a very different place. There are drawbacks to this, so in practice probably the easiest way to arrange it is to have different email accounts with different inboxes that have different degrees of priority (and that you check at different times and so on).

(Of course this is where my email mistake bites me in the rear. I don't have the separate email accounts that other people often do; I would have to set up new ones and shift things over. This is something I'll have to do someday, but I keep deferring it because of the various pains involved.)

PS: There are also practical drawbacks to shifting (some) email out of your inbox, in that unless you're very diligent it increases the odds that the email won't get dealt with because you just don't get around to looking at it. This is certainly happening with some of the email that I've moved out of my inbox; I'll get to it someday, probably, but not right now.

InboxDangerousDefault written at 23:07:37

2018-01-19

I'm one of those people who never log out from their desktop

Only crazy people log out from their desktop every time they step away from it for moderate amounts of time. Whether you're leaving to get lunch or to go to a long meeting, sensible people just lock the screen (something that I've deliberately made very easy in my X setup). But my impression is that a fair number of people log out at the end of the day, or at least the end of the week.

I'm not one of those people; not at home and especially not at work. With rare exceptions, I log in when I boot up my machine and then I stay logged in until I'm going to reboot it (and then I log right back in again). When I leave, whether for an evening, the weekend, or the university's multi-week winter break, I just lock my X session (which at least purges my SSH keys). As a sysadmin who cares about security to some degree, this can feel a bit embarrassing; it would probably be moderately more secure to actually log off my office machine every night and log in again every morning.

(At home there's less reason to worry about the security issues and I use my desktop every day.)

A large part of why I do this is simply that I'm lazy. Both locking and unlocking my screen are a lot faster than logging out (in an orderly way) and then starting up my X session all over again. While I've automated a fair amount of starting my X session, there's still a number of manual steps involved (for example, I start some programs by hand and manually place their windows). The whole thing is enough of a hassle that I don't feel inclined to do it more often than I really have to. It also takes a bit of time, for various reasons; even if everything magically started automatically, it would probably take sixty seconds or so until my desktop was all up and running.

(Logging out requires a bit of work because things like Firefox are much happier if I shut them down in an orderly way instead of just yanking the X session out from underneath them.)

A certain amount of this manual startup work is because I've added a few more 'always present' windows but haven't gotten around to adding them to my startup script. Some of them are a bit awkward to automate (because they are really 'start an xterm with a shell, then run a command in the shell'), but I could probably glue something together. Other programs have to be started by hand because they provide no way to specify things like where to place their windows or that they should start iconified (with their icon in a specific spot). Possibly I could arrange a sufficiently complicated set of supporting scripts to automate this (using things like wmctrl), but just not logging out is a lot easier.
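
As a sketch of what automating one of these awkward cases with wmctrl might look like (the window title, command, and coordinates here are all made up, and the sleep is a crude way of waiting for the window to appear):

#!/bin/sh
# start an xterm that runs a command in a shell, then place its window
xterm -T scratch -e sh -c 'mycommand; exec $SHELL' &
sleep 1
# wmctrl's -e argument is gravity,x,y,width,height; -1 leaves a dimension alone
wmctrl -r scratch -e 0,1200,0,-1,-1

Gluing together a pile of these would probably work, but writing and debugging them all is clearly more effort than simply never logging out.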

Staying logged in all of the time has some interesting consequences. The obvious one is that I normally keep all of my regular X programs running continuously for days on end (and sometimes weeks). Unsurprisingly, programs do not always expect this or handle it perfectly. Even when a program doesn't have issues with running for a long time, it may do somewhat inconvenient things like only loading certain information at startup.

(Awkwardly, one of the programs I use with this 'only on startup' flaw is one that I wrote myself. My excuse is that it was by far the easiest way to code that particular feature, the data involved doesn't change often, and I can always restart the program if I need to. Still, I should probably fix this someday.)

StayingLoggedOn written at 19:11:58

2018-01-14

Our small tools for running commands on multiple machines

A while back I wrote about the personal shell scripts I had for running commands on multiple machines. At the time, they were only personal scripts that I used myself; however, over time they kept informally creeping into worklog entries that documented what we actually did, and even into some shell scripts we have that pre-write the commands we need for convoluted operations like migrating ZFS filesystems from server to server. Eventually we decided to adopt them as actual official scripts, put in our central location for such scripts.

My own versions were sort of slapped together, especially the machines script, which prints out the names of machines that fall into various categories, so making them into production-worthy tools meant cleaning that up. The oneach script needed only moderate reforms and as a result the new version is only slightly improved over my old personal version; in day to day usage, I probably couldn't notice any difference if I switched back to using my old one.

(The big difference is that the production version has more options for things like extra verbosity and a dryrun mode that just reports the ssh commands that would be run.)
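
The heart of an oneach is small enough to sketch from memory (this is a from-scratch simplification, not our actual script, and it has none of the option handling):

#!/bin/sh
# run a command on each host named before the '--'
# usage: oneach host [host ...] -- command [args ...]
hosts=""
while [ $# -gt 0 ] && [ "$1" != "--" ]; do
    hosts="$hosts $1"; shift
done
shift
for h in $hosts; do
    echo "== $h"
    ssh -n "$h" "$@"
done

Everything else in the production version is convenience layered on top of this basic loop.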

The machines command got completely redone from scratch, because I realized that my hack approach just wouldn't work. For a start, I couldn't ask my co-workers to edit a script every time we added a machine; there would have been a revolt. So I wrote a new version in Python that parsed a configuration file. This new production version is a drastic improvement over my shell script hack; because I wrote it in Python, I was able to include significantly more features, in addition to making it more convenient and regular (since it's parsing a configuration file). The most important one is support for 'AND' and 'EXCEPT' operations, so you can express machine categories like 'all machines with some feature that are also Ubuntu 16.04 machines' or 'all Ubuntu 14.04 machines except ...'. This is supported both in the configuration file, where it sees a little bit of use, and on the command line, where I take advantage of it periodically.

(The configuration file format is nothing special and basically duplicates what I've seen other similar programs use. Although I didn't consciously set out to duplicate their approach, it feels like we wound up in the same spot because there are only so many good solutions for the problem.)

Using a configuration file doesn't just make things more convenient and maintainable; it also makes them more consistent, in several senses. It's now much harder for me to accidentally forget to add machines to categories they should be in (or not remove them from categories that no longer apply). A good part of the reason is that the configuration file is mostly inverted from how my script used to do it. Rather than list machines that are in categories, it mostly lists the categories that a machine is in:

apps0    apps  ubuntu1604  allnfs  users

There are a few categories that are explicitly specified, but even then they tend to be in terms of other categories:

all=ubuntu1604 ubuntu1404

This approach wouldn't have been feasible in my original simple shell script, but it's a natural one once you have a configuration file (especially if you want to make adding new machines easy and obvious; for the most part you can copy an existing line and change the initial host name).
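
To show how simple the basic lookup can be once the configuration file exists, here's a stripped-down sketch (it handles only plain categories, not the AND/EXCEPT operators or the explicit 'name=...' definitions, and the file path is made up):

#!/bin/sh
# print the hosts in a category, given config lines of the form
# 'hostname category category ...'
awk -v cat="$1" '
/^#/ || NF == 0 { next }
{ for (i = 2; i <= NF; i++) if ($i == cat) { print $1; break } }
' /usr/local/etc/machines.conf

The real work is in parsing the operators and the explicit category definitions, which is where Python earned its keep.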

In theory I could have done all of these improvements in my own personal versions, and writing the Python version of machines didn't take too long (even writing a Go version for my own use only added a modest amount of time). In practice it took the push of knowing that these had to now be generally usable and maintainable by my co-workers to get me to spend the time. Would it have been wrong to spend the time on this when they were just personal scripts? Probably, and even if not I doubt I could have persuaded myself of that. After all, they worked well enough as they were originally.

ToolsOneachII written at 02:15:11

2018-01-04

The goals and problems of our Dovecot IMAP configuration migration

We have a long standing backwards compatibility issue with our IMAP server, which is that we have it configured so that the root of the IMAP mail folder storage is people's $HOME. Originally this led to Dovecot session hangs, but now it's led to running out of inodes on the Dovecot server machine and general NFS load as people's Dovecot sessions rummage all through their home directories on our fileservers. Today I'm going to talk about our ideal IMAP configuration, the problems of trying to migrate to it, and then some thoughts on what we might settle for.

(In other words, our Dovecot configuration currently sets mail_location to 'mbox:%h:INBOX=/var/mail/%u'.)

If I could wave a magic wand, the Dovecot IMAP configuration we want is simply one where all mail folders are stored under some directory in people's home directories, say $HOME/mail, and IMAP clients wouldn't need or want an IMAP path prefix. In this world, if you had a mail folder for private stuff your client would know it as the PrivateStuff folder and it would be stored in $HOME/mail/PrivateStuff. Should your IMAP client do a 'LIST "" "*"' operation, Dovecot would only traverse through $HOME/mail and everything under it, not all of $HOME.
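
In Dovecot terms, this ideal world is almost a one-line change from our current setting (written from the documented mail_location format, not tested):

# current:
mail_location = mbox:%h:INBOX=/var/mail/%u
# ideal:
mail_location = mbox:%h/mail:INBOX=/var/mail/%u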

There are three problems with migrating from our current configuration to this setup. First, there's all of people's current mail folders that are in places outside of $HOME/mail, which must be moved into $HOME/mail in order to stay accessible via IMAP. Second, even for people who have their mail folders in $HOME/mail, their clients know them under a different IMAP path; right now an existing $HOME/mail/PrivateStuff would be known to the IMAP client as mail/PrivateStuff (even if the client hides this from you by having an IMAP path prefix), whereas in the new world it would be known as just PrivateStuff. Finally, some people have their clients set up with an IMAP path prefix, which would have to be removed to get our ideal setup (even if their IMAP path prefix is currently mail because they're storing everything under $HOME/mail right now).

There are some ways to improve these. First, if people are willing to accept extra directories under $HOME/mail, they can avoid needing any client changes even if they were already putting all of their mail folders in a subdirectory; all you do is preserve the subdirectory structure when you move mail folders around. If you currently have $HOME/Mail/PrivateStuff, move it to $HOME/mail/Mail/PrivateStuff instead of $HOME/mail/PrivateStuff. You wind up with the theoretically surplus $HOME/mail/Mail directory (and a 'Mail/' IMAP path component), but all your clients can continue on as they are.

Second, if a person uses IMAP subscriptions so that the server-stored subscription information knows all of their mail folders, we can reliably move all of them into the same hierarchy under $HOME/mail using only server-side information. Server side mail processing in things like .procmailrc may need to be updated, however, since the actual Unix paths will obviously change. Also, use of IMAP subscriptions (and the IMAP LSUB command) is far from universal among IMAP clients (as I've discovered). As far as I know, Dovecot doesn't provide a way to log information about what mail folders are SELECT'd, so we can't determine what actual mail folders exist through tracking client activity; it's server side IMAP subscriptions or nothing.
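
Given trustworthy server-side subscriptions, the mechanics of the move itself can be sketched in shell. This is hypothetical and deliberately simplistic; it assumes that Dovecot's mbox layout keeps the subscription list in $HOME/.subscriptions with one folder name per line, and that folder names map directly to Unix paths:

#!/bin/sh
# move every subscribed mail folder under $HOME into $HOME/mail,
# preserving any hierarchy that's already in the folder names
cd "$HOME" || exit 1
mkdir -p mail
while IFS= read -r folder; do
    [ -e "$folder" ] || continue    # skips INBOX and anything missing
    mkdir -p "mail/$(dirname "$folder")"
    mv "$folder" "mail/$folder"
done < .subscriptions

(The subscriptions file itself would also have to move to the new mail root, which is the sort of detail a real version would need to get right.)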

So far I've described the configuration that we want, not necessarily the configuration we're willing to settle for. So what are our actual minimum goals? While we haven't actively discussed this, I think what we'd settle for is an end state configuration where IMAP clients can't search through all of $HOME or store mail folders anywhere outside a small set of subdirectories under it. We could live with a configuration where mail folders could be in any of $HOME/mail, $HOME/IMAP, $HOME/Mail, and a few others. We can also live with these being visible in the IMAP mail folder names that clients use, so that instead of seeing a folder called PrivateStuff in your client, you either see mail/PrivateStuff (or Mail/PrivateStuff or so on), or you set an IMAP path prefix in your client to hide it.

Sam Hathaway's comment on my entry on IMAP paths in clients and servers brought Dovecot's namespaces to my attention, especially the backwards compatibility examples. I don't think these can be used to migrate towards our ideal configuration, but it's possible they could be used to create something like what we're willing to settle for.

(They could also be used to strip out prefixes from the IMAP paths that clients send us, but in our specific situation I don't think there's much point in doing that. The hard part is getting people's mail folders under $HOME/mail, and we don't really care if their path there winds up being eg $HOME/mail/Mail/<something>.)
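
Based on the namespace examples, the configuration we'd settle for might look vaguely like this, with one namespace per blessed subdirectory (completely untested, and I may well have details of the syntax wrong):

namespace mail {
  prefix = mail/
  separator = /
  inbox = yes
  location = mbox:%h/mail:INBOX=/var/mail/%u
}
namespace oldmail {
  prefix = Mail/
  separator = /
  location = mbox:%h/Mail
}

Clients would see the mail/ or Mail/ prefixes in folder names, which is the tradeoff we've already said we can live with.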

Sidebar: A brief note on the mechanics of migration

For various reasons, we have no intention of operating a single Dovecot server with different configurations for different users (ie, with some migrated to the new, confined configuration and others using the old one). Instead we'd do the migration by building an entire new IMAP server under a new name with the new configuration, and then telling people what they had to do to switch over to using it. New people would be pointed to the new server (and blocked from using the old one), while existing people would be encouraged and perhaps helped to migrate. Eventually we'd be down to a few stubborn holdouts and then we'd give them no choice by turning the old server off.

Conveniently the current IMAP server is running Ubuntu 14.04, which means that it has a natural remaining lifetime of about a year and a quarter. This is perhaps enough time to actually get everyone migrated without too much pain.

(Then in a year or two more we'd quietly switch back to using the old IMAP server name, because it really is the best name for an IMAP server.)

IMAPMigrationGoalsProblems written at 00:20:56

2017-12-31

Understanding IMAP path prefixes in clients and servers

Suppose you have some IMAP clients and they talk to an IMAP server which stores mailboxes somewhere in the filesystem under people's home directories (let's call this the IMAP root for a user). One of the complications of talking about where people's mailboxes and folders actually wind up in this environment is that both the clients and the server get to contribute their two cents, but how they manifest is different.

(As a disclaimer, I'm probably abusing IMAP related terminology here in ways that aren't proper and that I'd fix if I actually ever read up on the details of the IMAP protocol and what it calls things.)

To start with, the IMAP protocol has the concept of a hierarchy of folders and mailboxes, rooted at /. This hierarchy is an abstract thing; it's how clients name things to the server (and how they traverse the namespace with operations like LIST and LSUB). The IMAP server may implement this hierarchical namespace however it wants, using whatever internal names for things that it wants to (provided that it can map back and forth between internal names and the protocol level ones known by clients and named in the IMAP subscriptions and so on). Even when an IMAP server stores this IMAP protocol namespace in the filesystem, it may or may not use the client names for things. For now, let's assume that our IMAP server does.

Many IMAP clients have in their advanced configuration options an option for something like an 'IMAP Path Prefix' or an 'IMAP server directory', to use the names that iOS and Thunderbird respectively use for this. This is what it sort of sounds like; it basically causes the IMAP client to use this folder (or series of folders) as a prefix on all of the mailbox and folder names it uses, making it into the root of the IMAP namespace instead of /. If you set this in the client to IMail and have a mailbox that you call 'Private' in the client, the actual name of the mailbox in the IMAP protocol is IMail/Private. Your client simply puts the IMail on the front when it's talking to the server and takes it back off when it gets stuff back and presents this to you.

A client that has an IMAP path prefix and uses LIST will normally only ask for listings of things under its path prefix, because that's what you told it to do. What's visible under the true IMAP root is irrelevant to such a client; it will always confine itself to the path prefix. In our filesystem-backed IMAP server, this means that the client is voluntarily confining itself to a subdirectory of wherever the IMAP server stores things in the filesystem and it doesn't care (and won't notice) what's outside of that subdirectory.
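
Concretely, the difference shows up in the LIST commands the client sends; roughly:

a1 LIST "" "IMail/*"     (client configured with an 'IMail' path prefix)
a2 LIST "" "*"           (client listing from the true IMAP root)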

On the server side, the IMAP server might be configured (as ours sadly is) to store folders and mailboxes straight under $HOME, or it might be configured to store them starting in a subdirectory, say $HOME/IMAP. This mapping from the IMAP protocol directory hierarchy used by clients to a directory tree somewhere in the filesystem is very much like how a HTTP server maps from URLs to filesystem locations under its document root (although in the case of the IMAP server, there is a different 'IMAP root' for every user). A properly implemented IMAP server doesn't allow clients to escape outside of this IMAP root through clever tricks like asking for '..', although it may be willing to follow symlinks in the filesystem that lead outside of it.

(As far as I know, such symlinks can't be created through the IMAP protocol, so they must be set up by outside means such as the user sshing in to the IMAP server machine and making a symlink by hand. Of course, with fileservers and shared home directories, that can be any of our Linux servers.)

Using an IMAP path prefix in your client is a good thing if the server's IMAP root is, say, $HOME, since there are probably a great many things there that aren't actually mailboxes and mail folders and that will only confuse your client (and complicate its listing of actual interesting mailboxes) if it looks at them by asking for a listing of /, the root of the IMAP namespace. With an IMAP path prefix configured, your client will always look at a subdirectory of $HOME where you'll presumably only have mailboxes and so on.

The IMAP server is basically oblivious to the use of a client side IMAP path prefix and can't exert any control over it. The client never explicitly tells the server 'I'm using this path prefix'; all the server sees is that the client only ever does operations on things with some prefix.

The net result of this is that you can't transparently replace the use of a client side IMAP path prefix with the equivalent server side change in where the IMAP root is. If you start out with a client IMAP path prefix of IMail and a server IMAP root of $HOME, and then change to a server IMAP root of $HOME/IMail, the client will still try to access IMail/Private, the server will translate this to $HOME/IMail/IMail/Private, and things will probably be sad. To make this work, either you need to move things at the Unix filesystem level or people have to change their IMAP clients to take out the IMAP path prefix.

To make this perhaps a little bit clearer, here is a table of the various pieces and the resulting Unix path that gets formed once all the bits have been put together.

Server IMAP root    Client IMAP prefix    Client folder    Unix path
$HOME               <none>                Private          $HOME/Private
$HOME               <none>                IMail/Private    $HOME/IMail/Private
$HOME               IMail                 Private          $HOME/IMail/Private
$HOME/IMail         IMail                 Private          $HOME/IMail/IMail/Private
$HOME/IMail         <none>                Private          $HOME/IMail/Private

For a given server IMAP root, it doesn't matter whether the client forms the (sub)folder name explicitly or through use of a client IMAP path prefix. If you use multiple clients and only some of them are set up with your IMAP path prefix, clients configured with the prefix will see folder names with the prefix stripped off and other clients will see the full (IMAP protocol) folder path; this is the second and third lines of the table.

(If all of your clients respect IMAP subscriptions, the server may not be able to tell whether or not any particular one of them has an IMAP path prefix configured, or if it's just dutifully following the subscriptions (which are of course all inside the IMAP path prefix you have configured on some clients).)

(This is one of the entries I write partly to get all of this straight in my head.)

IMAPPrefixesClientAndServer written at 01:14:58

2017-12-28

How our IMAP server wound up running out of inodes

On Twitter, I mentioned that we'd run out of inodes on a server, and then a few weeks later I made a comment about an IMAP feature:

I'm coming to really dislike IMAP clients that don't use subscriptions, even though the consequences for our server are sort of our own fault.

These two tweets are very closely related, and there is a sad story here (since it's sort of our own fault).

In the IMAP protocol, there are two ways to get a list of mailboxes and folders that you have; the LIST command and the LSUB command. The difference between the two is that LSUB restricts itself to things that you have SUBSCRIBE'd to (another IMAP command), while the LIST command just lists, well, everything that the IMAP server can discover. When the IMAP server is backed by some sort of database, that 'what it can discover' comes from the database engine; when the IMAP server is storing things in the filesystem as a directory hierarchy, that just translates to a directory listing.
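
On the wire the two commands look nearly identical; the entire difference is in what the server consults to answer them:

a1 LIST "" "*"     (whatever the server can discover)
a2 LSUB "" "*"     (only what this user has SUBSCRIBE'd to)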


Many IMAP clients use IMAP subscriptions both to track what folders they know about and synchronize the list of known folders between clients, since your IMAP subscriptions are remembered by the server and stored there. However, some clients can't be bothered with this; they simply use LIST to ask the IMAP server for absolutely everything (and presumably then show some or all of it to you).

Even when your IMAP server is storing mailboxes and folders in the filesystem, the difference between LIST and LSUB is normally not particularly important because the IMAP server is normally using an area that's only for mailboxes, and the only thing normally found there is mailboxes. Then, unfortunately, there's us. Due to the ongoing requirements of backwards compatibility, the root of our IMAP server's mailbox storage is people's $HOME. It is quite possible for people's $HOME to contain a lot of things that aren't mailboxes and mail folders, at which point the difference between LIST and LSUB becomes very important to us. If a client uses IMAP subscriptions, what else is in $HOME doesn't matter; the client will only try to look through things you've subscribed to, which are presumably actually mailboxes (and limited). But if the client ignores IMAP subscriptions and just uses LIST, it winds up trying to look through everything, and then when it finds directories, it recurses down through them in turn.

A year and a half ago, our problem was runaway LIST searches that either ran into symlink cycles or escaped into the wider filesystem, hanging Dovecot and hammering our fileservers. That's basically stopped being a problem. Today's problem is that some people who use these clients have fairly large $HOMEs, with things like significant version-controlled source trees and datasets with lots of files and subdirectories. Dovecot maintains index files in a directory hierarchy for every mailbox and mail folder that it knows about; when a client uses LIST recursively, this translates to 'at least every directory that Dovecot runs across'. We have Dovecot store its indexes on the IMAP server's local mirrored system disks, because that's a lot faster than getting them over NFS.

This is how we wound up running out of inodes on our IMAP server. Dovecot was just trying to store too many index files and directories. Discarding people's index data didn't help for long, because of course their clients did it again and recreated it all after a few days.

(Our short term brute force solution was to put in a larger set of SSDs and create a partition just for Dovecot's index data, with the number of inodes set to the maximum value. This has managed to keep us out of danger so far.)
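
For concreteness, on ext4 (which I'll assume here) 'the maximum value' amounts to one inode per filesystem block; a hypothetical mkfs invocation with a made-up device name:

# with the default 4 KiB block size, -i 4096 gives one inode
# per block, the most that mke2fs will create
mkfs.ext4 -i 4096 /dev/sdc1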

I suspect that clients doing this unrestricted LIST usage can't be giving the people using them a really good experience, but apparently it's not so terrible that people stop using them. Unfortunately we don't really have any idea which specific clients are involved, partly because more and more people are using multiple clients across many different devices.

(Our long term fix is going to have to be migrating away from our backwards compatibility settings, but that's going to be a very slow process and probably a lot of work. Helpfully it can be done fairly easily for people who actually use IMAP subscriptions, but discussing the issues involved is for another entry.)

Sidebar: How many inodes we're talking about

At the moment, our most prolific user has over 1.3 million Dovecot index files and directories, with the next two most prolific users having over 730k and 600k respectively (fortunately it falls off fairly rapidly from there). The overall result of this is that our filesystem for storing this Dovecot index data has over 4.6 million inodes used.

IMAPServerInodeProblem written at 02:40:26

2017-12-27

When you have fileservers, they naturally become the center of the world

Every so often I spend a little bit of time thinking about how we might make some use of cloud computing, generally without coming up with anything meaningful, and then inevitably I wind up thinking about what makes it hard for us. So today I want to mention a little downside of having fileservers, which is that once you have fileservers they can easily become the center of your computing universe and then everything becomes tied to the fileservers.

To make this concrete, let's look at IMAP. When you build an IMAP server, you have to decide where people's IMAP folders will be stored. One option is a storage system that is dedicated to the IMAP server (or servers) through various options, including locally attached disks or a dedicated little SAN. With a fileserver environment, another natural choice is on the fileservers along with all your other data; this is especially attractive if you're already managing space there on a per-user or per-group basis, so you don't have to allocate IMAP folder space to people or groups and you can have it just come out of their existing space.

Now suppose you want to move your IMAP service into a cloud. If you opted to store the IMAP folders 'locally' to the IMAP servers, you can move the whole assemblage into the cloud in a fairly straightforward way. But if you chose to store IMAP folders on your existing fileservers, the actual data the IMAP server uses is entangled with the rest of the data on the fileservers (perhaps hopelessly so). You can't really move the service as a whole to the cloud, and moving the servers alone is probably a bad idea for all sorts of reasons.

(It's not just IMAP for us, of course; there are all sorts of services that are entangled with our fileservers because the data they use lives on the fileservers. Our web server is another obvious example.)

At the same time, putting data on fileservers is not a bad thing; instead it's the completely natural thing. Holding and serving data is what they're there for and if we've done a competent job, they're quite good at that. Building, operating, backing up, monitoring, and managing space on a whole collection of little storage nodes is not the greatest idea in the world; it's redundant work and it adds all sorts of complications to everyone's life. And it's much easier for people if they can just get generic space that they can use for whatever they want, whether that be email messages, web data, home directories, data files for computations, or so on.

(In a sense, the entire reason you build general use fileservers is to make them the center of the computing universe. Well, at least in our somewhat unusual setup.)

FileserversVsTheCloud written at 02:20:31

2017-12-22

Our next generation of fileservers will not use any sort of SAN

We've been using SAN-based fileservers here for a long time, partly for reasons that I once wrote about in Painless long term storage management without disturbing users. Our current and past generations of ZFS fileservers have been based around an iSCSI SAN, and before that we had at least one generation of Fibre Channel based fileservers using Solaris (with DiskSuite and relatively inexpensive hardware RAID-5 boxes). Some of the things we've wanted from a SAN haven't worked out lately but others have, and I wouldn't say we're unhappy with our current SAN setup.

We're in the process of putting together our next generation of fileservers and despite everything I just wrote, we've decided that they won't use a SAN. The core reason is that a SAN isn't necessary for us any more and moving away from having one both simplifies our life and means we need less hardware (which means everything costs less, which is an important consideration for us). It does matter that we want smaller fileservers and this affects the economics, but our decision goes beyond that; we have no regrets about the shift and don't feel we're being forced into it.

One not insignificant reason for this is that our ideas about long term storage management simply haven't worked out in practice (as I once theorized might happen). Even if we used iSCSI in our next generation, it was clear to us that the migration would once again involve copying all of the data with user-visible impact, just as it did the last time around. But beyond that, while I won't say that the iSCSI network has been useless, we haven't actually needed any of the advantages a SAN gives us in this generation. With solid hardware this time around, we haven't had a backend or a fileserver fail, or at least we've never had them fail for hardware reasons. Nor have we needed two iSCSI networks, as we've never had a switch or network failure.

Using iSCSI has unfortunately complicated our lives. It requires two extra networks and two extra sets of cabling, switches, and so on. It has to be monitored and software configurations have to be fiddled with, and we've actually had software issues because we have two iSCSI networks (every so often an OmniOS fileserver will refuse to use both iSCSI networks, especially after a reboot). And of course the split between fileservers and backends means more machines to look after.

(It also reduces the IO bandwidth we can get, which is an issue for various things including ZFS scrubs and resilvers, and means there's extra spots to monitor for performance impacts.)

A non-SAN fileserver environment is just going to be simpler, with fewer moving parts (in the sysadmin sense), and these days we can build it without needing to use anything that we consider chancy or unproven. Our existing iSCSI backends have provided us with the basic template; a server case with somewhere in the range of 16 to 24 disks and dual power supplies, a suitable motherboard, and connecting to all of the disks using some combination of SAS controllers and motherboard SAS and SATA ports (these days we no longer need to resort to chancy stuff like eSATA, the way we had to in our first generation). Using moderately sized servers with moderate amounts of disks goes well with our overall goals of smaller individual fileservers, and all of the pieces are well understood and generally work well (and are widely used, unlike eg iSCSI).

Will I miss having a SAN? My honest answer is that I won't. Like my co-workers, I'm looking forward to a simpler and more straightforward overall fileserver environment, with more isolation between fileservers and less to worry about and look at.

NoMoreSAN written at 02:17:35

2017-12-15

How we automate acmetool

Acmetool is my preferred client for Let's Encrypt and the one we've adopted for our switch to Let's Encrypt at work. If you know acmetool, talking about automating it sounds like a contradiction in terms, because the entire design of acmetool is about automating everything already; you put it in cron (or more exactly you let it put itself in cron as part of setup with 'acmetool quickstart'), and then you forget about it. Perhaps you have to write a hook script or two or adjust file permissions because a daemon runs as a different user, but that should be it.

However, there are a few questions that acmetool will ask you initially and there's one situation where it has to ask you a new question during certificate renewal, as was pointed out by a commentator on my earlier entry:

Recently Let's Encrypt switched over to a new version of their user agreement (v1.2). As a result, all certificate renewals for old accounts started failing (because they had only agreed to v1.1), and I had to ssh to all our servers, interactively run acmetool, and re-confirm the signup process (agreement & email) myself.

Fortunately you can automate this too, and you should. Acmetool supports a response file, which contains answers to questions that acmetool may try to ask you during either installation or certificate renewal. We automate these questions by preinstalling a responses file in /var/lib/acme/conf, which makes 'acmetool quickstart' work without having to ask us anything. When Let's Encrypt updated their user agreement, we pushed out a new version of the responses file that auto-accepted it and so returned certificate renewals to working without any manual action.

(The first renewal attempt after Let's Encrypt's update reported errors, then we worked out what the problem was, updated the file, pushed out a new version, and everything was happy. My personal websites avoided the problem entirely because of the timing; I had a chance to update my own responses file before any of their renewals came up, and when renewal time hit acmetool was fine.)

The responses settings we use are:

"acme-enter-email": "<redacted>@<redacted>"
"acmetool-quickstart-choose-server": https://acme-v01.api.letsencrypt.org/directory
"acmetool-quickstart-choose-method": webroot
"acmetool-quickstart-webroot-path": "/var/www/.well-known/acme-challenge"
"acmetool-quickstart-install-cronjob": true
# this extra line accepts the new v1.2 user agreement
"acme-agreement:https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf": true

As shown in the example responses file, you can set additional parameters like the normal key type, RSA key size, and so on. We haven't bothered doing this so far, but we may in the future.

You could vary the email address if you wanted to (for example for different classes of machines). We don't bother, because it's mostly unimportant; in practice, all it gets is the occasional email about one of our generic test machine hostnames that hasn't renewed its certificate because we haven't been using that hostname for anything that needed one.

AutomatingAcmetool written at 15:27:36
