2009-02-25
Don't log usernames for bad logins
This used to be widely understood around Unix, but it's evidently slipped from common knowledge over time:
It's a mistake to log nonexistent usernames on bad logins.
(Corollary: it is an especially bad mistake to log them by default.)
To illustrate why I say this, let me tell you what happened to me recently. I normally leave myself logged in to my office workstation with the screen locked and blanked. This weekend, my office workstation crashed and rebooted because of some ongoing issues; as I don't use GDM et al, it sat there at a console login, and of course it blanked the screen. So on Monday morning I came in, saw a blank screen, and automatically did what I usually do: I tapped the shift key to wake up the X server and typed in my password to unlock the screen. I didn't bother stopping to look at the screen; I generally type faster than the screen wakes up, and this is all a well honed reflex anyways.
Of course, I wasn't typing my password to the X screen saver, I was typing my 'login name' to the login program. Which, on Fedora, logs unknown login names to syslog. Cue cursing, shooting of syslogd, and hand-editing syslog files. (And a reboot.)
(I pause here to be very thankful that my workstation is not hooked in to our central syslog server.)
The problem with logging nonexistent usernames is that sooner or later you will inevitably log someone's password in plain text. They will be operating on reflex as I was, or they will have focused the wrong widget, or typed ahead except the typeahead got eaten, or didn't notice that the system wasn't prompting them for what they thought it was, and boom, there goes their password. At the best, they have to change it. At the worst, they have no idea that it got logged and should be changed immediately, and will instead happily keep on using it.
(The other reason not to do this is that you don't really care about login attempts to nonexistent users except in very rare cases. In fact, you don't really care about login attempts to real users, because if you run an Internet-exposed machine you can safely assume that people are trying to log in to all of your accounts all of the time.)
This applies to more than Unix logins; it applies to anything that asks for a login, web services included. Perhaps especially web services.
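As a rough illustration of what this means for code that handles logins, here is a minimal Python sketch. Everything in it is an assumption made up for the example (the KNOWN_USERS table, the failure_message and check_login names, and the plain-text password comparison); a real system would check salted hashes. The only point is that the log line echoes back a login name only when the name belongs to a real account.

```python
import logging

log = logging.getLogger("auth")

# Hypothetical account table for illustration only; a real system would
# store salted password hashes, not plain-text passwords.
KNOWN_USERS = {"alice": "correct horse"}


def failure_message(username, known_users):
    """Build the log line for a failed login without echoing unknown names."""
    if username in known_users:
        return "failed login for user %s" % username
    # The 'username' may really be someone's mistyped password, so it is
    # deliberately left out of the message.
    return "failed login for unknown user"


def check_login(username, password, known_users=KNOWN_USERS):
    """Return True on a good login; log a sanitized message otherwise."""
    expected = known_users.get(username)
    if expected is not None and expected == password:
        return True
    log.warning(failure_message(username, known_users))
    return False
```

With this, someone who types their password at the login name prompt produces only 'failed login for unknown user' in the logs, while failures against real accounts are still attributed to those accounts.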
Sidebar: on well honed reflexes
Due to my reflexive locking of X sessions, my reflex of typing my password to blank screens is so well developed that it is dangerous for me to let my display blank through inactivity. Even if I consciously know that I didn't lock the display, my automatic reaction can kick in before I stop myself. And if I don't remember, well, it's pretty hopeless.
Partly as a consequence of this, I have developed somewhat of a habit of not leaving focus on any X window that will accept keyboard input. If I'm not actively typing at something, I'll deliberately focus away from my xterms and so on. (If nothing else, this is a good way to avoid tragic consequences if I accidentally brush the keyboard. It's especially good if you're a vi user.)
2009-02-11
True point in time restores may be hard
One of the things that people ask for from their backup system, often for sensible reasons, is the ability to recreate the system exactly how it was at some past time; these are usually called 'point in time' restores. Point in time restores for the recent past are usually reasonably easy, but doing them for times well in the past can be very difficult.
The problem is that the dependencies involved start to grow and grow. For example, if you want to be able to restore your accounting system to the exact state it was in four years ago, you don't just need a full set of the data from four years ago; you're also probably going to need the exact version of software that you were using four years ago, the same operating system version you were running back then, and then old hardware to run it on (the four year old OS probably doesn't run on modern hardware, since it doesn't have the necessary drivers for things like, say, SATA disks).
Once you have all of that, you may still need new license keys, or at least license keys that haven't expired. In a world that increasingly uses digital signatures, you may also need new un-expired SSL keys, newly signed code for applets, and so on. (You can try turning back the clock to four years ago, but then your old system may not interact very well with the rest of your network.)
Now, this is an extreme and possibly artificial example. But it illustrates the important issue: data has dependencies. If you need to be able to deal with that data the same way you did in the past, you need the data's dependencies or something that is a good enough substitute for them. And those dependencies have other dependencies, and so on.
(This is one of the problems with reliable archives.)
Fortunately, this isn't always the sysadmin's problem. Reasonably often we're just tasked with being able to restore the raw data exactly as it was at some point in time, and interpreting it correctly is (in theory) someone else's concern.
2009-02-09
Backups and archives
When I am thinking of backup issues, I tend to strongly separate backups from archives, and in fact I believe that there are two different sorts of backups with different characteristics. Since I think that this is an important issue, let me explain my distinctions.
To start with, to me the difference between backups and archives is that backups are for recovering after a system failure, while archives are for storing data that you no longer keep online. Here, you have to consider 'system failure' somewhat broadly, so that it includes not just the disks melting down but also things like users (or automated programs, or sysadmin error) accidentally removing or damaging files.
Within backups, there are two sorts of backups: short term backups and long term backups. A short term backup is one that you can pretty much count on restoring on to exactly the same sort of system, the same system environment, that you have now. A long term backup is a backup where you cannot count on this; you are keeping the backup for long enough that you may have done things like upgrade operating system versions and cannot now build an 'old' system just to restore some data.
If you have short term backups, it's perfectly acceptable to use a convenient data format that is very system specific (for example, ZFS's zfs send). But if you have long term backups, you run into potential issues with such things (for example, zfs send is explicitly not guaranteed to work across different Solaris versions), and you need a more portable backup format. How much more portable depends on how much of a radical change you expect, which also depends on how far back you need to be able to go. If you need to go far enough back, you get into archiving territory with all of the headaches that that implies.
(The other thing you need for long term backups is to keep track of how things got relocated and shuffled around in your filesystems, so that you can easily find out where they were in the past in order to do restores. For example, can you easily recover where a user's home directory or a particular database backup was six months ago?)
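One way to keep track of this is a small dated log of moves that you can query by date. The following Python sketch is only an illustration under assumptions of my own: the HISTORY table, the 'home:alice' key, the paths, and the location_on name are all hypothetical, and in practice the data would live in a file or database that you update whenever something is relocated.

```python
import bisect
from datetime import date

# Hypothetical relocation history: for each item, a date-sorted list of
# (effective_from, path) entries. The names and paths are made up.
HISTORY = {
    "home:alice": [
        (date(2008, 3, 1), "/h/10/alice"),
        (date(2008, 11, 15), "/h/22/alice"),
    ],
}


def location_on(item, when, history=HISTORY):
    """Return the path that item lived at on the given date, or None."""
    moves = history.get(item, [])
    dates = [effective for effective, _ in moves]
    # Find the last move that took effect on or before 'when'.
    idx = bisect.bisect_right(dates, when) - 1
    return moves[idx][1] if idx >= 0 else None
```

Asking for location_on("home:alice", date(2008, 8, 25)) then tells you which filesystem to restore from in a backup made six months ago, without having to remember when the home directory was moved.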
The archive versus backup distinction is especially important because true archiving is very hard, fundamentally because archiving has very ambitious goals and represents a total commitment (because the data exists only in the archives). Backups are much simpler because they have much smaller goals and generally a much shorter time span, so they do not have to be anywhere near as durable (both in media and in the capability to do something useful with the media).
It is my personal belief that archiving is different enough from backups that you should not try to do real archiving with the same system that you use for backups. But if you actually want to use your backups as archives too, you must explicitly design for this and consider the archiving problems. You cannot just assume that your backup format, approach, or system makes for a good archival system, especially as you want to go further and further back in time.
Sidebar: why you want multiple backups
As a side note, somewhat in reaction to something I read recently, I do not feel that backups turn into archives just because you have more than one of them. You should make and keep multiple backups because you want insurance against all of the different sorts of backup failures, which include media failures, backup software failures, and failures to notice damage immediately.
Archives need insurance too, but it takes different forms: you make multiple copies of the archives, and you may do so on different sorts of media or in different formats in case one form turns out to be less durable than expected.
(There are people who make multiple copies of individual backups, but they tend to be either exceptionally paranoid or working with exceptionally high-value data.)
2009-02-07
An illustration of one reason that documentation is hard
My need to write an entry on how to force the outgoing interface on Linux handily illustrates the problem of writing documentation when one is too close to the problem (and hence the need to test documentation).
At the time of the original entry, it was probably obvious to me that you had to combine the firewall marking approach with SNAT, so I didn't bother explaining this explicitly. Indeed, reading carefully I can see that I more or less said that in the entry, although not clearly enough for me to see it later when I reread the entry because I needed to solve the same problem again.
(A certain number of WanderingThoughts entries are in large part notes to myself for future use.)
This is a difficult problem, since it's not just an issue of adding more details and being more explicit. Not only do you wind up belabouring the obvious sooner or later, but writing is not free, so more details means less writing elsewhere; you have to draw the line somewhere. (Plus, writing out what you feel to be painfully obvious is not fun at all.)
The thing that this makes me especially twitchy about is lab notebooks. Unlike other documentation they're written more or less explicitly for yourself and never tested on other people, and I tend to write them in a fairly terse style where I wind up assuming a great deal of context (partly because one goal, at least for me, is to write them fast so that I will write them at all). Rereading some of my notes after the fact has been a little bit alarming, and I've recently found myself consciously backing up to add more details and context to something 'obvious' that I scribbled down.