2008-09-25
Why qmail is no longer a suitable Internet MTA
Here's a statement that's going to get me disliked: qmail is no longer suitable as an Internet mail transport agent, especially not as an inbound MTA (something that receives email from the outside world). There are two reasons for this, the direct problem and then the deeper problem.
The direct problem is that a default, unpatched qmail setup handles unknown local addresses by accepting them at SMTP time and then bouncing them. This was okay when qmail was new a decade ago but it is no longer acceptable today; doing this makes qmail completely unsuitable as an inbound MTA unless you enjoy getting blacklisted and spamming innocent bystanders.
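To make the difference concrete, here is roughly how the two behaviours look on the wire (a schematic SMTP exchange with paraphrased responses, not qmail's exact wording). An MTA that rejects unknown addresses at SMTP time pushes the problem back onto the sending machine:

    RCPT TO:<nosuchuser@example.org>
    550 5.1.1 no such mailbox here

Accept-then-bounce instead says yes and only later mails a bounce to the envelope sender address, which spammers routinely forge, so the bounce lands on an innocent third party:

    RCPT TO:<nosuchuser@example.org>
    250 ok
    (a bounce message gets generated later and sent to the MAIL FROM address)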
The deeper problem is why qmail continues to use 'accept then bounce', namely that qmail is effectively not maintained, and on the Internet, unmaintained software rots. The reasons for this are complex (and political), but the simple summary is that for a long time qmail's license didn't permit distributing modified versions (just patches), and Dan Bernstein didn't seem to have any interest in modifying qmail.
While qmail has recently been released into the public domain and a version of it has started to be updated, I don't think that it solves either problem. It doesn't solve the accept-then-bounce problem because, well, the updated version still does accept-then-bounce, and it doesn't solve the lack of maintenance because by now, the people who would maintain qmail are those who have self-selected to feel that it doesn't need much maintenance; the people who feel otherwise have long since been driven away by the lack of updates.
2008-09-21
Why I wind up writing real parsers for my sysadmin tools
There is a common habit in sysadmin tools of using ad-hoc methods to extract information out of the less than immediately helpful output of the vendor's programs. Bang together some sed, some awk, some grep, and so on, and you can quickly get what you need, generally in something that you can still understand once the dust settles.
I do this for some tools, in some situations. But increasingly I am writing real parsers for things with complicated output. The problem is that an ad-hoc optimistic parser that just recognizes simple things and grabs output is too dangerous, because it makes an optimistic assumption: it assumes that anything it doesn't specifically recognize and pick out is unimportant.
When I am parsing complex output for really important things, I do not want to make this assumption. I want it to be the other way around; instead of assuming that anything I did not specifically code for is harmless and can be ignored, I assume that anything I do not recognize is dangerous and means that the parser should abort. At a minimum, the presence of unrecognized things means that I did not understand the output of what I'm parsing as well as I thought I did.
(I should note that this doesn't make my programs any better; in fact, it sometimes makes them worse, as they die on harmless things. But it makes me more confident about what they're doing. Sysadmin tools definitely need to adhere to the 'first, do no harm' precept.)
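To make this concrete, here is a minimal sketch of the 'abort on anything unrecognized' style, written against a made-up output format rather than any real vendor command:

    import re
    import sys

    # The only line formats we claim to understand; anything else is
    # treated as evidence that we don't understand the output after all.
    PATTERNS = [
        ('pool', re.compile(r'^pool:\s+(\S+)$')),
        ('disk', re.compile(r'^\s+disk\s+(\S+)\s+(ONLINE|FAULTED)$')),
        ('blank', re.compile(r'^\s*$')),
    ]

    def parse(lines):
        results = []
        for lineno, line in enumerate(lines, 1):
            line = line.rstrip('\n')
            for kind, pat in PATTERNS:
                m = pat.match(line)
                if m:
                    if kind != 'blank':
                        results.append((kind,) + m.groups())
                    break
            else:
                # An optimistic parser would silently skip this line;
                # we refuse to guess and bail out instead.
                sys.exit("unrecognized line %d: %r" % (lineno, line))
        return results

    if __name__ == '__main__':
        print(parse(sys.stdin.readlines()))

The interesting part is the else clause on the inner for loop: reaching it means the output contained something the patterns don't cover, and the only safe response is to stop.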
As a consequence, all of my serious sysadmin tools lately have been written in Python. While it's not impossible to write real parsers in sed, awk, and so on, it's too painful and too much work to make me interested.
(Yes, people have done amazingly impressive things in awk and sed, but I'm lazy. Plus, I have more confidence in my ability to test Python code.)
2008-09-17
How to securely manipulate user files
Here is a thesis:
The only way to securely do operations on user files is to do them as the user.
(You may also add 'correctly' to this.)
Over and over again I have seen root-run administrative programs and scripts try to manipulate user files in various ways, and over and over again I have seen them have problems and security holes. This is not because they were badly written; it's because there are a lot of race conditions lurking in the underbrush unless you are extraordinarily aware and careful.
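The classic shape of the problem is a check-then-act race. Here is a deliberately simplified sketch (the path and the UID/GID are made up):

    import os
    import stat

    # BAD: run as root, 'fixing' ownership of something under a
    # user's home directory.
    path = '/home/fred/public_html/index.html'
    st = os.lstat(path)
    if stat.S_ISREG(st.st_mode):
        # Between the lstat() above and the chown() below, the user can
        # replace index.html with a symlink to a file they don't own
        # (say /etc/passwd); chown() follows the symlink, and root has
        # now chowned the wrong file.
        os.chown(path, 1001, 1001)

Similar traps exist for any path component the user controls, and for pretty much any check you make before acting.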
The only genuinely reliable way out is to get rid of the entire problem. The problem is that an attacker is tricking you into manipulating the wrong files with special privileges, so get rid of the special privileges; when you manipulate files, do all of the manipulation as the user themselves. As a bonus you will get around any NFS root permission problems, where root actually has fewer privileges than the regular user does, not more.
(Please don't do this by temporarily switching to the user's UID and then switching back, because that way you still have elevated privileges, even if they're latent.)
This is relatively easy in most shell scripts if you make yourself a basic runas command that just setuids (thoroughly) to the user and runs a command for you. The simple use of it is to just run every command that touches user files as the user by putting 'runas $USER' in front of the command. The more advanced usage is to split your shell script or program into multiple scripts, and then use runas in your main script to run the as-user script with appropriate arguments.
(You don't want to use su because it does too much; among other things, it runs a shell, often the user's shell, and users can have broken .bashrcs and .cshrcs. I've been there and stubbed my toe.)
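A bare-bones runas can be quite small. Here is a sketch in Python (hypothetical, assuming a Python with os.initgroups, and skipping the error handling a real one would want):

    #!/usr/bin/python
    # runas: switch thoroughly to a user and exec a command as them.
    import os
    import pwd
    import sys

    def main():
        if len(sys.argv) < 3:
            sys.exit("usage: runas USER COMMAND [ARG ...]")
        user, cmd = sys.argv[1], sys.argv[2:]
        pw = pwd.getpwnam(user)
        # Order matters: supplementary groups, then the gid, then the
        # uid, while we still have the privileges to change each one.
        os.initgroups(pw.pw_name, pw.pw_gid)
        os.setgid(pw.pw_gid)
        os.setuid(pw.pw_uid)
        os.environ['HOME'] = pw.pw_dir
        os.environ['USER'] = pw.pw_name
        # exec the command directly: no shell, so no user dotfiles run.
        os.execvp(cmd[0], cmd)

    if __name__ == '__main__':
        main()

Note that setuid() comes last; once you have dropped the uid you no longer have the privileges to change your groups.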
2008-09-11
Why I have the same shell dotfiles everywhere
In light of the complexity that is required to use the same set of shell dotfiles on all of the systems where I have accounts, one might sensibly ask why I bother. The answer is that I heavily customize my account's environment; I have all sorts of aliases, environment variable settings, and so on.
(Such heavy customization does have its downsides, including that I am not really comfortable on a new system until I have brought up my environment. A more subtle disadvantage for a sysadmin is that it is somewhat challenging for co-workers to temporarily take control of the keyboard when we are working together.)
While building generic dotfiles is somewhat annoying and time-consuming, I find that it's worth it, because maintaining multiple copies of a heavily customized environment is a recipe for annoyance. Updates are a pain, which means that you skip propagating 'minor' changes to other systems, or you are in a hurry one day and only make the change on the current system because it's the one that really needs it. Inevitably your environments drift apart, which means that sooner or later something behaves differently than you expect (often on a little-used system).
(Having written that, I confess that I am no longer as faithful about this as I used to be, partly because I have stopped automatically pushing out updates from a master machine. But I still value the ideal, even if the practice doesn't quite live up to it.)
One occasionally amusing side effect of having been doing this for years and having had accounts on all sorts of systems is a nice fossilized layer of options and settings for now obsolete environments. (For example, I am still carrying around settings for a KSR1. And a Cray, which I see had versions of mv and cp that had no -i option.)
2008-09-06
Why negative DNS caching is necessary
DNS software in general has two forms of caching, which I've seen called 'positive' and 'negative'. Positive entries hold actual answers obtained from authoritative servers (theoretically; see Dan Kaminsky's DNS attack), while negative entries record that a particular name (theoretically) doesn't exist. Positive entries are cached for their TTL value; negative entries don't have a TTL of their own, but more or less inherit one from the zone's SOA record.
(The details are complicated.)
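As an illustration, consider a zone with a SOA record like this (made-up values):

    example.org.  3600  IN  SOA  ns1.example.org. hostmaster.example.org. (
                          2008090601  ; serial
                          7200        ; refresh
                          900         ; retry
                          1209600     ; expire
                          1800 )      ; minimum

Under RFC 2308, 'no such name' answers from this zone get cached for the smaller of the SOA record's own TTL and its minimum field, here min(3600, 1800) = 1800 seconds, so it can be up to half an hour before a resolver that has seen the negative answer will ask again.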
Negative caching matters because it creates yet another obstacle to rapidly updating your zone. Even if you control all of the primary and secondary nameservers and can update them on command, you may need to wait out the negative cache TTL before you can be sure that everyone can see a newly created DNS name. (This is most likely to happen if somehow the name has accidentally been published before you've created it, so that people have started doing queries for it.)
One might reasonably ask why negative caching is necessary at all. The short answer is 'domain search paths'; many systems (okay, at least many Unix systems) can be configured to look up simple hostnames in more than one DNS domain. The existence of search paths means that you can generate a lot of queries for names that don't exist, as you look up the hostname in each of your search domains until you finally find the one it's in (or you fall off the end and do a rooted DNS query).
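For example, on a machine with a hypothetical /etc/resolv.conf containing:

    search cs.example.org example.org

a lookup of the bare name 'fileserver' will typically try fileserver.cs.example.org first and then fileserver.example.org, and every domain the host isn't in produces a 'no such name' answer. Those are exactly the answers that the negative cache keeps your resolvers from asking for over and over.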
(Negative caching is also important when you're using a DNS blocklist, because hopefully most of your queries are for things that aren't listed.)