2014-06-05
SMTP's crazy address formats didn't come from nowhere
Broadly speaking, SMTP addresses have two crazy things in them: route addresses and quoted local parts. Route addresses theoretically give you a way of specifying a chain of steps the message is supposed to take on its way to (or from) its eventual destination:
RCPT TO:<@a.ex.org,@barney:user@fred.dibney>
Quoted local parts allow you to use any random characters and character sequences in the local mailbox name:
MAIL FROM:<"abney <abdef> ...%barney"@example.org>
(As I grumbled about yesterday, quoted local parts drastically increase the complexity of parsing modern SMTP commands.)
Here is the thing: these two features of SMTP addresses did not come from nowhere. When the very first SMTP RFCs were written, these features were necessary. Really.
Quoted local mailbox names have an obvious rationale: they accommodate systems that have local logins (or mailbox names) that do not fit into the simple allowable format that you can use without quoting. The obvious big case that needs this is any local mailbox with a space in the name. Today we don't do that (we tend to use dots), but I'm sure there were systems on the original ARPANet where people had mailbox names of 'Jane Smith' (instead of the Jane.Smith that we'd insist on today). I believe that one of the reasons for this is that people did not want to require a conversion layer in mailers between the true mailbox names (with spaces and funny characters) and the external, RFC-approved mailbox names that could be used in email.
(I can see at least one sensible reason for this: the less software that had to be written to get a system hooked up to ARPANet SMTP, the more likely it was that systems would get hooked up and thus that ARPANet SMTP would actually get widely used.)
Equally, route addresses make a lot of sense in an environment where many systems are not directly on the ARPANet and no one has yet built the whole infrastructure of forwarding MTAs, internal versus external mail remapping, and indirect addressing in the form of MX entries. After all, the early SMTP RFCs predate DNS. Here the SMTP RFC is providing a way to directly express multi-hop mail forwarding, something that was a reality on the early ARPANet.
(SMTP route addresses were not the only form this took, of course.
The '% hack' used to be very common, where 'a%b@c' implied that
c would actually send the message on to a@b. And there were
even more complicated fudges for more complex situations.)
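To make the '% hack' concrete, here's a rough sketch (in Python; the function and its details are my own illustration, not any particular mailer's code) of the rewriting that the receiving system was expected to do:

# Sketch only: rewrite the local part of 'a%b@c' so that the message
# gets sent onward to a@b, one hop at a time.
def percent_hack(local_part):
    if '%' not in local_part:
        return None                       # nothing to forward
    new_local, _, new_domain = local_part.rpartition('%')
    return new_local, new_domain

# percent_hack('abc%barney')    -> ('abc', 'barney'), ie onward mail to abc@barney
# percent_hack('abc%b%barney')  -> ('abc%b', 'barney'), and so on one hop at a time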
Internet email and Internet email addresses are such a juggernaut today that it is easy to forget that once upon a time the world was smaller and SMTP mail was a scrappy upstart proposing a novel and unproven idea, one that had to interoperate with any number of existing systems if it wanted to have any chance of success.
(Note here that I'm talking exclusively of SMTP addresses, not the more complex soup that is how addresses appear in the headers of email messages.)
2014-06-04
Why I don't like SMTP command parameters
Modern versions of SMTP have added something called 'command
parameters'. These extend the MAIL FROM and RCPT TO commands
to add optional parameters to communicate, for example, the rough
size of a message that is about to be sent (that's RFC 1870). On the surface these appear
perfectly sensible and innocent:
MAIL FROM:<some@address.dom> SIZE=99999
That is, the parameters are tacked on as 'NAME=VALUE' pairs after
the address in the MAIL FROM or RCPT TO. Unfortunately this
innocent picture starts falling apart once you look at it closely
because RFC 5321 addresses
are crawling horrors of complexity.
From the example I gave you might think that parsing your MAIL FROM
line is simple; just look for the first space and everything after it
is parameters. Except that the local name of addresses can be quoted,
and when quoted it can contain spaces:
MAIL FROM:<"some person"@a.dom> SIZE=99999
Fine, you say, we'll look for '> '. Guess what quoted parts can
also contain?
MAIL FROM:<"some> person"@a.dom> SIZE=99999
Okay, you say, we'll look for the rightmost '> ' in the line.
Surely that will do the trick?
MAIL FROM:<person@a.dom> SIZE=99999> BODY=8BITMIME
This is a MAIL FROM line with a perfectly valid address and then
a (maliciously) mangled SIZE parameter. You're probably going to
reject this client command, but are you going to reject it for the
right reason?
What the authors of RFC 5321 have created is a situation where you must do at least basic parsing of the internal structure of the address just to find out where it ends. Especially in the face of potentially mangled input, there is no simple way of determining where the address stops and the parameters start. Worse, the situation looks deceptively simple and a naive parser will work almost all of the time (quoted local parts are rare, much less ones with wacky characters in them, and my final example is extremely perverse).
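To make this concrete, here is a rough sketch (in Python, and very much not a full RFC 5321 parser; the function and its behaviour are my own illustration) of the minimum quote-aware scanning you need just to find where the address ends and the parameters begin:

# Sketch only: find where the <address> argument of MAIL FROM/RCPT TO
# ends so that anything after it can be treated as parameters. It only
# tracks quoting and backslash escapes, not the full RFC 5321 grammar.
def split_address_and_params(arg):
    if not arg.startswith('<'):
        raise ValueError("address must start with '<'")
    in_quotes = False
    i = 1
    while i < len(arg):
        ch = arg[i]
        if in_quotes:
            if ch == '\\':
                i += 2              # skip the escaped character
                continue
            if ch == '"':
                in_quotes = False
        elif ch == '"':
            in_quotes = True
        elif ch == '>':
            return arg[:i + 1], arg[i + 1:].strip()
        i += 1
    raise ValueError("unterminated address")

# split_address_and_params('<"some> person"@a.dom> SIZE=99999')
#   -> ('<"some> person"@a.dom>', 'SIZE=99999')
# split_address_and_params('<person@a.dom> SIZE=99999> BODY=8BITMIME')
#   -> ('<person@a.dom>', 'SIZE=99999> BODY=8BITMIME'), at which point the
#      parameter parsing can reject the mangled SIZE for the right reason.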
I'm sure this was not exactly deliberate on the part of the RFC authors, because after all they're dealing with decades of complex history involving all sorts of baroque possible addressing. From its beginning SMTP was complicated by backwards compatibility requirements and could not, eg, dictate that local mailboxes had to fit into certain restrictions. I'm sure that current RFC authors would like to have thrown all of this away and gone for simple addresses with no quoted local parts and so on. They just couldn't get away with it.
There is a moral in here somewhere but right now I'm too grumpy to come up with one.
(For more background on the various SMTP extensions, see eg the Wikipedia entry.)
PS: note that a semi-naive algorithm may also misinterpret 'MAIL FROM:<a@b> SIZE=999>'. After all, it has a '>' right there as the last character.
2014-05-25
Computing has two versions of 'necessary'
In various fields of computing we often wind up either saying that something is necessary or arguing about whether it is necessary. One of the things that complicates these discussions is that in computing we have two versions or meanings of 'necessary', the mathematical and the pragmatic.
The mathematical version of necessary is minimalism. At its strongest, the mathematical 'necessary' means that this feature or thing is essential, that you have to have it, that things do not work without it. The pragmatic version of necessary is what I'll call economy of effort, the idea that something is necessary when it is the best way to achieve something. The pragmatic version of necessary is a humanist vision.
A mathematically necessary feature can also be pragmatically necessary; it is great when this happens because both sides get to agree. However it's common for pragmatically necessary things to not be mathematically necessary (at which point they often get called unneeded) and sometimes for the mathematically necessary things to not be pragmatically necessary (at which point they can get called too low-level).
A strong adherence to the mathematical version of necessary drives a lot of what I consider pathologies in computing. But a strong adherence to the pragmatic version of necessary also has its downsides, including clutter and incoherence when carried to extremes (which it often has been). And in general adherents of each side not infrequently wind up talking past each other.
PS: I suspect that you can come up with some examples of the mathematical necessary and the pragmatic necessary on your own, so I'm not going to fan the flames of argument by picking out ones here. There are some very obvious suspects among, eg, computer languages.
(I've touched on this idea before back here, among other entries.)
2014-05-05
The power of meaningless identifiers
In computing we have a strong tendency to create meaningful identifiers for things. There are any number of sensible reasons for this and any number of things you get from using meaningful identifiers (including that people can remember them), but there is a pair of drawbacks to it. The large one is that almost anything with meaning can have that meaning change over time; the smaller one is that almost anything with meaning can be misspelled or mistyped when initially created. Both of these lead you to really want to change those identifiers, which often leads to heartburn since many things don't cope very well with theoretically constant identifiers changing.
The power of meaningless identifiers is that this doesn't happen. Precisely because they don't mean anything, they never have to change and it doesn't matter if you made a mistake in their initial creation. This means that they can be genuinely constant things (and it's easy to keep promises about this).
This conflict between meaningful identifiers and constant identifiers and the power of meaningless identifiers to cut through the Gordian knot comes up repeatedly in different contexts. You have the cool URL problem, the login problem, the temptation to use real database fields as foreign keys, and many more. Using meaningless identifiers instead is often either completely problem free or at most slightly more ugly (when the meaningless identifier must be visible to people, for example in URLs).
Note that a truly meaningless identifier shouldn't merely be empty of planned meaning; it should be structured so that no meaning can be read into it later no matter what its value is, whether by users or by possible third parties. Random numbers are better than sequential or semi-sequential numbers, for example.
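As a minimal sketch of what I mean (the choice of a random UUID here is simply one convenient way to get such an identifier):

# Sketch: a completely opaque identifier. A random (version 4) UUID
# encodes no creation time, no ordering, and nothing about what it
# names, so there's never a reason for it to change.
import uuid

def new_identifier():
    return str(uuid.uuid4())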
(And while we're at it, remember that 'randomly generated' doesn't necessarily give you things without meaning.)
2014-04-20
A heresy about memorable passwords
In the wake of Heartbleed, we've been writing some password guidelines at work. A large part of the discussion in them is about how to create memorable passwords. In the process of all of this, I realized that I have a heresy about memorable passwords. I'll put this way:
Memorability is unimportant for any password you use all the time, because you're going to memorize it no matter what it is.
I will tell you a secret: I don't know what my Unix passwords are. Oh, I can type them and I do so often, but I don't know exactly what they are any more. If for some reason I had to recover what one of them was in order to write it out, the fastest way to do so would be to sit down in front of a computer and type it in. Give me just a pen and paper and I'm not sure I could actually do it. My fingers and reflexes know them far better than my conscious mind.
If you pick a new password purely at random, with absolutely no scheme involved, you'll probably have to write it down on a piece of paper and keep referring to that piece of paper for a while, perhaps a week or so. After the week I'm pretty confident that you'll be able to shred the piece of paper without any risk at all, except perhaps if you go on vacation for a month and have it fall out of your mind. Even then I wouldn't be surprised if you could type it by reflex when you come back. The truth is that people are very good at pushing repetitive things down into reflex actions, things that we do automatically without much conscious thought. My guess is that short, simple things can remain in conscious memory (this is at least my experience with some things I deal with); longer and more complex things, like a ten character password that involves your hands flying all over the keyboard, go down into reflexes.
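For what it's worth, here's a sketch of the kind of purely random, scheme-free generation I mean (the alphabet and the ten character length are arbitrary choices on my part):

# Sketch only: a purely random password with no scheme behind it.
import random
import string

def random_password(length=10):
    rng = random.SystemRandom()     # OS-provided randomness
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return ''.join(rng.choice(alphabet) for _ in range(length))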
Thus, where memorable passwords really matter is not passwords you use frequently but passwords you use infrequently (and which you're not so worried about that you've seared into your mind anyways).
(Of course, in the real world people may not type their important passwords very often. I try not to think about that very often.)
PS: This neglects threat models entirely, which is a giant morass. But for what it's worth I think we still need to worry about password guessing attacks and so reasonably complex passwords are worth it.
2014-04-18
What modern filesystems need from volume management
One of the things said about modern filesystems like btrfs and ZFS is that their volume management functionality is a layering violation; this view holds that filesystems should stick to filesystem stuff and volume managers should stick to volume management. For the moment let's not open that can of worms and just talk about what (theoretical) modern filesystems need from an underlying volume management layer.
Arguably the crucial defining aspect of modern filesystems like ZFS and btrfs is a focus on resilience against disk problems. A modern filesystem no longer trusts disks not to have silent errors; instead it checksums everything so that it can at least detect data faults and it often tries to create some internal resilience by duplicating metadata or at least spreading it around (copy on write is also common, partly because it gives resilience a boost).
In order to make checksums useful for healing data instead of just detecting when it's been corrupted, a modern filesystem needs an additional operation from any underlying volume management layer. Since the filesystem can identify the correct block from a number of copies, it needs to be able to get all copies or variations of a set of data blocks from the underlying volume manager (and then be able to tell the volume manager which is the correct copy). In mirroring this is straightforward; in RAID 5 and RAID 6 it gets a little more complex. This 'all variants' operation will be used both during regular reads if a corrupt block is detected and during a full verification check where the filesystem deliberately reads every copy to check that they're all intact.
(I'm not sure what the right primitive operation here should be for RAID 5 and RAID 6. On RAID 5 you basically need the ability to try all possible reconstructions of a stripe in order to see which one generates the correct block checksum. Things get even more convoluted if the filesystem level block that you're checksumming spans multiple stripes.)
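Going back to the simple mirrored case, here's a rough sketch (in Python, where 'volmgr' and its operations are hypothetical stand-ins rather than any real volume manager's API) of how a filesystem might use such an 'all variants' read to heal data:

# Sketch only: read every copy of a block, use the filesystem's checksum
# to pick the intact one, and tell the volume manager to fix the rest.
def read_block_with_healing(volmgr, blockno, expected_checksum, checksum_fn):
    copies = volmgr.read_all_copies(blockno)   # one (side, data) pair per mirror
    good = None
    for side, data in copies:
        if checksum_fn(data) == expected_checksum:
            good = data
            break
    if good is None:
        raise IOError("no intact copy of block %d" % blockno)
    # Rewrite any damaged copies with the known-good data ('healing' them),
    # instead of just returning an error or silently using a bad copy.
    for side, data in copies:
        if checksum_fn(data) != expected_checksum:
            volmgr.repair_copy(blockno, side, good)
    return good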
Modern filesystems generally also want some way of saying 'put A and B on different devices or redundancy clusters' in situations where they're dealing with stripes of things. This enables them to create multiple copies of (important) metadata on different devices for even more protection against read errors. This is not as crucial if the volume manager is already providing redundancy.
This level of volume manager support is a minimum level, as it still leaves a modern filesystem with the RAID-5+ rewrite hole and a potentially inefficient resynchronization process. But it gets you the really important stuff, namely redundancy that will actually help you against disk corruption.
2014-04-11
The relationship between SSH, SSL, and the Heartbleed bug
I will lead with the summary: since the Heartbleed bug is a bug in OpenSSL's implementation of a part of the TLS protocol, no version or implementation of SSH is affected by Heartbleed because the SSH protocol is not built on top of TLS.
So, there are four things involved here:
- SSL aka TLS
is the underlying network encryption protocol used for HTTPS and
a bunch of other SSL/TLS-based things. Heartbleed
is an error in implementing the 'TLS heartbeat' protocol extension
to the TLS protocol. A number of other secure protocols are built
partially or completely on top of TLS, such as OpenVPN.
- SSH is the protocol
used for, well, SSH connections. It's completely separate from
TLS and is not layered on top of it in any way. However, TLS and
SSH both use a common set of cryptography primitives such as
Diffie-Hellman key exchange, AES, and
SHA1.
(Anyone sane who's designing a secure protocol reuses these primitives instead of trying to invent their own.)
- OpenSSL is an implementation of SSL/TLS in the form of a large
cryptography library. It also exports a whole bunch of functions
and so on that do various cryptography primitives and other
lower-level operations that are useful for things doing cryptography
in general.
- OpenSSH is one implementation of the SSH protocol. It uses various functions exported by OpenSSL for a lot of cryptography related things such as generating randomness, but it doesn't use the SSL/TLS portions of OpenSSL because SSH (the protocol) doesn't involve TLS (the protocol).
Low level flaws in OpenSSL such as Debian breaking its randomness can affect OpenSSH when OpenSSH uses something that's affected by the low level flaw. In the case of the Debian issue, OpenSSH gets its random numbers from OpenSSL and so was affected in a number of ways.
High level flaws in OpenSSL's implementation of TLS itself will never affect OpenSSH because OpenSSH simply doesn't use those bits of OpenSSL. For instance, if OpenSSL turns out to have an SSL certificate verification bug (which happened recently with other SSL implementations) it won't affect OpenSSH's SSH user and host key verification.
As a corollary, OpenSSH (and all SSH implementations) aren't directly affected by TLS protocol attacks such as BEAST or Lucky Thirteen, although people may be able to develop similar attacks against SSH using the same general principles.
2014-03-26
Why people keep creating new package managers
Matt Simmons recently wrote 'Just what we need ... another package manager', in which he has an unhappy reaction to yet another language introducing yet another package manager. As a sysadmin I've long agreed with him for all sorts of reasons. Packaging and managing language 'packages' is an ongoing problem and in our environment it also causes user heartburn when we have to turn down requests to install language packages through the language's own mechanisms.
(We have a strong policy that we only install official distribution packages in order to keep our own sanity. This works for us but not necessarily for other people.)
But at the same time I have a lot of sympathy for the language people. Let's look at the problem from their perspective. Languages need:
- package management everywhere they run, possibly including Windows
(which has no native package management system) and almost certainly
including the Macs that many developers will be using (which also
lack a native packaging system).
- something which doesn't force package contributors to learn more
than one packaging system, because most people won't and languages
want a healthy, thriving ecology of public packages. Ideally the
one packaging system will be a simple, lightweight, and low
friction one in order to encourage people to make and publish
packages.
- for developers of language packages to not have to deal with the
goat rodeos and political minefields that are the various
distribution packaging processes, because making them do so
is a great way of losing developers and not having packages.
- some relatively handy way to install and update packages that are not
in the official distribution repositories. No language can really
tolerate having its package availability held hostage to
the whims of distributions because it basically guarantees an out
of date and limited package selection.
(The interests of languages, developers of language packages, and distributions are basically all at odds here.)
- support for installing packages in non-default, non-system locations, ideally on both a 'per developer' and a 'per encapsulated environment' basis.
From the language's perspective it would be nice if package management for the language could optionally be done the same way regardless of what host you're on. In other words, developers should be able to use the same commands to install and set up packages on their development Macs as they do on testing VMs running some Linux distribution (or even FreeBSD), and possibly also on the production systems.
(In the modern lightweight world many small companies will not have actual sysadmins and developers will be setting up the production machines too for a while. Sysadmins do not like this but it is a reality. And languages are not designed for or used by sysadmins, they are mostly designed for developers, so it is not surprising that they are driven by the needs of developers.)
It's theoretically possible for a language's package system to meet all of these needs while still enabling distribution packages and doing as much as possible with the core distribution packaging system, either explicitly or behind convenient cross-platform cover scripts. However there are plenty of cases (like non-system installation of packages) that are simply not handled by the distribution packaging system and beyond that there are significant difficulties on both the technical and political levels. It is simply much easier for a language to roll its own packaging system.
(Ideally it will support creating distribution packages and using the distribution packaging mechanisms as well. Don't hold your breath; it's not going to be a language priority.)
2014-03-21
Thinking about when rsync's incremental mode doesn't help
I mentioned recently that I had
seen cases where rsync's incremental mode didn't speed it up to any
significant degree. Of course there's an obvious way to create such a
situation, namely erasing and replacing all of the files involved, but
that wasn't it for us. Our case was more subtle and it's taken me a
while to understand why it happened. Ultimately it comes down to having
a subtly wrong mental model of what takes time in rsync.
Our specific situation was replicating a mail spool from one machine to
another. There were any number of medium and large inboxes on the mail
spool, but for the most part they were just getting new messages; as
far as we know no one did a major inbox reorganization that would
have changed their entire inbox. Naively you'd think that an rsync
incremental transfer here could go significantly faster than a full
copy; after all, most of what you need to transfer is just the new
messages added to the end of most mailboxes.
What I'm quietly overlooking here is the cost of finding out what
needs to be transferred, and in turn the reason for this is that I've
implicitly assumed that sending things over the network is (very)
expensive in comparison to reading them off the disk. This is an easy
bias to pick up when you work with rsync, because rsync's entire
purpose is optimizing network transmission and when you use it you
normally don't really think about how it's finding out the differences.
What's going on in our situation is that when rsync sees a changed
file it has to read the entire file and compute block checksums (on
both sides). It doesn't matter if you've just appended one new email
message to a 100 Mbyte file for a measly 5 Kbyte addition at the end;
rsync still has to read it all. If you have a bunch of midsized
to large files (especially if they're fragmented, as mail inboxes
often are), simply reading through all of the changed files can take a
significant amount of time.
In a way this is a variant of Amdahl's law. With a lot of slightly
changed files an rsync incremental transfer may speed up the network
IO and reduce it to nearly nothing but it can't do much about the
disk IO. Reading lots of data is reading lots of data, whether or
not you send it over the network; you only get a big win out of not
sending it over the network if the network is slow compared to the
disk IO. The closer disk and network IO speeds are to each other,
the less you can save here (and the more that disk IO speeds will
determine the minimum time that an rsync can possibly take).
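As a back of the envelope illustration (all of the numbers below are invented for the sake of the arithmetic, not measured from our systems):

# Illustrative arithmetic only; every number here is made up.
spool_size_gb = 100.0    # data rsync must read and checksum on each side
changed_gb    = 0.5      # data that actually has to cross the network
disk_mb_s     = 100.0    # rough sequential read rate of the disks
net_mb_s      = 110.0    # roughly what 1G Ethernet can carry

disk_secs = spool_size_gb * 1024 / disk_mb_s   # ~1024 seconds of reading
full_net  = spool_size_gb * 1024 / net_mb_s    # ~931 seconds to send everything
incr_net  = changed_gb * 1024 / net_mb_s       # ~5 seconds to send the changes

# With disk and network speeds this close, the incremental transfer is still
# dominated by the roughly 1000 seconds of disk reads; cutting the network
# IO to almost nothing doesn't change the total time very much.
print(disk_secs, full_net, incr_net)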
The corollary is that where you really save is by doing less disk IO as
well as less network IO. This is where and why things like ZFS snapshots
and incremental 'zfs send' can win big, because to a large extent they
have very efficient ways of knowing the differences that need to be
sent.
PS: I'm also making another assumption, namely that CPU usage is free and is not a limiting factor. This is probably true for rsync checksum calculations on modern server hardware, but you never know (and our case was on really old SPARC hardware, so it might actually have been a limiting factor).
2014-03-08
Why I think 10G-T will be the dominant form of 10G Ethernet
Today there are basically two options for 10G Ethernet products and interfaces, 10G-T (standard Ethernet ports and relatively normal Ethernet cables) and SFP+ (pluggable modules mostly using fiber). Historically SFP+-based products have been the dominant ones and some places have very large deployments of them, while 10G-T seems to have only started becoming readily available recently. Despite this I believe that 10G-T is going to be the winning 10G format. There are two major 10G-T advantages that I think are going to drive this.
The first advantage is that 10G-T ports are simpler, smaller, and cheaper (at least potentially). SFP+ ports intrinsically require additional physical modules with their own circuitry plus a mechanical and electronic assembly to plug them into. This adds cost and it also adds physical space (especially depth) over what an Ethernet RJ45 connector and its circuitry require. In addition 10G-T is pretty much just an RJ45 connector and a chipset, and the hardware world is very good at driving down the price of chipsets over time. SFP+s do not have this simplicity and as such I don't think they can tap quite this price reduction power.
The second advantage is that 10G-T ports are backwards compatible with slower Ethernet while SFP+ ports talk only with other SFP+ ports. The really important aspect of this is that it's safe for manufacturers to replace 1G Ethernet ports with 10G-T Ethernet ports on servers (and on switches, for that matter). You can then buy such a 10G-T equipped server and drop it into your existing 1G infrastructure without any hassle. The same is not true if the manufacturer replaced 1G ports with SFP+ ports; suddenly you would need SFP+ modules (and cables) and a bunch of SFP+ switch ports that you probably don't have right now.
In short going from 1G to 10G-T is no big deal while going from 1G to SFP+ is a big, serious commitment where a bunch of things change.
This matters because server makers and their customers (ie, us) like 'no big deal' shifts but are very reluctant to make big serious commitments. That 10G-T is no big deal means that server makers can shift to offering it and people can shift to buying it. This drives a virtuous circle where more volume drives down the cost of 10G-T chipsets and hardware, which puts them in more places, which drives adoption of 10G-T as more and more equipment is 10G-T capable and so on and so forth. This is exactly the shift that I think will drive 10G-T to dominance.
I don't expect 10G-T to become dominant by replacing existing or future enterprise SFP+ deployments. I expect 10G-T to become dominant by replacing everyone's existing 1G deployments and eventually becoming as common as 1G is today. Enterprises are big, but the real volume is outside of them.
By the way: this is not a theoretical pattern. This is exactly the adoption shift that I got to watch with 1G Ethernet. Servers started shipping with some or all 1G ports instead of 100M ports; this drove demand for 1G switch ports, then switches started getting more and more 1G ports, and eventually we reached the point we're at today where random cheap hardware probably has a 1G port because why not; volume has driven the extra chipset cost to basically nothing.
Update: The reddit discussion of this entry has a bunch of interesting stuff about various aspects of this and 10G Ethernet in general. I found it usefully educational.