SMTP's crazy address formats didn't come from nowhere

June 5, 2014

Broadly speaking, SMTP addresses have two crazy things in them: route addresses and quoted local parts. Route addresses theoretically give you a way of specifying a chain of steps the message is supposed to take on its way to (or from) its eventual destination:

RCPT TO:<@a.ex.org,@barney:user@fred.dibney>
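As a rough illustration of how such a path breaks down, here is a minimal sketch that splits a route address into its chain of hosts and the final mailbox. The function name `split_route_address` is made up for this example, and the sketch assumes the simple case where no hop contains a ':' (so no address literals) and does no validation of the host names themselves.

```python
def split_route_address(path):
    """Split an SMTP path like <@a,@b:user@host> into (hops, mailbox).

    Hypothetical helper for illustration only; it assumes hop names
    never contain ':' and does not validate domains.
    """
    if not (path.startswith("<") and path.endswith(">")):
        raise ValueError("path must be wrapped in angle brackets")
    inner = path[1:-1]

    # No leading '@' means there is no source route, just a mailbox.
    if not inner.startswith("@"):
        return [], inner

    # The route is everything up to the first ':'; the mailbox follows it.
    route, sep, mailbox = inner.partition(":")
    if not sep or not mailbox:
        raise ValueError("a source route must be followed by ':' and a mailbox")

    hops = []
    for hop in route.split(","):
        if not hop.startswith("@") or len(hop) == 1:
            raise ValueError("each hop in the route must look like '@host'")
        hops.append(hop[1:])
    return hops, mailbox


hops, mailbox = split_route_address("<@a.ex.org,@barney:user@fred.dibney>")
print(hops, mailbox)
```

In this reading the message would be handed to a.ex.org first, then to barney, and only then delivered to user@fred.dibney.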

Quoted local parts allow you to use any random characters and character sequences in the local mailbox name:

MAIL FROM:<"abney <abdef> ...%barney"@example.org>

(As I grumbled about yesterday, quoted local parts drastically increase the complexity of parsing modern SMTP commands.)
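To give a feel for the extra parsing work, here is a minimal sketch of separating the local part from the domain when the local part may be quoted. Simply splitting on '@', '<' or a space does not work, because all of those can legitimately appear inside the quotes. The name `split_mailbox` is hypothetical, and the sketch assumes the usual quoted-string convention where a backslash escapes the next character; it does not check which characters are allowed.

```python
def split_mailbox(mailbox):
    """Split 'local@domain' into (local, domain), honouring a quoted local part.

    Hypothetical helper for illustration; assumes backslash escapes the
    next character inside quotes and performs no character validation.
    """
    if mailbox.startswith('"'):
        chars = []
        i = 1
        while i < len(mailbox):
            c = mailbox[i]
            if c == "\\":
                # A backslash takes the following character literally.
                if i + 1 >= len(mailbox):
                    raise ValueError("dangling backslash in quoted local part")
                chars.append(mailbox[i + 1])
                i += 2
            elif c == '"':
                break  # closing quote found
            else:
                chars.append(c)
                i += 1
        else:
            # Ran off the end without seeing a closing quote.
            raise ValueError("unterminated quoted local part")
        rest = mailbox[i + 1:]
        if not rest.startswith("@") or len(rest) == 1:
            raise ValueError("expected '@domain' after the quoted local part")
        return "".join(chars), rest[1:]

    # Unquoted case: the first '@' separates local part and domain.
    local, sep, domain = mailbox.partition("@")
    if not sep or not local or not domain:
        raise ValueError("expected 'local@domain'")
    return local, domain


print(split_mailbox('"abney <abdef> ...%barney"@example.org'))
```

Even this small sketch needs a character-by-character scan, which is the sort of thing that a naive "split the command on spaces" SMTP parser cannot do.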

Here is the thing: these two features of SMTP addresses did not come from nowhere. When the very first SMTP RFCs were written, these features were necessary. Really.

Quoted local mailbox names have an obvious rationale: they accommodate systems that have local logins (or mailbox names) that do not fit into the simple allowable format that you can use without quoting. The obvious big case that needs this is any local mailbox with a space in the name. Today we don't do that (we tend to use dots), but I'm sure there were systems on the original ARPANet where people had mailbox names of 'Jane Smith' (instead of the Jane.Smith that we'd insist on today). I believe that one of the reasons for this is that people did not want to require a conversion layer in mailers between the true mailbox names (with spaces and funny characters) and the external, RFC-approved mailbox names that could be used in email.

(I can see at least one sensible reason for this: the less software that had to be written to get a system hooked up to ARPANet SMTP, the more likely it was that the system would get hooked up, and thus that ARPANet SMTP would actually get widely used.)

Equally, route addresses make a lot of sense in an environment where many systems are not directly on the ARPANet and no one has yet built the whole infrastructure of forwarding MTAs, internal versus external mail remapping, and indirect addressing in the form of MX entries. After all, the early SMTP RFCs predate DNS. Here the SMTP RFC is providing a way to directly express multi-hop mail forwarding, something that was a reality on the early ARPANet.

(SMTP route addresses were not the only form this took, of course. The '% hack' used to be very common, where 'a%b@c' implied that c would actually send the message on to a@b. And there were even more complicated fudges for more complex situations.)
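The '% hack' rewrite described above can be sketched in a few lines. This is only an illustration of the idea as seen from host c: the function name `percent_hack_rewrite` is invented here, the choice of the rightmost '%' as the new '@' is an assumption about the usual convention, and the sketch ignores quoted local parts entirely.

```python
def percent_hack_rewrite(address):
    """Rewrite 'a%b@c' to 'a@b', as host c might before forwarding.

    Hypothetical helper for illustration; assumes the rightmost '%' in
    the local part becomes the new '@' and ignores quoted local parts.
    """
    local, sep, _domain = address.rpartition("@")
    if not sep:
        raise ValueError("expected an address with an '@'")

    # Split the local part at its rightmost '%', if it has one.
    inner_local, pct, next_host = local.rpartition("%")
    if not pct or not inner_local or not next_host:
        return address  # nothing to rewrite; deliver as-is

    return f"{inner_local}@{next_host}"


print(percent_hack_rewrite("a%b@c"))
```

Under that assumption, a longer chain such as 'a%b%c@d' would peel off one hop at a time: d would forward to 'a%b@c', and c would then forward to 'a@b'.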

Internet email and Internet email addresses are such a juggernaut today that it is easy to forget that once upon a time the world was smaller and SMTP mail was a scrappy upstart proposing a novel and unproven idea, one that had to interoperate with any number of existing systems if it wanted to have any chance of success.

(Note here that I'm talking exclusively of SMTP addresses, not the more complex soup that is how addresses appear in the headers of email messages.)


Comments on this page:

By cks at 2014-06-08 00:16:25:

Oops, right you are. This shows the perils of spell-checkers; a wrong but close word like 'rational' instead of 'rationale' will spell-check correctly and might pass proofreading (as this one did).

Thanks.

Written on 05 June 2014.

Last modified: Thu Jun 5 01:58:32 2014