One tradeoff in email system design is who holds problematic email

June 25, 2017

When you design parts of a mail system, for example an SMTP submission server that users will send their email out through or your external MX gateway for inbound email, you often face a choice of whether your systems should accept email aggressively or be conservative and leave email in the hands of the sender. For example, on a submission server should you accept email from users with destination addresses that you know are bad, or should you reject such addresses during the SMTP conversation?

In theory, the SMTP RFCs combined with best practices give you an unambiguous answer; here, the answer would be that clearly the submission server should reject known-bad addresses at SMTP time. In practice things are not so simple; generally you want problematic email handled by the system that can do the best job of dealing with it. For instance, you may be extremely dubious about how well your typical mail client (MUA) will handle things like permanent SMTP rejections on RCPT TO addresses, or temporary deferrals in general. In this case it can make a lot of sense to have the submission machine accept almost everything and sort it out later, sending explicit bounce messages to users if addresses fail. That way at least you know that users will get definite notification that certain addresses failed.
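To make the two submission policies concrete, here is a minimal Python sketch of the choice. Everything in it is made up for illustration (the `handle_rcpt_list` function, the `SubmissionResult` type, and the idea that we have a simple set of known-bad addresses); it is not how any real mail server is configured. It only shows what the mail client sees during the SMTP conversation versus what we end up holding and bouncing ourselves.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionResult:
    # address -> SMTP reply code the client sees for that RCPT TO
    rcpt_replies: dict = field(default_factory=dict)
    # addresses we accepted and will try to deliver to
    to_deliver: list = field(default_factory=list)
    # addresses we accepted but will bounce by email afterwards
    to_bounce: list = field(default_factory=list)


def handle_rcpt_list(recipients, known_bad, accept_then_bounce):
    """Hypothetical submission policy: reject at SMTP time or accept and bounce."""
    result = SubmissionResult()
    for addr in recipients:
        if addr not in known_bad:
            # Good address: accepted under either policy.
            result.rcpt_replies[addr] = 250
            result.to_deliver.append(addr)
        elif accept_then_bounce:
            # Known-bad address, but we hold the problem ourselves and
            # send the user an explicit bounce message later.
            result.rcpt_replies[addr] = 250
            result.to_bounce.append(addr)
        else:
            # Known-bad address rejected during the SMTP conversation;
            # it is now up to the mail client to tell the user.
            result.rcpt_replies[addr] = 550
    return result
```

With `accept_then_bounce=True` the client sees a 250 for every recipient and the user hears about bad addresses only through our bounce; with `False`, what the user learns depends entirely on how the client presents the 550.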

A similar tradeoff applies on your external MX gateway. You could insist on 'cut-through routing', where you don't say 'yes' during the initial SMTP conversation until the mail has been delivered all the way to its eventual destination; if there's a problem at some point, you give a temporary failure and the sender's MTA holds on to the message. Or you could feel it's better for your external MX gateway to hold inbound email when there's some problem with the rest of your mail system, because that way you can strongly control stuff like how fast email is retried and when it times out.
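The MX gateway version of the choice can be sketched the same way. Again this is only an illustration under assumptions: `gateway_decision` and `Holder` are hypothetical names, and a real gateway has many more cases than 'downstream worked' and 'downstream had a temporary problem'. The point is just who is left holding the message in each case.

```python
from enum import Enum


class Holder(Enum):
    DELIVERED = "delivered to its destination"
    SENDER = "sender's MTA"
    GATEWAY = "our MX gateway"


def gateway_decision(downstream_ok, cut_through):
    """Return (SMTP reply code to the sender, who now holds the message)."""
    if downstream_ok:
        # The rest of the mail system took the message; nothing to hold.
        return 250, Holder.DELIVERED
    if cut_through:
        # Cut-through routing: we never said 'yes', so we give a temporary
        # failure and the sender's MTA keeps the message and retries on
        # its own schedule.
        return 451, Holder.SENDER
    # Store and forward: we accept anyway and queue the message, so we
    # control how fast it is retried and when it times out.
    return 250, Holder.GATEWAY
```

For example, `gateway_decision(downstream_ok=False, cut_through=True)` leaves the message with the sender's MTA, while the same failure with `cut_through=False` leaves it in our own queue.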

Our current mail system (which is mostly described here) has generally been biased towards holding the email ourselves. In the case of our user submission machines this was an explicit decision because at the time we felt we didn't trust mail clients enough. Our external MX gateway accepted all valid local destinations for multiple reasons, but a sufficient one is that Exim didn't support 'cut-through routing' at the time so we had no choice. These choices are old ones, and someday we may revisit some of them. For example, perhaps mail clients today have perfectly good handling of permanent failures on RCPT TO addresses.

(An accept, store, and forward model exposes some issues you might want to think about, but that's a separate concern.)

(We haven't attempted to test current mail clients, partly because there are so many of them. 'Accept then bounce' also has the benefit that it's conservative; it works with anything and everything, and we know exactly what users are going to get.)


Comments on this page:

I wonder if it's that instinct to be conservative which causes the whole ecosystem to suffer. In my mind I would expect and prefer that the MUAs are the ones who handle invalid email. However, I expect that there is a sufficient number of servers configured like yours which will accept most mail and send bounce messages on failures. Thus, it's a large enough percentage that mail client developers don't think it's worth implementing a local queue, so they don't, so more mail servers have to behave like this, so fewer mail client developers do so, and so on: a spiral.

This is an awfully pessimistic view of developers, but I know we're all quite lazy. So given the opportunity, I'd expect that many would skip it. I would, at first.

By cks at 2017-06-25 23:30:48:

You got me to do some thinking about the whole issue, which I've now written down in MUAIssuesWithRejection. The short version is that I've realized there are two cases, permanent failures and temporary deferrals, and each raises different issues for mail clients (and mail servers). Even in an ideal world with perfect mail clients, I'm not sure we want the same mail server behavior for both cases.

I configured the MSA on my last production server to validate only the credentials and that the sender was authorized to use the SMTP envelope sender & From header.

Then I would have the MSA send a bounce back to the (authenticated) sender if there was a problem.

I chose to do this for a few different reasons:

  1. (originally) my MSA virus scanner was timing out waiting on clients to send attached files on slow connections.
  2. I found I had more (consistent) control over the DSN than what the MUA would present.
  3. I found I could offer different spam filtering between MSA & MTA than I could between MUA & MSA.

In short, I decided that it was okay to accept messages from authenticated senders where I knew that I could send a DSN. Further, I knew the sender that I was dealing with and had a non-email business relationship with them.

Written on 25 June 2017.


Last modified: Sun Jun 25 01:01:13 2017