Thinking through issues a mail client may have with SMTP-time rejections
In response to my entry on who holds problematic email, Evaryont left a comment lamenting the general 'don't trust the user's mail client' approach of mail submission servers accepting more or less everything from clients and bouncing things later. This has prompted me to try to think about the issues involved, so today you get this entry in reaction.
First, the basics. We already do our best to verify SMTP sender addresses and immediately reject bad ones, for good reasons, and I hope that there's general agreement on this for sender addresses, so this is only about how we (and other people) should react to various sorts of bad destination addresses (SMTP RCPT TOs). For 'bad' destination addresses, there are two different cases; sometimes you know that the destination address is bad and could give a permanent SMTP rejection, and sometimes you can't verify the destination address and would give a SMTP temporary failure (a 4xx deferral).
For permanent rejections, the question is whether the user's mail client will do a better job of telling them about these bad addresses than your bounce message does. This is a straightforward question of UI design (and whether the mail clients expect rejection at all, which it really should and these days almost certainly does). In theory a mail client can do a substantially better job of helping the user deal with bad addresses than a bounce can; for example, the mail client could immediately abort message submission entirely, report to the user 'these addresses are bad, I have marked them in red for you', and give the user the opportunity to correct or remove them before (re-)sending the message. In practice a bounce may give the user a better record of the failures than, say, a temporary popup dialog box about 'these addresses failed' that gives them no way to record the information or react to it.
(Correcting or removing the bad addresses before the message is sent at all is an overall better experience for everyone involved in the email; consider future replies to all addresses, for example. Bounces are also much less convenient for correcting bad addresses and resending your message, since there's no straightforward path from the bounce to sending a new copy of the original to corrected addresses.)
For temporary deferrals, things get a lot more complicated in both mail client handling and UI design. Some temporary deferrals will be cured in time; if they are pushed back to the client, the client must maintain its own queue of messages and addresses to be re-submitted, manage scheduling further delivery attempts, and decide on warning messages after sufficient time. For many clients this is going to be complicated by intermittent and perhaps variable connectivity (where you're on the Internet but you can't talk to the original mail submission server). Some temporary deferrals will never be cured in time, and for them the client also has to eventually give up and somehow present this to the user to do something (alternately, the client can just let the user keep trying endlessly until the user themselves clicks a 'stop trying, give up' button). Notifying the user at all about initial temporary deferrals is potentially a bad idea, especially with an intrusive alert; unlike permanent rejections, this is not something the user really needs to deal with right away.
(The mail client could also immediately abort message submission when it gets temporary deferrals and give the user a chance to change or remove addresses, but it's not clear that this is the right choice. There are a lot of things that can cause curable temporary deferrals (in fact some may be cured in seconds, when DNS results finally show up and so on), and you probably don't want to not send your message to such addresses.)
Reliably maintaining queues and handling retries is fairly complicated, especially for a mail client that may only be run intermittently and have network connectivity only some of the time. My guess is that mail servers are probably in a much better position to do this most of the time, and for temporary deferrals that will be rapidly cured (for example, ones caused by slow to respond DNS servers) a mail server will probably get the message delivered sooner. Also, when the mail client is the one to handle temporary deferrals, it's going to wind up having to send (much) more data over its connection, especially if the message has multiple temporarily deferred destinations and they cure themselves at different times. Having the server handle all retries means that the server holds the message and the mail client only has to upload it to the server once.
(On mobile devices these extra message submissions are also going to burn more battery power, as Aristotle Pagaltzis noted in the context of excessive web page fetching in a comment on this recent entry.)
One tradeoff in email system design is who holds problematic email
When you design parts of a mail system, for example a SMTP submission server that users will send their email out through or your external MX gateway for inbound email, you often face a choice of whether your systems should accept email aggressively or be conservative and leave email in the hands of the sender. For example, on a submission server should you accept email from users with destination addresses that you know are bad, or should you reject such addresses during the SMTP conversation?
In theory, the SMTP RFCs combined with best practices give you an unambiguous answer; here, the answer would be that clearly the submission server should reject known-bad addresses at SMTP time. In practice things are not so simple; generally you want problematic email handled by the system that can do the best job of dealing with it. For instance, you may be extremely dubious about how well your typical mail client (MUA) will handle things like permanent SMTP rejections on RCPT TO addresses, or temporary deferrals in general. In this case it can make a lot of sense to have the submission machine accept almost everything and sort it out later, sending explicit bounce messages to users if addresses fail. That way at least you know that users will get definite notification that certain addresses failed.
A similar tradeoff applies on your external MX gateway. You could insist on 'cut-through routing', where you don't say 'yes' during the initial SMTP conversation until the mail has been delivered all the way to its eventual destination; if there's a problem at some point, you give a temporary failure and the sender's MTA holds on to the message. Or you could feel it's better for your external MX gateway to hold inbound email when there's some problem with the rest of your mail system, because that way you can strongly control stuff like how fast email is retried and when it times out.
Our current mail system (which is mostly described here) has generally been biased towards holding the email ourselves. In the case of our user submission machines this was an explicit decision because at the time we felt we didn't trust mail clients enough. Our external MX gateway accepted all valid local destinations for multiple reasons, but a sufficient one is that Exim didn't support 'cut-through routing' at the time so we had no choice. These choices are old ones, and someday we may revisit some of them. For example, perhaps mail clients today have perfectly good handling of permanent failures on RCPT TO addresses.
(A accept, store, and forward model exposes some issues you might want to think about, but that's a separate concern.)
(We haven't attempted to test current mail clients, partly because there are so many of them. 'Accept then bounce' also has the benefit that it's conservative; it works with anything and everything, and we know exactly what users are going to get.)