Wandering Thoughts archives

2008-09-03

Why SMTP needs a way of communicating partial success for message delivery

As illustrated in yesterday's entry, one of the problems of the (E)SMTP protocol is that after an SMTP server has accepted all of the message's destination addresses and gets to see the actual message, it has no way to tell the client that the message was delivered to only some of those addresses. This decision made perfect sense at the time that SMTP and then ESMTP were being created, because back then most of the plausible per-address problems could be detected at RCPT TO time and if not, well, you could just send a bounce message. These days it is an inconvenient limitation, and ESMTP could really use an extension that lets the server give a separate result for each accepted address after it has seen the message.

As shown yesterday, you can sort of fake it by selectively deferring some RCPT TOs, forcing the sender to break the destination addresses up into chunks of your choice (the ultimate version of this is to only accept one destination address per transaction). The problem with this is that the 'reject without enabling dictionary scanning' case is actually the simple one, because you know before you see the message body what the real answer for each address is; this lets you make an immediate decision about how to force the sender to break up the addresses.

Consider a politically complicated environment, where some people just want their email tagged, some people only want to reject email that is all but certain to be spam, and some people are willing to reject more widely. Here you don't know how you want to group the addresses until you've seen the message body, by which time it's too late.

You can force the sender to split the addresses into groups by the type of filtering (if any) that each person has opted in to. The problem is that this forces the split on every email message, even the ones that don't need it, which makes things increasingly complicated and inefficient (and you are relying on mailers reacting sensibly, where by 'sensibly' you really mean 'the way you want them to'). One unwelcome effect is that users will probably get even their good email more slowly, as legitimate sending mailers get confused by your forced retries.
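To make the cost of the forced split concrete, here is a rough Python sketch that simulates it. The policy labels and the `policy_of` mapping are hypothetical, and the sketch assumes an idealized sender that retries all deferred addresses together in its next transaction, which real mailers may not do.

```python
def delivery_rounds(rcpts, policy_of):
    """Return the recipients accepted in each successive SMTP transaction.

    In every transaction the server accepts only recipients whose
    filtering policy matches the first recipient's policy; everyone
    else gets a 4xx deferral and is retried in the next transaction.
    """
    pending = list(rcpts)
    rounds = []
    while pending:
        first_policy = policy_of[pending[0]]
        accepted = [r for r in pending if policy_of[r] == first_policy]
        pending = [r for r in pending if policy_of[r] != first_policy]
        rounds.append(accepted)
    return rounds


# Hypothetical policy labels for the three kinds of users.
policy_of = {
    "alice": "tag-only",
    "bob": "reject-certain-spam",
    "carol": "tag-only",
    "dave": "reject-widely",
}
rounds = delivery_rounds(["alice", "bob", "carol", "dave"], policy_of)
```

Here one message to four people takes three transactions (and at least two retry delays) even if it is perfectly good email that every policy would have accepted.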

(Of course, this just brings up the thought that ESMTP could also use an extension to let the server advertise the recommended retry interval on any temporary failure. Increasingly the server has very definite ideas about this; either it wants you to retry very fast, or it knows that there is no point in you retrying before, say, half an hour because you'll just get another temporary failure. Some servers even put this sort of information in the text portion of their 4xx replies, which is at least very useful for sysadmins as we try to figure out why outgoing email to somewhere is being delayed.)

tech/SMTPPartialSuccessNeeded written at 23:56:25

How to reject at SMTP time without enabling dictionary scanning

One claimed problem with rejecting unknown local addresses at SMTP time is that it enables a spammer to do a cheap dictionary scan of your domain to find valid usernames; all they have to do is try a bunch of RCPT TOs and see which ones get accepted (or don't get a permanent failure). The easiest way around this is to do your greylisting before you give permanent rejections, so that on the first attempt every address gets the same temporary failure whether or not it exists.

This doesn't completely block dictionary scanning (or other versions of address scanning), but it does force the spammer to do significantly more work. Their scanner now needs to be a multi-pass system that keeps a database, retries periodically, and so on. And it takes them longer to scan your domain (especially if you extend the overall greylisting time when mail sources try to hit a significant number of nonexistent users).
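As a rough sketch of the ordering, here is how the RCPT TO decision might look in Python. The class name, the delay constants, and the choice to greylist on the (IP, sender, recipient) triple are all assumptions for illustration, not a description of any particular mailer.

```python
import time

GREYLIST_DELAY = 300     # seconds a new triple must wait (assumed value)
PENALTY_PER_MISS = 300   # extra seconds per excess nonexistent address (assumed)
MISS_THRESHOLD = 3       # nonexistent addresses tolerated per source (assumed)


class Greylister:
    def __init__(self, valid_users, clock=time.time):
        self.valid_users = set(valid_users)
        self.clock = clock
        self.first_seen = {}  # (ip, sender, rcpt) -> time first seen
        self.misses = {}      # ip -> count of nonexistent addresses tried

    def required_delay(self, ip):
        # Sources that probe many nonexistent users wait longer.
        extra = max(0, self.misses.get(ip, 0) - MISS_THRESHOLD)
        return GREYLIST_DELAY + extra * PENALTY_PER_MISS

    def rcpt_to(self, ip, sender, rcpt):
        now = self.clock()
        key = (ip, sender, rcpt)
        exists = rcpt in self.valid_users
        if key not in self.first_seen:
            # First sighting: same 4xx for good and bad addresses alike.
            self.first_seen[key] = now
            if not exists:
                self.misses[ip] = self.misses.get(ip, 0) + 1
            return (451, "try again later")
        if now - self.first_seen[key] < self.required_delay(ip):
            return (451, "try again later")
        # Only a source that came back after the delay learns the truth.
        if not exists:
            return (550, "no such user")
        return (250, "ok")
```

The point of the ordering is that the existence check only affects the reply after the greylisting delay has passed, so a single-pass scanner sees nothing but identical temporary failures.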

If you want to make it more difficult (or can't do greylisting for whatever reason), only give your rejections after you see the message body. However, in this case you'll need to come up with some way of correctly handling messages to a mixture of good and bad local addresses.

Sidebar: the mixed address problem

The problem comes up because your reply to the message body applies to all of the recipient addresses. If some of them are good but some of them are bad, there is no single reply that can be correct; whether you accept or reject the message, you are lying about some destinations. You need to somehow contrive that you only accept recipient addresses of a single type, ideally without giving too much information away to a dictionary scanner.

One obvious solution is to keep track of what sort of destination addresses you've seen so far during the RCPT TO processing. When the type changes (when you see the first bad address after good ones, or the first good address after bad ones), you immediately give 4xx temporary failures for it and all future RCPT TOs. Proper mailers will apply the result of the actual message delivery (whether acceptance or rejection) to only the addresses you actually accepted, and retry the other RCPT TOs later.
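A minimal Python sketch of this bookkeeping for a single SMTP transaction follows. The `RcptTracker` name and the specific reply codes (451, 550, 503) are my choices for illustration; bad addresses get a 250 at RCPT TO time here because the rejection is being held back until after the message body.

```python
class RcptTracker:
    """Per-transaction state for the 'defer once the type changes' approach."""

    def __init__(self, valid_users):
        self.valid_users = set(valid_users)
        self.accepted_type = None  # True = good, False = bad, None = none yet
        self.deferring = False     # set once the address type has changed
        self.accepted = []

    def rcpt_to(self, rcpt):
        is_good = rcpt in self.valid_users
        if self.accepted_type is None:
            self.accepted_type = is_good
        elif is_good != self.accepted_type:
            self.deferring = True
        if self.deferring:
            # This and every later RCPT TO gets a temporary failure.
            return 451
        self.accepted.append(rcpt)
        return 250

    def data_reply(self):
        """Reply after the message body; it covers only accepted addresses."""
        if not self.accepted:
            return 503  # no recipients were given
        return 250 if self.accepted_type else 550


tracker = RcptTracker({"alice", "bob", "carol"})
replies = [tracker.rcpt_to(r) for r in ["alice", "bob", "nosuch", "carol"]]
```

In this run alice and bob are accepted and get the message, while nosuch and carol both get a 4xx and are left for the sender to retry later.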

(Alternatively, you can accept every RCPT TO of the same type as the first destination address: if the first address is a bad one, all bad ones are accepted and all good ones get a 4xx, and vice versa if the first one was a good address. However, this risks leaking information to a clever dictionary scanner that can notice this pattern.)
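For comparison, here is a small Python sketch of this 'accept everything of the first address's type' variant. The function name and the 451 reply code are assumptions made for illustration.

```python
def same_type_replies(rcpts, valid_users):
    """RCPT TO replies when every address matching the first one's type is accepted."""
    valid = set(valid_users)
    first_type = None
    replies = []
    for rcpt in rcpts:
        is_good = rcpt in valid
        if first_type is None:
            first_type = is_good
        # Same type as the first address: accept. Other type: defer.
        replies.append(250 if is_good == first_type else 451)
    return replies


# A scanner leads with an address it knows is bad, then probes guesses.
probe = same_type_replies(["zzqx-junk", "alice", "mallory", "bob"], {"alice", "bob"})
```

This shows the leak: once the scanner knows the first address is bad, each 4xx in the rest of the transaction marks a valid address, so one transaction can test many guesses at once.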

spam/RejectingWithoutScanning written at 02:05:11


