Email providers cannot stop spam by scanning outgoing email

May 25, 2015

One of the things that Amazon SES advertises that it (usually) does is that it scans the outgoing email that people send through it to block spam. This sounds great and certainly should mean that Amazon SES emits very low levels of spam, right? Well, no, not so fast. Unfortunately, no outgoing mail scanning on a service like this can eliminate spam. All it can do is stop certain sorts of obvious spam. This is intrinsic in the definition of 'spam' and the limitations of what a mail sending system like Amazon SES does.

Essentially perfect content scanning can tell you two things: whether the email has markers of known types of spam, such as phish, advance fee fraud, malware distribution, and so on, and whether the email will be be scored as spam by however many spam scoring systems you can get your hands on the rules for. These are undeniably useful things to know (provided that you act on them), but messages that fail these tests are far from the only sorts of spam. In particular, basically all sorts of advertising and marketing emails cannot be blocked by such a system because what makes these messages spam is not their content, it's that they are unsolicited (cf, cf).

The only way to even theoretically tell whether a message is solicited or unsolicited is to control not just the sending of outgoing email but the process of choosing destination email addresses. If you only scan messages but don't control addresses, you have very little choice but to believe the sender when they tell you 'honest, all of these addresses want this email'. And then the marketing department of everyone and sundry descends on Amazon SES with their list of leads and prospects and people to notify about their very special whatever it is that of course everyone will be interested in, and then Amazon SES is sending spam.

(Or the marketing people buy 'qualified email addresses' from spam providers because why not, you could get lucky.)

There is absolutely nothing content filtering can do about this. Nothing. You could have a strong AI reading the messages and it wouldn't be able to stop all of the UBE.

(I wrote a version of this as a comment reply on my Amazon SES entry but I've decided it's an important enough point to state and elaborate in an entry.)

Written on 25 May 2015.
« A mod_wsgi problem with serving both HTTP and HTTPS from the same WSGI app
One of the problems with 'you should submit a patch' »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Mon May 25 00:15:37 2015
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.