The corollary for effective anti-spam heuristics

November 19, 2009

Last time I mentioned that spammers were perfectly capable of adapting their practices to defeat anti-spam heuristics like requiring a valid EHLO or reverse DNS, and so such heuristics were, if effective and widely adopted, at best a temporary fix. This raises an obvious corollary about good anti-spam heuristics.

Since spammers will adapt when it is both useful and possible, a good anti-spam heuristic is some characteristic of the message or of how it is transmitted that the spammer cannot easily change. While people have made various stabs at this in the past (and will no doubt continue to do so in the future), the problem for anti-spam efforts is that such characteristics have been hard to find, partly because spammers have proven to be very ingenious about finding ways to change them.
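(To make 'easily changed' concrete, here is a minimal Python sketch of what a 'valid EHLO' heuristic might look like. The function name ehlo_looks_valid and the exact rules are my assumptions for illustration, not any particular mailer's check; it only tests that the EHLO argument is a dotted hostname or a bracketed address literal.)

```python
import ipaddress
import re

# One hostname label: letters, digits and hyphens, with no leading or
# trailing hyphen, at most 63 characters.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def ehlo_looks_valid(name: str) -> bool:
    """Hypothetical heuristic: does the EHLO argument look well formed?"""
    if name.startswith("[") and name.endswith("]"):
        # Address literal such as [192.0.2.1] or [IPv6:2001:db8::1].
        literal = name[1:-1]
        try:
            if literal[:5].lower() == "ipv6:":
                ipaddress.IPv6Address(literal[5:])
            else:
                ipaddress.IPv4Address(literal)
        except ValueError:
            return False
        return True

    labels = name.rstrip(".").split(".")
    return (
        len(labels) >= 2                  # reject unqualified names
        and not labels[-1].isdigit()      # reject bare IP addresses
        and all(_LABEL.fullmatch(label) is not None for label in labels)
    )
```

(Everything this function looks at is a string that the sending side picks. A spammer passes it by sending any plausible hostname, which is why it does not qualify as a good heuristic in the sense above.)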

(For a small example, are anti-spam systems matching on the characteristic phrases of your advance fee frauds in email? No problem, just put your pitches in file attachments. I await with resignation the day the spammers start sending PDFs, not just Word .doc files, since a sufficiently ingenious spammer can make a PDF that is very hard to analyse.)

I am not convinced that it's even theoretically possible to come up with good (under this definition) anti-spam heuristics in any sort of general environment, partly for reasons that run up against the fundamental spam problem.

(While current heuristics are effective, my strong impression is that they are a laboriously maintained and ever-evolving collection of more or less ad-hoc rules. This doesn't necessarily scale, and it's expensive.)

Written on 19 November 2009.

Last modified: Thu Nov 19 00:47:39 2009