2009-11-19
The corollary for effective anti-spam heuristics
Last time I mentioned that spammers were
perfectly capable of adapting their practices to defeat anti-spam
heuristics like requiring a valid EHLO
or reverse DNS, and so such
heuristics were, if effective and widely adopted, at best a temporary
fix. This raises an obvious corollary about good anti-spam heuristics.
Since spammers will adapt when it is both useful and possible, a good anti-spam heuristic is some characteristic of the message or of how it is transmitted that the spammer cannot easily change. While people have made various stabs at this in the past (and will no doubt continue to do so in the future), the problem for anti-spam efforts is that such characteristics have been hard to find, partly because spammers have proven to be very ingenious about finding ways to change them.
(For a small example, are anti-spam systems matching on the characteristic phrases of your advance fee frauds in email? No problem, just put your pitches in file attachments. I await with resignation the day the spammers start sending PDFs, not just Word .doc files, since a sufficiently ingenious spammer can make a PDF that is very hard to analyse.)
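To make the attachment dodge concrete, here is a minimal sketch in Python. It assumes a deliberately naive filter that only looks at the plain-text body. The phrase list, the `body_matches` function, and the sample pitch are all made up for illustration and are not taken from any real anti-spam system.

```python
from email.message import EmailMessage

# Hypothetical phrase list; a real filter would have a much larger,
# constantly updated set of rules.
FRAUD_PHRASES = ("transfer of funds", "next of kin", "strictly confidential")


def body_matches(msg: EmailMessage) -> bool:
    """Naive heuristic: look for known phrases in the text/plain body only."""
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return False
    text = body.get_content().lower()
    return any(phrase in text for phrase in FRAUD_PHRASES)


# Made-up pitch text for the example.
PITCH = "I need your help with a strictly confidential transfer of funds."

# The pitch sits in the message body, where the heuristic looks.
inline = EmailMessage()
inline.set_content(PITCH)

# The same pitch moved into an attachment; the body is now innocuous.
attached = EmailMessage()
attached.set_content("Please see the attached document.")
attached.add_attachment(
    PITCH.encode("utf-8"),
    maintype="application",
    subtype="msword",
    filename="offer.doc",
)

print(body_matches(inline))
print(body_matches(attached))
```

The filter flags the first message and passes the second, even though the pitch is word for word the same. Catching the second one means opening and parsing the attachment, which is exactly the extra work the spammer wants to force on you.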
I am not convinced that it's even theoretically possible to come up with good (under this definition) anti-spam heuristics in any sort of general environment, partly for reasons that run up against the fundamental spam problem.
(While current heuristics are effective, my strong impression is that they are a laboriously maintained and ever-evolving collection of more or less ad-hoc rules. This doesn't necessarily scale, and it's expensive.)