When 'simple' DNS blocklists work well for you

July 27, 2016

I've written about how we can divide DNS blocklists into 'simple' and 'complex' ones, where simple DNSBLs basically list things based on them sending spam or other bad stuff without trying to do more complex things like assess how much legitimate traffic also comes from the source. To put it one way, if a DNSBL lists one of GMail's outgoing SMTP servers because it sent some spam, it's almost certainly a simple one. I also said that rejecting email based on a simple DNSBL isn't necessarily a mistake, so it's time to explain that.

Suppose that you have a mail system that generally receives a low volume of legitimate email; for example, you might be operating a personal email server. Suppose that you also start getting spam. Spammers almost never go away, so your spam volume is very likely to trend up over time and reach a point where most of your incoming email is spam. In this environment, a listing in a simple DNSBL is a fairly strong confirmation signal that this new email is really spam. It's much more likely that you're getting spam email from an IP that's been detected as spamming than that an innocent person has chosen to send you legitimate email from an IP that also sent spam and got listed in the DNSBL. The latter could happen, but the odds are low.

We've sort of seen this before. If the legitimate email rate is low and the DNSBL's 'false positive' rate on it is also low, the odds that a positive signal from the DNSBL means that an email is spam are very high. You can make the odds even higher by whitelisting known good sources.

(Of course anti-spam precautions aren't evaluated purely on percentages; the absolute number of legitimate messages blocked matters. Here the low volume helps, as there just aren't that many legitimate emails to get blocked.)
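
To make the percentages and the absolute numbers concrete, here is a small sketch of the arithmetic using Bayes' rule. All of the numbers are made up for illustration (95% of incoming mail is spam, the DNSBL lists the source of half of that spam, and it lists the source of 2% of legitimate mail); they are assumptions, not measurements from any real mail system or DNSBL.

```python
def spam_probability_given_listed(spam_fraction, hit_rate_spam, hit_rate_legit):
    """P(message is spam | its source IP is listed), by Bayes' rule.

    spam_fraction:  fraction of all incoming mail that is spam
    hit_rate_spam:  fraction of spam whose source IP is in the DNSBL
    hit_rate_legit: fraction of legitimate mail whose source IP is in
                    the DNSBL (the 'false positive' rate)
    """
    listed_spam = spam_fraction * hit_rate_spam
    listed_legit = (1.0 - spam_fraction) * hit_rate_legit
    total_listed = listed_spam + listed_legit
    if total_listed == 0:
        raise ValueError("the DNSBL never lists anything with these rates")
    return listed_spam / total_listed


def legit_blocked(total_messages, spam_fraction, hit_rate_legit):
    """Expected number of legitimate messages rejected outright."""
    return total_messages * (1.0 - spam_fraction) * hit_rate_legit


# Illustrative, invented numbers for a small personal mail server.
p = spam_probability_given_listed(0.95, 0.50, 0.02)
blocked = legit_blocked(1000, 0.95, 0.02)
print(f"P(spam | listed) = {p:.4f}")
print(f"legitimate messages blocked per 1000 received: {blocked:.1f}")
```

With these invented inputs a DNSBL hit means spam about 99.8% of the time, and roughly one legitimate message per thousand received gets rejected. Raising the legitimate share of the mail stream (a lower `spam_fraction`) pulls the first number down and pushes the second one up, which is why the same DNSBL looks much worse at a site with a lot of real email.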

Similar logic can be applied to a lot of anti-spam heuristics; many things look good when they're dealing with a stream of email that's mostly or almost entirely spam. Block on bad EHLO greetings? Sure, why not, especially since GMail and the other big people do generally get those things right.

(GMail will send you spam too, of course, but statistically a new legitimate sender is much more likely to be using GMail or one of the other big places than an email server in the middle of nowhere. And yes, there are downsides to too many people adopting this sort of attitude to both heuristics and new mail sending machines in surprising places; ask anyone trying to send personal email from a new small home mail server and get it accepted by places.)


Comments on this page:

By David at 2016-07-31 02:05:21:

Respectfully, I disagree with the premise that simple DNSBLs are useful. Four or five years ago I tried out virtually every notable DNSBL on my personal MTA and after a good deal of pain came away with only three sophisticated DNSBLs worth trusting. Obviously Spamhaus Zen is king of the hill, then Barracuda (too aggressive for many but works well for small relays per the flow-scale logic described in the post), and finally Hostkarma though this has been dropped due to significant false positives during the last year.

The classic, big-gun "simple" DNSBL is SORBS. SORBS could not care less if you are a Google, Microsoft or Yahoo MTA. If a spam or two appears, bam it's listed. I tried SORBS a couple of times and it always took less than a week for a false positive to obstruct an important message from someone I know. At one time SORBS DUHL was useful for blocking residential and small business dynamic IP addresses, but the Spamhaus PBL appeared and quickly demonstrated greater quality and depth.

Spammers have become shockingly good at circumventing virtually any and all countermeasures. For the last eighteen months I've run a pure whitelist email system and this is the only way to effectively block spam without running a dedicated third-party appliance. I generate new email address variants on a frequent schedule and whitelist response paths using a variety of approaches including MTA reverse-DNS, obfuscating-ESP account number and SPF-assured envelope sender. Contrary to the assertion of the post, new Google correspondents are much more likely to be spammers than real people and unknown GMailers are parked in soft-bounce limbo until I have a chance to look at the sending address. Just added logic to automatically report GMail addressed to one particular role-account straight to Google's abuse system. Unknown Outlook senders are simply hard-bounced.

Whitelisting has worked very well, but the sheer volume of garbage connections became too much and overwhelmed the logs this year, almost as bad as the Rustock deluge of late 2010, so I spent a few days writing a "stupid MTA" filter that helpfully eliminates roughly 95% of it. This filter checks for a good non-generic reverse-DNS, matching forward-DNS, and that the IP is not listed on Spamhaus Zen or Barracuda. If any of these tests fail the TCP connection SYN is ignored and much log noise thus avoided.

It's so bad now that if I could I would cease using email altogether. Sender-escrow / sender-pays never happened, but despite a variety of objections and pitfalls would solve the problem decisively.
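
The three checks in the "stupid MTA" filter David describes could be sketched roughly as follows. This is not his actual filter, only an illustration of the decision logic. The resolver callables (`reverse_lookup`, `forward_lookup`, `is_listed`), the `GENERIC_RDNS` pattern, and the zone names passed in are all assumptions made for the example. The DNSBL query name follows the usual convention of reversing the IPv4 octets and appending the list's zone.

```python
import re

# Hypothetical heuristic for "generic" reverse DNS: names that embed the
# IP address or that look like dynamic/residential pools.
GENERIC_RDNS = re.compile(
    r"(\d{1,3}[-.]){3}\d{1,3}|dynamic|dhcp|pool|dsl|cable", re.IGNORECASE
)


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to look up: reversed octets, then the zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone


def accept_connection(ip, reverse_lookup, forward_lookup, is_listed, zones):
    """Return True only if the connecting IP passes all three checks.

    reverse_lookup(ip)   -> hostname or None   (PTR lookup)
    forward_lookup(name) -> list of IP strings (A lookup)
    is_listed(qname)     -> bool               (DNSBL query returned a record)
    """
    name = reverse_lookup(ip)
    if not name or GENERIC_RDNS.search(name):
        return False  # no reverse DNS, or it looks generic
    if ip not in forward_lookup(name):
        return False  # forward DNS does not point back at the IP
    # Reject if any of the configured DNSBL zones lists the IP.
    return not any(is_listed(dnsbl_query_name(ip, zone)) for zone in zones)


# Stub resolvers so the sketch runs without touching the network.
ptr_records = {
    "192.0.2.10": "mail.example.org",
    "192.0.2.20": "192-0-2-20.dynamic.example.net",
    "192.0.2.30": "mx.example.com",
}
a_records = {
    "mail.example.org": ["192.0.2.10"],
    "mx.example.com": ["192.0.2.30"],
}
listed_names = {"30.2.0.192.zen.spamhaus.org"}
zones = ["zen.spamhaus.org", "b.barracudacentral.org"]


def check(ip):
    return accept_connection(
        ip,
        ptr_records.get,
        lambda name: a_records.get(name, []),
        lambda qname: qname in listed_names,
        zones,
    )
```

In a real deployment the stubs would be replaced by actual DNS lookups, and the decision would have to be wired into whatever drops the connection before the MTA sees it, which this sketch does not attempt.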

Written on 27 July 2016.

Last modified: Wed Jul 27 01:09:35 2016