The Spamhaus DBL does get hits even with basic checks

March 18, 2016

The Spamhaus DBL is unlike their other blocklists in that it is for host and domain names, not IP addresses. As Spamhaus describes it:

The Spamhaus DBL is a realtime database of domains (typically web site domains) found in spam messages. Mail server software capable of scanning email message body contents for URIs can use the DBL to identify, classify or reject spam containing DBL-listed domains.

The intended primary use of the DBL is for message body scanning; you identify the hosts mentioned in URLs or URL-like things and run them past the DBL. You can also use it to check hostnames that appear in envelope information, like the MAIL FROM domain (and the EHLO name, and simply the reverse DNS name of the connecting IP), but the way Spamhaus has written it up suggests that this is not going to get very many hits.
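As a rough illustration of what 'run them past the DBL' involves (a sketch under assumptions, not any particular mailer's implementation): a DBL check is an ordinary DNS A record lookup of the domain with the DBL zone appended. The function names here are made up for the example, the URL regexp is deliberately crude, and the handling of return codes reflects my understanding that listings come back as 127.0.1.x answers; check Spamhaus's current documentation before relying on it.

```python
import ipaddress
import re
import socket
from urllib.parse import urlsplit

DBL_ZONE = "dbl.spamhaus.org"


def dbl_query_name(domain, zone=DBL_ZONE):
    """Build the DNS name to query, eg 'example.com.dbl.spamhaus.org'."""
    return domain.strip().rstrip(".").lower() + "." + zone


def dbl_lookup(domain, resolve=socket.gethostbyname):
    """Return the DBL answer (a string like '127.0.1.2') or None.

    'resolve' is injectable so this can be tested without network access.
    """
    try:
        answer = resolve(dbl_query_name(domain))
    except socket.gaierror:
        # No answer (or a lookup failure) is treated as 'not listed'.
        return None
    # Assumption: real listings are 127.0.1.x, and 127.0.1.255 signals
    # a rejected IP-address query instead of a listing.
    if answer.startswith("127.0.1.") and answer != "127.0.1.255":
        return answer
    return None


def _is_ip(host):
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def hosts_in_text(text):
    """Pull unique hostnames out of http(s) URLs in a message body."""
    hosts = []
    for url in re.findall(r"https?://[^\s<>\"']+", text):
        host = urlsplit(url).hostname
        # The DBL is for names, so bare IP addresses are skipped.
        if host and not _is_ip(host) and host not in hosts:
            hosts.append(host)
    return hosts
```

A body scanner would then call dbl_lookup() on each name from hosts_in_text() and act on any non-None answer. Note that Spamhaus restricts queries made through large public resolvers, so a real deployment needs its own resolver.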

(The DBL is not the only such domain based blocklist, of course.)

A while back I added DBL checking to my sinkhole SMTP server and then turned it on, checking all of the MAIL FROM domain, the EHLO name, and the reverse DNS of the connecting IP. I didn't really expect it to get any hits; I basically wanted to experiment. The result contained two surprises.
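To make the three checks concrete, here is a hedged sketch of gathering the names to check from one SMTP session. The Session class and its field names are hypothetical, invented for this example, and the DBL lookup itself is passed in as a function so the sketch stays independent of any particular DNS code. Names are checked as-is; whether to first reduce them to a registered domain is a decision this sketch doesn't make.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Session:
    """Hypothetical record of the envelope-level facts of one session."""
    mail_from: str            # as given in MAIL FROM, eg '<user@example.com>'
    ehlo_name: str            # the argument of EHLO/HELO
    rdns_name: Optional[str]  # reverse DNS of the connecting IP, if any


def candidate_names(s: Session) -> List[Tuple[str, str]]:
    """Return (source, name) pairs worth checking against the DBL."""
    names = []
    addr = s.mail_from.strip("<>")
    if "@" in addr:  # the null sender '<>' has no domain to check
        names.append(("mailfrom", addr.rsplit("@", 1)[1]))
    # EHLO address literals such as '[192.0.2.1]' are not domain names.
    if s.ehlo_name and not s.ehlo_name.startswith("["):
        names.append(("ehlo", s.ehlo_name))
    if s.rdns_name:
        names.append(("rdns", s.rdns_name))
    return [(src, n.rstrip(".").lower()) for src, n in names]


def first_dbl_hit(s: Session,
                  is_listed: Callable[[str], bool]) -> Optional[Tuple[str, str]]:
    """Return the first (source, name) that is_listed() flags, else None."""
    for src, name in candidate_names(s):
        if is_listed(name):
            return (src, name)
    return None
```

Recording which source produced the hit (MAIL FROM, EHLO, or reverse DNS) is cheap and makes later analysis of the logs much easier.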

The first surprise was that even in my modest little context, I see more than a few DBL hits. It's nowhere near the level of the SBL in general (especially the SBL CSS), which I check first, but it does happen enough that it's easy to find rejections that are due to it. This suggests that I should look into using the DBL alongside the SBL in our real mail system's spam filtering.

(I want to do some actual analysis there, but that'll be another entry.)

The second surprise is that a lot of the mail senders using DBL listed domains were and are sending from their own servers, and those servers were not listed in the SBL or in fact any of Spamhaus's IP based DNSBLs. Often these people seem to have been sending from the same IP address for quite a while. This is very much not what I expected; I expected that if you were a DBL listed operation, your sending servers would wind up listed in the SBL in short order for, well, sending spam. Instead I see a number of persistent DBL-listed senders with their own static server IPs who are (still) not SBL listed.
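For contrast with the domain lookup, an IP based DNSBL such as the SBL is queried by reversing the IP address and appending the zone. The following is a small sketch of that, plus a tally of how DBL and SBL results overlap across a set of senders; the function names and the shape of the input records are assumptions made up for the illustration, not anything from a real log format.

```python
import ipaddress
from collections import Counter


def dnsbl_query_name(ip, zone="sbl.spamhaus.org"):
    """Build an IP DNSBL query name, eg '1.2.0.192.sbl.spamhaus.org'."""
    # reverse_pointer gives the reversed octets (or nibbles for IPv6)
    # followed by the reverse DNS suffix, which is stripped off here.
    rev = ipaddress.ip_address(ip).reverse_pointer
    for suffix in (".in-addr.arpa", ".ip6.arpa"):
        if rev.endswith(suffix):
            rev = rev[:-len(suffix)]
    return rev + "." + zone


def overlap_tally(results):
    """Count senders by (dbl_listed, sbl_listed) outcome.

    'results' is an iterable of (dbl_listed, sbl_listed) booleans,
    one pair per sender (a hypothetical input shape).
    """
    labels = {
        (True, True): "both",
        (True, False): "dbl-only",
        (False, True): "sbl-only",
        (False, False): "neither",
    }
    return Counter(labels[(bool(d), bool(s))] for d, s in results)
```

The 'dbl-only' count is the interesting number here: it is the additional hit rate the DBL gives on top of an SBL check that runs first.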

(Often the IP addresses aren't even on very many other DNS blocklists, at least out of the ones that I check these days.)

This matters to me because one of the reasons I expected the DBL to have a low (additional) hit rate for things like MAIL FROM checks was that I thought there would be a much bigger overlap between the SBL and the DBL than there is. This expectation of low hit rates is why I haven't really looked at even simple DBL usage before now.

(The moral, obviously, is to validate my anti-spam feelings instead of just assuming. A more general moral is that I should think about general infrastructure for doing experiments to measure potential hit rates on things like this. Some amount of things can be looked at in retrospect based on logs, but not everything.)
