How many recent sender domains are in the Spamhaus DBL

October 5, 2015

The Spamhaus DBL is, well, let's quote it directly:

The Spamhaus DBL is a realtime database of domains (typically web site domains) found in spam messages. [...]

Per Spamhaus's documentation, the recommended or best way of using the DBL is to check URLs in incoming messages against it. However, you can also use it to check domain names from other sources, such as DNS hostnames, EHLO claimed names, and the host or domain name in the envelope sender address (the SMTP MAIL FROM).
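To make the 'check a MAIL FROM domain' case concrete, here is a minimal sketch in Python. It is not the script I used. It assumes the usual DNS blocklist query convention (look up the domain with the DBL zone appended), that listings come back as addresses in 127.0.1.x, and that 127.0.1.255 and the 127.255.255.x answers are error codes rather than listings, which is how I understand Spamhaus's documentation. The function names and the injectable resolve parameter are my own inventions for illustration.

```python
import socket

# Assumption: the public DBL zone name, per Spamhaus's documentation.
DBL_ZONE = "dbl.spamhaus.org"


def sender_domain(mail_from):
    """Return the lowercased domain of an envelope sender, or None for
    the null sender (<>) or anything without an '@'."""
    addr = mail_from.strip().strip("<>")
    if "@" not in addr:
        return None
    return addr.rpartition("@")[2].rstrip(".").lower()


def dbl_query_name(domain, zone=DBL_ZONE):
    """Build the DNS name to look up for a given domain."""
    return domain.strip().rstrip(".").lower() + "." + zone


def is_listed(domain, resolve=socket.gethostbyname):
    """True if the lookup returns something that looks like a listing.

    A failed lookup (including NXDOMAIN, which means 'not listed') is
    treated as not listed; this also hides real DNS failures, which a
    careful script would want to tell apart.
    """
    try:
        answer = resolve(dbl_query_name(domain))
    except OSError:
        return False
    # Assumption: 127.0.1.x is a listing, except 127.0.1.255, which
    # signals a prohibited IP-address query. Other answers, such as
    # 127.255.255.x, are treated as errors and not as listings.
    return answer.startswith("127.0.1.") and answer != "127.0.1.255"
```

Usage is along the lines of is_listed(sender_domain("<someone@example.com>")). Queries sent through large public DNS resolvers may get an error answer instead of real data, so this needs to run against your own resolver.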

For reasons beyond the scope of this entry, I got curious about how many of the domains sending us email over the recent past might be (still) listed in the DBL. To get a rough idea of this, I extracted the sender domain for all accepted email on our external MX gateway for roughly the past ten days and checked them all. The headline results surprised me:

Out of 10,397 different sending domains, 1,422 were on the DBL.

This is a lot more than I expected. Note that this is a count of domains, not email volume; to put it one way, 'gmail.com' is one domain just as 'aftencia.review' is, but the former is sending us many more email messages than the latter.

Since this is email the gateway accepted, it excludes email that was rejected during the SMTP conversation for various reasons. I've noticed that there's a fairly decent correlation between SBL listed IPs and DBL listed sender domains (eg many IPs that are on the SBL CSS seem to use MAIL FROMs that are in the DBL, probably unsurprisingly).

(I'm presenting such relatively odd numbers because it's much more work to get more interesting ones, such as what percentage of accepted email messages those DBL-listed senders are responsible for. Crude shell scripts don't make what are effectively cross-table joins very easy. Also, I started out expecting a very low DBL hit rate, which would have made detailed stats fairly pointless.)
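For what it's worth, the 'join' is not much code once you leave the shell. The following is a rough Python sketch of what I'd want, not something I ran for this entry. It assumes you already have one sender domain per accepted message and a set of the domains that turned out to be DBL-listed; the function name and the shape of the result are made up for illustration.

```python
from collections import Counter


def dbl_message_share(sender_domains, listed):
    """Summarize DBL hits by distinct domain and by message volume.

    sender_domains: iterable with one domain per accepted message.
    listed: set of lowercased domains that were found in the DBL.
    """
    # Messages per domain; this is the 'table' the shell made awkward.
    per_domain = Counter(d.lower() for d in sender_domains)
    return {
        "domains": len(per_domain),
        "listed_domains": sum(1 for d in per_domain if d in listed),
        "messages": sum(per_domain.values()),
        "listed_messages": sum(
            n for d, n in per_domain.items() if d in listed
        ),
    }
```

With that, the difference between 'one domain in N is listed' and 'one message in N came from a listed domain' falls out directly, which is exactly the distinction that a bare count of domains hides.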

PS: While there were quite a number of new TLDs in the DBL listed domains, it turns out that the three most common TLDs were .com, .net, and .eu (followed by .download and .xyz). However, somewhat over half of the .net domains come from .in.net; if considered separate from .net, it would be the fourth most common 'TLD' (and .net would drop out of the top five).
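A tally like this, with .in.net broken out as its own 'TLD', can be sketched in a few lines of Python. This is an illustration rather than what I actually ran; the PSEUDO_TLDS tuple and the function names are hypothetical, and a more careful version would use a real public suffix list instead of a hand-written special case.

```python
from collections import Counter

# Hypothetical list of second-level suffixes to count separately.
PSEUDO_TLDS = ("in.net",)


def tld_of(domain, pseudo_tlds=PSEUDO_TLDS):
    """Return the TLD of a domain (with a leading dot), treating any
    configured pseudo-TLD such as .in.net as a TLD of its own."""
    d = domain.rstrip(".").lower()
    for suffix in pseudo_tlds:
        if d.endswith("." + suffix):
            return "." + suffix
    return "." + d.rpartition(".")[2]


def top_tlds(domains, n=5):
    """Return the n most common TLDs as (tld, count) pairs."""
    return Counter(tld_of(d) for d in domains).most_common(n)
```

Passing pseudo_tlds=() to tld_of gives the plain view where .in.net domains are counted under .net.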


Last modified: Mon Oct 5 21:54:16 2015