== How many recent sender domains are in the Spamhaus DBL

The [[Spamhaus DBL https://www.spamhaus.org/dbl/]] is, well, let's quote it directly:

> The Spamhaus DBL is a realtime database of domains (typically web site
> domains) found in spam messages. [...]

Per Spamhaus's documentation, the recommended or best way of using the DBL is to check URLs in incoming messages against it. However, you can also use it to check domain names from other sources, such as DNS hostnames, _EHLO_ claimed names, and the host or domain name in the envelope sender address (the SMTP _MAIL FROM_).

For reasons beyond the scope of this entry, I got curious about how many of the domains sending [[us https://www.cs.toronto.edu/]] email over the recent past might be (still) listed in the DBL. To get a rough idea of this, I extracted the sender domain for all accepted email on [[our external MX gateway ../sysadmin/AddingMailGateway]] for roughly the past ten days and checked them all. The headline results surprised me:

> Out of 10,397 different sending domains, 1,422 were on the DBL.

This is a lot more than I expected (it works out to roughly 13.7% of the domains). Note that this is a count of domains, not email volume; to put it one way, 'gmail.com' is one domain just as 'aftencia.review' is, but the former is sending us many more email messages than the latter.

Since this is email the gateway accepted, it excludes email that was rejected during the SMTP conversation for [[various reasons CSLabSpamFilteringII]]. I've noticed that there's a fairly decent correlation between SBL listed IPs and DBL listed sender domains (eg many IPs that are on [[the SBL CSS https://www.spamhaus.org/css/]] seem to use _MAIL FROM_s that are in the DBL, probably unsurprisingly).

(I'm presenting such relatively odd numbers because it's much more work to get more interesting ones, such as what percentage of accepted email messages those DBL-listed senders are responsible for. Crude shell scripts don't make what are effectively cross-table joins very easy.
Also, I started out expecting a very low DBL hit rate, which would have made detailed stats fairly pointless.)

PS: While there were quite a number of new TLDs in the DBL listed domains, it turns out that the three most common TLDs were .com, .net, and .eu (followed by .download and .xyz). However, somewhat over half of the .net domains come from .in.net; if considered separate from .net, it would be the fourth most common 'TLD' (and .net would drop out of the top five).
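
For anyone who wants to try a similar domain count, here is a minimal Python sketch of checking a list of sender domains against the DBL over DNS. It is not the shell scripting behind the numbers above. The function names (_resolve_a_, _dbl_listed_, _count_listed_) are made up for illustration, and the code assumes that the DBL is queried as _<domain>.dbl.spamhaus.org_, that listings come back as A records in 127.0.1.x, and that 127.255.255.x answers are error codes rather than listings. Check Spamhaus's current documentation and usage terms before relying on any of this.

```python
import socket
from typing import Callable, Iterable, List, Optional, Tuple

# Assumption: the public DBL zone name; a data feed setup would differ.
DBL_ZONE = "dbl.spamhaus.org"

Resolver = Callable[[str], Optional[str]]


def resolve_a(name: str) -> Optional[str]:
    """Return one IPv4 address for name, or None if it does not resolve."""
    try:
        return socket.gethostbyname(name)
    except (OSError, UnicodeError):
        # NXDOMAIN, timeouts, and malformed names all count as 'no answer'.
        return None


def normalize(domain: str) -> str:
    """Lowercase a domain and strip whitespace and any trailing dot."""
    return domain.strip().rstrip(".").lower()


def dbl_listed(domain: str, resolver: Resolver = resolve_a) -> bool:
    """Report whether the DBL appears to list this domain."""
    answer = resolver(f"{normalize(domain)}.{DBL_ZONE}")
    if answer is None:
        return False
    # Assumption: real listings are 127.0.1.x, except 127.0.1.255, which
    # signals a prohibited IP query. 127.255.255.x answers are errors.
    return answer.startswith("127.0.1.") and answer != "127.0.1.255"


def count_listed(
    domains: Iterable[str], resolver: Resolver = resolve_a
) -> Tuple[int, List[str]]:
    """Return (number of distinct domains, sorted list of listed ones)."""
    unique = {normalize(d) for d in domains if d.strip()}
    listed = sorted(d for d in unique if dbl_listed(d, resolver))
    return len(unique), listed
```

The resolver is a parameter so that the counting logic can be exercised without touching the network. In real use you would feed _count_listed_ the sender domains extracted from your mail logs, and you would need a DNS resolver that Spamhaus accepts queries from.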