Wandering Thoughts archives

2015-10-05

How many recent sender domains are in the Spamhaus DBL

The Spamhaus DBL is, well, let's quote it directly:

The Spamhaus DBL is a realtime database of domains (typically web site domains) found in spam messages. [...]

Per Spamhaus's documentation, the recommended or best way of using the DBL is to check URLs in incoming messages against it. However, you can also use it to check domain names from other sources, such as DNS hostnames, EHLO claimed names, and the host or domain name in the envelope sender address (the SMTP MAIL FROM).

For reasons beyond the scope of this entry, I got curious about how many of the domains sending us email over the recent past might be (still) listed in the DBL. To get a rough idea of this, I extracted the sender domain for all accepted email on our external MX gateway for roughly the past ten days and checked them all. The headline results surprised me:
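As a rough illustration of this sort of check (not the crude shell scripts I actually used), here is a minimal Python sketch. It assumes the usual DNS blocklist convention of looking up the domain under the `dbl.spamhaus.org` zone and treating a `127.0.1.x` answer as a listing; as I understand Spamhaus's documentation, `127.0.1.255` and the `127.255.255.x` range are error codes rather than listings, so they are excluded. The function names and the injectable `resolve` parameter are made up for this example.

```python
import socket

DBL_ZONE = "dbl.spamhaus.org"


def sender_domain(mail_from):
    """Return the lowercased domain of an envelope sender, or None for the null sender."""
    _, sep, domain = mail_from.strip("<>").rpartition("@")
    return domain.lower().rstrip(".") if sep and domain else None


def dbl_listed(domain, resolve=socket.gethostbyname):
    """True if the domain gets a 127.0.1.x answer from the DBL zone.

    `resolve` is a hypothetical hook so that the lookup can be replaced in tests.
    """
    try:
        answer = resolve(f"{domain}.{DBL_ZONE}")
    except OSError:
        # NXDOMAIN and other lookup failures: treat the domain as not listed.
        return False
    # Assumption: 127.0.1.255 is an error code, not a listing.
    return answer.startswith("127.0.1.") and answer != "127.0.1.255"


def listed_domains(mail_froms, resolve=socket.gethostbyname):
    """Deduplicate sender domains and return the set that is DBL-listed."""
    domains = {d for d in map(sender_domain, mail_froms) if d}
    return {d for d in domains if dbl_listed(d, resolve)}
```

Deduplicating before the lookups means each domain is queried once no matter how many messages it sent, which is also why the result is a count of domains and not of messages.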

Out of 10,397 different sending domains, 1,422 (about 13.7%) were on the DBL.

This is a lot more than I expected. Note that this is a count of domains, not email volume; to put it one way, 'gmail.com' is one domain just as 'aftencia.review' is, but the former is sending us many more email messages than the latter.

Since this is email the gateway accepted, it excludes email that was rejected during the SMTP conversation for various reasons. I've noticed that there's a fairly decent correlation between SBL listed IPs and DBL listed sender domains (eg many IPs that are on the SBL CSS seem to use MAIL FROMs that are in the DBL, probably unsurprisingly).

(I'm presenting such relatively odd numbers because it's much more work to get more interesting ones, such as what percentage of accepted email messages those DBL-listed senders are responsible for. Crude shell scripts don't make what are effectively cross-table joins very easy. Also, I started out expecting a very low DBL hit rate, which would have made detailed stats fairly pointless.)
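For what it's worth, the 'join' needed for the more interesting number is small once everything is in one program. The following is only a sketch under assumptions: it supposes you already have one sender domain per accepted message and a set of DBL-listed domains, and the function name and sample data are invented for illustration.

```python
from collections import Counter


def dbl_message_share(message_domains, listed):
    """Fraction of messages whose sender domain is in the `listed` set.

    message_domains: iterable with one sender domain per accepted message.
    listed: set of domains found in the DBL.
    """
    per_domain = Counter(message_domains)
    total = sum(per_domain.values())
    hits = sum(count for domain, count in per_domain.items() if domain in listed)
    return hits / total if total else 0.0


# Made-up data: nine messages from one unlisted domain, one from a listed one.
share = dbl_message_share(["big.example"] * 9 + ["spammy.example"], {"spammy.example"})
```

With that made-up data, half of the distinct domains are listed but they account for only a tenth of the messages, which is the gap between a domain count and email volume.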

PS: While there were quite a number of new TLDs in the DBL listed domains, it turns out that the three most common TLDs were .com, .net, and .eu (followed by .download and .xyz). However, somewhat over half of the .net domains come from .in.net; if considered separate from .net, it would be the fourth most common 'TLD' (and .net would drop out of the top five).
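A tally like this, with .in.net optionally split out from .net, can be sketched as follows. This is an illustration and not the method I used; the `SPECIAL_SUFFIXES` tuple and the function names are assumptions made for the example.

```python
from collections import Counter

# Suffixes to count separately from their parent TLD (an assumption for this sketch).
SPECIAL_SUFFIXES = (".in.net",)


def tld_bucket(domain):
    """Return the 'TLD' a domain is counted under, e.g. '.com' or '.in.net'."""
    domain = domain.lower().rstrip(".")
    for suffix in SPECIAL_SUFFIXES:
        if domain.endswith(suffix):
            return suffix
    return "." + domain.rpartition(".")[2]


def top_tlds(domains, n=5):
    """Return the n most common buckets as (bucket, count) pairs."""
    return Counter(tld_bucket(d) for d in domains).most_common(n)
```

Emptying `SPECIAL_SUFFIXES` folds .in.net back into .net, which gives the other way of ranking them.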

spam/CSLabSpamhausDBLHits2015-10-05 written at 21:54:16

I don't trust Linux distributions to leave directories alone

In yesterday's entry I said in passing:

My view is that basically every directory that your OS distribution creates is best left alone and unused, and thus should be left on the root filesystem. [...]

In theory there are any number of directories on typical Linux distributions (and typical Unix distributions in general) that should be safe for you to use without disturbance by the OS. There's things like /usr/local, /home, /opt, and yes, some of you are laughing right now. In practice, I've been through enough experiences that I no longer trust Linux distributions to leave any directories they know about alone. Sooner or later someone is going to drop files or subdirectories in there, or change the permissions or SELinux context, or mandate that they must be on the root filesystem because of some requirement, and so on and so forth. Sometimes the guilty party will be the OS itself; sometimes it will be third parties who are packaging things for the OS and decide that /opt or /usr/local or whatever makes a great place to put their stuff.

The practical reality of modern Linux life is that the only directories you can trust the OS not to screw with are directories that the OS has no idea exist, ie ones that you make up and create yourself. If the OS creates it, even if it's empty and explicitly marked 'for local sysadmin use only', using it is dangerous in practice. Sooner or later you're likely to regret it.

(Sometimes you have no choice because a program has been configured to look there or restrict itself to things there.)

Since directory names for local things are generally arbitrary anyways, you should make your life simpler and pick your own new names (I suggest organization-based ones).

The one exception to this is that if you package things in the distribution's native packaging scheme (.debs, RPMs, etc), my strong opinion is that you should default to putting them into the normal system locations even if it's local software. Sometimes this won't be possible (eg if you're packaging a conflicting version of a program), but when it is I think it's going to make your life easier. And as I've found out, there are things that really want to use the system locations.

linux/DistroDirectoryDistrust written at 01:26:37



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.