Sometimes, not trying to reject some sort of spam is the right answer

March 30, 2018

I've written before about not doing anything about a temporary spate of spam, and it remains a useful guideline. But sometimes you're pretty convinced that certain spam patterns are long-standing, and it turns out that the right answer is still to not do anything, however reluctantly. As it happens, I have an example that we recently decided on.

One of the patterns we observe is that a decent amount of the attachments we get come from IPs listed in the Spamhaus Zen DNSBL. A further pattern we've seen is that a decent amount of those are detected as malware (see eg this), and we've also seen that there are some highly active Zen-listed sources (see this set of numbers from January). Given all of this, I recently put forward the idea of rejecting all messages from Zen-listed IPs that had an attachment, for the same broad reason that we reject some sorts of attachments; we're almost completely sure that these emails are bad and they're often dangerous, but our commercial anti-spam package may not pick the malware up on its own and cause us to reject them.

When I put it that way, this probably sounds good, and certainly that's how I thought of the idea when I proposed it. Then I put together some numbers, based on how many messages we would actually be shielding users from if we did this. It turned out that many of the messages were already being rejected and almost all of the remaining messages were already being scored as spam (and when I say 'almost all', I mean 816 out of 820).

We had a long discussion and decided that we weren't going to reject these messages. There are local reasons for why not that I'm not going to get into, but apart from them there is a larger one that caused me to not argue too hard for the rejections, which is that this doesn't seem like something with a high payoff in practice. It's not just that the volume is not huge; it's also that basically everything is already being detected as bad (and at least some of our users are discarding the email based on that).

There's an almost infinite set of things that you could do to reduce spam, with some payoff (and many with a reasonably worthwhile one). The challenge about anti-spam work is not finding things to do to reduce spam, it is partly in not doing things, because every thing you do has a cost that goes with its benefits. Sometimes that cost is too high relative to the gain, and it's not because the particular sort of spam is temporary; it's because the sort of spam is already being blocked well enough as it is, even though you could do better.

Sure, some of our users could ignore the 'this is probably spam' warnings and fall for malware that we allowed to be delivered to them. There could even be bad stuff in those four email messages that weren't scored as spam (to be honest, there probably was at least spam). But our existing system is doing well enough even though it's not perfect, and it's already complicated enough. So doing nothing this time is the right answer.

(It helps here that in the past I've enthusiastically put in some clever anti-spam trick, only to have it make somewhat less impact than I was hoping for. That's not a good feeling either.)

Written on 30 March 2018.
« The problem with Linux's 'predictable network interface names'
My current set of Firefox Quantum (57+) addons »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Mar 30 01:44:00 2018
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.