Operating spam and malware filtering is ultimately a social problem

November 30, 2019

Successfully filtering spam and malware is a technical issue, full of problems like recognizing new sorts of spam and malware, developing recognition rules, and not having your servers eaten by expensive code (even when people send you gigantic files, or malicious ones such as compressed archives that expand hugely or endlessly). However, operating spam and malware filtering is ultimately a social problem, because the people you are doing the filtering for need to be happy with what your system does and how you operate it.

No spam and malware filtering can be perfect, because of the fundamental problem of spam. This means that all filtering in operation is a tradeoff between rejecting good email that looks too suspicious and letting in too much bad email because it doesn't look clearly suspicious enough (or because you can't recognize it yet). Where you set this balance for various sorts of email ultimately comes down to what your users want and will accept, and also what sorts of email they get from where.
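As a toy illustration of this tradeoff, here is a minimal Python sketch. Everything in it is made up for the example: the `Message` class, the scores, and the thresholds are all hypothetical, and real filters don't get to know the ground truth for each message. It only shows that moving a single cutoff trades one kind of mistake for the other.

```python
from dataclasses import dataclass


@dataclass
class Message:
    score: float    # hypothetical spam score; higher means more suspicious
    is_spam: bool   # ground truth, known only because this is a toy example


def count_errors(messages, threshold):
    """Return (good_rejected, spam_accepted) when rejecting at score >= threshold."""
    good_rejected = sum(
        1 for m in messages if m.score >= threshold and not m.is_spam
    )
    spam_accepted = sum(
        1 for m in messages if m.score < threshold and m.is_spam
    )
    return good_rejected, spam_accepted


# Invented sample data, not measurements from any real mail system.
SAMPLE = [
    Message(1.0, False),
    Message(3.5, False),
    Message(6.0, False),  # legitimate email that looks suspicious
    Message(4.0, True),   # spam that doesn't look clearly suspicious
    Message(7.5, True),
    Message(9.0, True),
]

for threshold in (3.0, 5.0, 8.0):
    good_rejected, spam_accepted = count_errors(SAMPLE, threshold)
    print(
        f"threshold {threshold}: "
        f"{good_rejected} good rejected, {spam_accepted} spam accepted"
    )
```

With this sample, a threshold of 3.0 rejects two good messages and accepts no spam, 5.0 gives one of each, and 8.0 rejects no good messages but accepts two spam. None of these is the right answer in the abstract; which one is acceptable depends on what your users will put up with.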

(You may also have to force certain sorts of anti-malware filtering on people regardless of how they feel about it, because the risks are too high and the malware recognition too imperfect. One manifestation of this is how GMail and many other places reject a whole raft of attachment types despite potential valid uses for a number of them. A place with high enough security needs and concerns might reject all Microsoft Office attachments in email and tell outside people 'upload them to our upload service here instead'; this would be inconvenient for everyone, but inconvenience versus security is another social problem and tradeoff.)
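To make the attachment-type idea concrete, here is a rough Python sketch using the standard library's `email` package. The `BLOCKED_EXTENSIONS` set and the `blocked_attachments` helper are hypothetical and chosen purely for illustration; this is not GMail's list or anyone's actual policy, and a real filter would look at more than the file name.

```python
import os
from email.message import EmailMessage

# Illustrative only; an assumption for this sketch, not any provider's real list.
BLOCKED_EXTENSIONS = {".exe", ".js", ".vbs", ".scr"}


def blocked_attachments(msg):
    """Return the file names of attachments whose extension is blocked."""
    blocked = []
    for part in msg.iter_attachments():
        filename = part.get_filename() or ""
        ext = os.path.splitext(filename)[1].lower()
        if ext in BLOCKED_EXTENSIONS:
            blocked.append(filename)
    return blocked


# Build a small test message with one harmless and one blocked attachment.
msg = EmailMessage()
msg["Subject"] = "Quarterly report"
msg.set_content("See attached.")
msg.add_attachment(
    b"%PDF-1.4 placeholder", maintype="application", subtype="pdf",
    filename="report.pdf",
)
msg.add_attachment(
    b"MZ placeholder", maintype="application", subtype="octet-stream",
    filename="invoice.EXE",
)

print(blocked_attachments(msg))
```

A check this blunt will reject some perfectly legitimate files, which is the point: the decision to accept that inconvenience is a policy choice, not something the code can settle.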

One corollary to this is that perceptions matter even if the ultimate outcomes are the same, because perceptions are part of what drives people's reactions to how your spam filtering works. For example, having spam filtering that is a black box that's impossible for you to tune is different from having spam filtering with a lot of adjustments, and we can't say that one is universally better or worse in general. If you can't tune your spam filtering, on the one hand your users can't demand that you constantly tune it to deal with small issues, but on the other hand you may have to completely throw it away if significant issues come up. If you can tune it, actually tuning it may be considered one of your responsibilities (and people will blame you if you could have but didn't), but you may have a greater ability to deal with significant problems.

(In practice, spam levels and scores are mostly a copout because most people do not want to be tuning your spam filtering; they want it to just work.)

(This is an obvious observation and it's been in the back of my mind for some time, but for various reasons I feel like writing it down explicitly.)

Written on 30 November 2019.
