A quick look at some spam filtering stats from our system
April 26, 2011
It's been a while since I thought about generating statistics about what our anti-spam systems are doing and seeing, which probably means that it's about time to do it again. I'm going to look at the past week's statistics, mostly because we upgraded the spam filtering machine recently and we don't have old logs any more. Unfortunately this is not an ideal week to look at; Friday was a holiday here, so the numbers are going to be lower than usual.
First, the disclaimers: not all spam makes it to our spam tagging and
filtering system. For example, some people immediately reject email from
IP addresses that are in the Spamhaus Zen list; since this rejects at
So, over the past seven days we saw:
This is well under the level of spam that most sources report. It's possible that our stats are skewed by various things; for example, it may be that most of the active targets of spam have opted in to spam rejection, and so spam to them never makes it to these numbers. (Trying to quantify the volume of rejections is a project for later.)
Our spam system gives messages a spam score from 0 to 100 (with some decimal points of precision allowed; theoretically this is some sort of probability measure). The breakdown of scores is somewhat interesting:
Our current threshold for calling something spam is 60 points or more. These numbers suggest that we could significantly raise the threshold without having a material effect on our spam filtering; on the other hand, since it would have no material effect, there seems to be no reason to do it (other than possibly user perception, and I don't know if users pay any attention to this).
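To make the threshold question concrete, here is a minimal Python sketch of how one might bucket scores and count the messages that would stop being tagged if the threshold went up. Only the 0 to 100 score range and the 60-point cutoff come from this entry; the function names, the ten-point bucket width, and the sample scores are all made up for illustration, and this is not how our actual filtering system computes anything.

```python
from collections import Counter

# The cutoff described above: 60 points or more counts as spam.
SPAM_THRESHOLD = 60.0


def is_spam(score, threshold=SPAM_THRESHOLD):
    """Return True if a score meets or exceeds the spam threshold."""
    return score >= threshold


def score_histogram(scores, width=10):
    """Count scores per bucket, keyed by each bucket's lower bound.

    Buckets are [low, low + width); a score of exactly 100 is folded
    into the top bucket so that it does not get a bucket of its own.
    (The bucket width is an assumption made for this sketch.)
    """
    hist = Counter()
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score out of range: {score}")
        low = min(int(score // width) * width, 100 - width)
        hist[low] += 1
    return hist


def reclassified(scores, old_threshold, new_threshold):
    """Count messages tagged as spam at old_threshold but not at new_threshold."""
    return sum(1 for s in scores if old_threshold <= s < new_threshold)


if __name__ == "__main__":
    # Hypothetical scores, not real data from our logs.
    sample = [2.5, 59.9, 60.0, 75.3, 99.1, 100.0]
    for low, count in sorted(score_histogram(sample).items()):
        print(f"{low:3d}+ : {count}")
    print("no longer spam at a threshold of 80:", reclassified(sample, 60, 80))
```

If `reclassified()` comes back as a tiny fraction of the total for a much higher threshold, that is the "no material effect" situation: almost everything scored as spam is scored well above the cutoff.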
(Note that this is not the same system that I did my old spam stats for, and so if I do regular reports they are going to look different and not be comparable to the old numbers.)