Attachment types that we see in email from Zen-listed IP addresses
As part of yesterday's entry, I broke down what percent of various sorts of attachments we received came from IPs listed in zen.spamhaus.org. Today I'm going to basically invert the question and ask instead what sort of attachment types we get sent from Zen-listed IPs. As before, I'm going to be using the past nine weeks and a bit of logs, because our weekly log rotation makes it easy to do that.
(Because our attachment type logging comes after our RCPT TO time rejections, this is all based on email to people who don't reject all email from Zen-listed IPs.)
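For context on what "Zen-listed" means mechanically: a DNS blocklist such as zen.spamhaus.org is queried by reversing the octets of the IPv4 address and looking the result up under the list's zone. The following is a minimal sketch of that convention, not a description of how our Exim configuration does it. The function names `dnsbl_query_name` and `is_listed` are made up for illustration, and a real check would also want to look at the returned address, since a DNSBL can use specific return values to signal things other than a listing.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="zen.spamhaus.org"):
    """Return True if the DNSBL zone has an A record for this IP.

    Simplification (an assumption of this sketch): any answer counts as
    "listed" and any lookup failure counts as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.1")` builds the name `1.2.0.192.zen.spamhaus.org`; only `is_listed` actually touches the network.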
First up we have a collection of attachment types without MIME file names (and thus without MIME file extensions). For these I have to rely on the declared MIME types and sniffed file type information, and they break down like this:
  102  [Word XML]
   13  application/msword
   11  image/jpeg
   10  message/rfc822
   10  application/xml [either Word or Excel]
    6  [Excel XML]
    1  image/gif
Possibly this means that I should recurse inside message/rfc822 MIME parts. Some of these were file attachments; others I believe were the sole component of the email message.
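As a rough sketch of what recursing inside message/rfc822 parts could look like, here is a version using Python's standard `email` package (this is not what our actual logging uses). In that package, a message/rfc822 part reports itself as multipart and holds the embedded message as its single sub-part, so one recursive walk covers both ordinary multipart containers and attached messages. The names `iter_parts` and `build_demo_message` are hypothetical helpers invented for this example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def iter_parts(part, depth=0):
    """Yield (depth, content_type, filename) for a part and everything inside it."""
    yield depth, part.get_content_type(), part.get_filename()
    # is_multipart() is true for multipart/* containers and also for
    # message/rfc822 parts, whose payload is a one-element list holding
    # the embedded message.
    if part.is_multipart():
        for sub in part.get_payload():
            yield from iter_parts(sub, depth + 1)


def build_demo_message():
    """Build a made-up message that carries another message as an attachment."""
    inner = EmailMessage()
    inner["Subject"] = "inner"
    inner.set_content("hello")
    inner.add_attachment(b"PK\x03\x04", maintype="application",
                         subtype="zip", filename="invoice.zip")

    outer = EmailMessage()
    outer["Subject"] = "outer"
    outer.set_content("see attached")
    outer.add_attachment(inner)  # attached as message/rfc822
    return outer.as_bytes()


msg = message_from_bytes(build_demo_message(), policy=policy.default)
for depth, ctype, fname in iter_parts(msg):
    print("  " * depth + ctype, fname or "")
```

The point of tracking the depth is that an attachment found inside a message/rfc822 part can be logged as such, instead of being counted as if it were a top-level attachment.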
Of attachments with MIME file names, the type breakdown is:
 1032  .doc
  623  .docx
  576  .html
  545  .htm
  308  .pdf
  253  .zip
  248  .xlsx
  170  .xls
  109  .jar
   90  .jpg
   83  .aspx
   61  .7z
   48  .ace
   26  .r11
   22  .xz
   20  .gz
   19  .tar
   15  .r00 .png
   14  .gif
   11  .rar
   10  .pdf.gz
    6  .iso .arj
    4  .pdf.z
    3  .txt .jpeg .chm
    1  .rtf .r01 .ppsx .lzh .bat
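A breakdown in this shape (one line per count, with extensions that share a count listed together) can be produced with a small tally. The sketch below is an illustration under assumptions, not the actual log processing: the input is taken to be a plain list of MIME file names, the `COMPRESSION_SUFFIXES` set is a guess, and `extension_of` and `grouped_report` are hypothetical names. It keeps two-part extensions such as `.pdf.gz` when the last suffix is a compression suffix.

```python
import os
from collections import Counter, defaultdict

# Assumed set of suffixes that usually wrap another file type.
COMPRESSION_SUFFIXES = {".gz", ".z", ".bz2", ".xz"}


def extension_of(filename):
    """Return the lowercased extension, keeping e.g. '.pdf.gz' as a unit."""
    base, ext = os.path.splitext(filename.lower())
    if ext in COMPRESSION_SUFFIXES:
        inner = os.path.splitext(base)[1]
        if inner:
            return inner + ext
    return ext


def grouped_report(filenames):
    """Return [(count, [extensions...])], highest count first."""
    counts = Counter(extension_of(name) for name in filenames)
    by_count = defaultdict(list)
    for ext, n in counts.items():
        by_count[n].append(ext or "(none)")
    return [(n, sorted(by_count[n])) for n in sorted(by_count, reverse=True)]


for n, exts in grouped_report(["a.doc", "b.DOC", "c.pdf.gz", "d.png", "e.r00"]):
    print(f"{n:5d}  {' '.join(exts)}")
```

One consequence of this particular choice is that a name like `backup.tar.gz` would be reported as `.tar.gz` instead of `.gz`; whether that is desirable depends on what you want the report to show.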
On the one hand, this is a broad assortment with a long tail. On the other hand, there are some very popular attachment types, especially Microsoft Word documents. I suspect that if I ground through our logs to cross-correlate them, I'd discover that a lot of these were seen as malware. Based on past discoveries, the .htm files are likely phish spam, perhaps with some malware.
(All of the .rars that we could successfully examine had in them and got rejected on that basis.)
Those .zip archives break down as containing:
  114  .exe
   86  .zip
   26  .jar
   25  .vbs
    1  ".lnk .txt"
    1  .com
We rejected all the ZIP archives with
The inner .zip files are:
   84  .doc
    1  .scr
    1  .js
It turns out that we rejected all of these. The files got rejected by a generic 'in-zip' case, which is perfectly happy to match nested zips as well as plain zips. The doubly nested .doc files have been rejected for some time.
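To make the idea of an 'in-zip' check that also sees nested zips concrete, here is a minimal sketch using Python's standard `zipfile` module. It is not our actual Exim-side implementation. The function name `zip_member_extensions` and the `max_depth` limit are assumptions of this example, and it reads each nested archive fully into memory, which real mail-scanning code would want to bound.

```python
import io
import os
import zipfile


def zip_member_extensions(data, max_depth=3):
    """Yield (depth, extension) for each file in a ZIP, descending into nested ZIPs."""

    def walk(blob, depth):
        with zipfile.ZipFile(io.BytesIO(blob)) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                ext = os.path.splitext(info.filename)[1].lower()
                yield depth, ext
                if ext == ".zip" and depth < max_depth:
                    try:
                        yield from walk(zf.read(info), depth + 1)
                    except (zipfile.BadZipFile, RuntimeError):
                        # Named .zip but unreadable or encrypted: report
                        # only the outer entry and move on.
                        pass

    yield from walk(data, 1)
```

A rule that rejects on, say, any `.exe` at any depth then becomes a matter of checking the yielded extensions, without caring whether a match came from a plain zip or a zip inside a zip.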
(It turns out that the few nested ZIPs in yesterday's entry that weren't from Zen-listed IPs
must have all been
PS: That I keep having to check what we're actually rejecting suggests that our attachment type rejection rules are now sufficiently complicated that I should actually write them down, instead of leaving them sort of implicitly documented in the Exim configuration and then trying to remember them (it turns out that I got at least one case wrong in yesterday's entry). Possibly this will cause us to regularize some of them. Probably we won't drop any.