The correlation between Spamhaus Zen listings and attachment types (March 2018 edition)
Our program to capture information about what sort of email attachments our users get logs not just the attachment information but also whether or not the sending IP address was listed in zen.spamhaus.org at the time. For reasons beyond the scope of this entry, today I want to look at the correlation between sending us attachments and being in Spamhaus Zen, and what attachment types are popular. Because it's the most convenient option, I'm going to use four weeks of recent logs.
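For readers who haven't dealt with DNS blocklists, here is a minimal sketch of what "listed in zen.spamhaus.org" means mechanically. It is not the code our logging program uses; the function names are made up for illustration. The general DNSBL convention is that you reverse the IPv4 octets, append the zone name, and do an A record lookup. An answer in 127.0.0.0/8 means the IP is listed and NXDOMAIN means it isn't. As far as I know, Spamhaus uses 127.255.255.x answers as error codes rather than listings, so the sketch excludes them.

```python
import ipaddress
import socket

ZEN_ZONE = "zen.spamhaus.org"


def zen_query_name(ip: str, zone: str = ZEN_ZONE) -> str:
    """Build the DNSBL query name for an IPv4 address (hypothetical helper)."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_zen_listed(ip: str) -> bool:
    """Return True if the DNSBL lookup yields a listing-style answer."""
    try:
        answer = socket.gethostbyname(zen_query_name(ip))
    except socket.gaierror:
        # NXDOMAIN or a resolver failure: treat as "not listed".
        return False
    # Assumption: 127.255.255.x answers are error codes, not listings.
    return answer.startswith("127.") and not answer.startswith("127.255.255.")
```

Calling `zen_query_name("192.0.2.10")` gives the name that would be looked up. Whether `is_zen_listed()` returns anything useful depends on your resolver, since Spamhaus does not answer queries that arrive through large public DNS resolvers.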
Over this time we logged 15,900 incoming messages with attachments (20,395 attachments total), although given my previous experience it's possible that some of these are repeated attempts from misbehaving senders that believe permanent SMTP rejections are just temporary. 3,890 of these messages (3,925 attachments total) came from IP addresses that were in Spamhaus Zen at the time, or about one quarter of the messages, which is neither huge nor insignificant. The most popular attachment types for Zen listed IPs to send us are as follows:
   1312  MIME file ext: .html    [89%]
    754  MIME file ext: .docx    [38%]
    516  MIME file ext: .doc     [59%]
    352  MIME file ext: .xlsx    [51%]
    120  MIME file ext: .xls     [54%]
     85  MIME file ext: .pdf     [ 1%]
     57  MIME file ext: .pdf.gz  [87%]
     42  MIME file ext: .ace     [28%]
(For simplicity I'm looking only at things with MIME file extensions. 705 attachments in total, 435 from Zen-listed IPs, did not have file extensions. Almost all of the Zen-listed ones were Microsoft Word documents, usually .docx.)
The percentages are against the total number of that attachment type we received. PDFs are by far our most popular attachment type in general, followed by .docx, .html (almost all from Zen listed IPs), JPGs, .doc, .png, and .xlsx.
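To make the percentage calculation concrete, here is a rough sketch of how this sort of tally can be produced. The record format (a file extension plus a flag for whether the sending IP was Zen-listed) and the sample data are invented for illustration; they are not our real log format or real numbers.

```python
from collections import Counter

# Hypothetical records: (MIME file extension, sender was Zen-listed).
records = [
    (".html", True), (".html", True), (".html", False),
    (".pdf", False), (".pdf", False), (".pdf", True), (".pdf", False),
    (".docx", True), (".docx", False),
]


def zen_share_by_type(records):
    """Return (zen_count, extension, percent_of_that_type) rows, largest first."""
    total = Counter(ext for ext, _ in records)
    zen = Counter(ext for ext, listed in records if listed)
    rows = [
        (zen[ext], ext, round(100 * zen[ext] / total[ext]))
        for ext in total
        if zen[ext]  # skip types that no Zen-listed IP sent
    ]
    # Sort by the number of attachments from Zen-listed IPs, descending.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


for count, ext, pct in zen_share_by_type(records):
    print(f"{count:7d}  MIME file ext: {ext:8s} [{pct:2d}%]")
```

The denominator for each percentage is the count of all attachments with that extension, not the count of all attachments from Zen-listed IPs. That is why a popular type like .pdf can have a decent raw count and still show a tiny percentage.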
The '.pdf.gz' attachments are actually all .exe files in disguise, which we reject. I'm not sure why malware tries this, but presumably it works on some people and some systems. The .html attachments are very likely to be what our commercial spam filtering system scores as 'Mal/Phish', because this is a pattern we see all the time. We reject all .ace attachments in general as they're all malware, so I find it interesting that only 28% of them come from Zen listed IPs; there seem to be a fair number of malware senders that aren't in Zen.
(This is something to bear in mind if you feel that Zen alone will do a great job of protecting you. I'm not saying it won't; it depends on what you're worried about and what the attack patterns are against you.)
With the exception of .html (and the special case of .pdf.gz), there's no attachment type that is clearly beloved of Zen-listed IPs, although they're over-represented in a number of them (where they're roughly half of those attachment types despite being only roughly a quarter of our attachment volume). This seems most likely to be due to relatively low usage by legitimate senders rather than high usage by Zen-listed IPs. In turn this is probably because the profile of our users tilts away from the use of Microsoft Office files.
Because of how we do server side spam filtering, some amount of Zen-listed IP addresses that would send us attachments don't make it this far, because they get entirely rejected at RCPT TO time. It's difficult to estimate how many such rejections we might have, so I'm not going to guess or try to throw raw numbers around.
PS: If I'm understanding the logs correctly, the number of Zen-listed IPs that sent us attachments is a drop in the bucket compared to the total number of Zen-listed IPs that got as far as submitting email. My log analysis suggests that there were roughly 77,500 such email submissions over the same time period, from 18,800 different IPs.
(See also the related attachment types we see in email from Zen-listed IP addresses, from last December. Some of the patterns have clearly shifted since then. For the absolute numbers, note that I did nine weeks of data then and I'm doing four weeks now.)
Sidebar: Some people are confused about MIME types
We got two messages with JPGs that were attached with the MIME type
'*/*', which is not how MIME types work. Sadly I suspect that
many mail clients will display the JPGs anyway, because that's how
the Internet works (and it's not necessarily a bad thing, and even
when it is it's hard to persuade people of that).
(Someone may have been thinking of browsers when they generated that MIME type, or they may just have been refusing to even try.)
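As a rough illustration of why '*/*' is not a real MIME type, here is a small sketch that checks a Content-Type value against the naming rules for registered media types in RFC 6838 (type and subtype names start with a letter or digit and contain only a restricted set of characters, which does not include '*'). The function name is made up, and this is a strict check of my own, not something our mail system does. Wildcards like '*/*' and 'image/*' are media ranges, which is what HTTP clients send in Accept headers.

```python
import re

# RFC 6838 restricted-name: first char ALPHA/DIGIT, then up to 126 more
# characters drawn from ALPHA / DIGIT / ! # $ & - ^ _ . +
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}"
_MEDIA_TYPE = re.compile(rf"{_NAME}/{_NAME}")


def is_concrete_media_type(value: str) -> bool:
    """True if value names a concrete type/subtype, ignoring any parameters."""
    # Drop parameters such as '; charset=utf-8' before checking.
    essence = value.split(";", 1)[0].strip()
    return _MEDIA_TYPE.fullmatch(essence) is not None
```

With this check, `is_concrete_media_type("image/jpeg")` is true while both `"*/*"` and `"image/*"` are rejected. A mail client that wants to display such an attachment anyway has to fall back on the file extension or on sniffing the content.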