Attachment types that we see in email from Zen-listed IP addresses

December 19, 2017

As part of yesterday's entry, I broke down what percentage of each sort of attachment we received came from IPs listed in zen.spamhaus.org. Today I'm going to invert the question and ask what attachment types we get sent from Zen-listed IPs. As before, I'm going to be using a bit over nine weeks of logs, because our weekly log rotation makes it easy to do that.

(Because our attachment type logging comes after our RCPT TO time rejections, this is all based on email to people who don't reject all email from Zen-listed IPs.)

First up we have a collection of attachment types without MIME file names (and thus without MIME file extensions). For these I have to rely on the declared MIME types and sniffed file type information, and they break down like this:

   102 [Word XML]
    13 application/msword
    11 image/jpeg
    10 message/rfc822
    10 application/xml [either Word or Excel]
     6 [Excel XML]
     1 image/gif

Possibly this means that I should recurse inside message/rfc822 MIME parts. Some of these were file attachments; others I believe were the sole component of the email message.
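
As a rough illustration of what recursing inside message/rfc822 parts could look like, here is a minimal sketch using Python's standard email package. This is not how our Exim-based logging works; the function name leaf_parts and the depth counter are made up for the example.

```python
from email import message_from_bytes, policy


def leaf_parts(part, depth=0):
    """Yield (depth, content type, filename) for every non-container part.

    depth counts how many message/rfc822 wrappers the part sits inside.
    """
    if part.get_content_type() == "message/rfc822" and part.is_multipart():
        # A parsed message/rfc822 part holds the embedded message as a
        # one-element list payload; descend into it.
        for inner in part.get_payload():
            yield from leaf_parts(inner, depth + 1)
    elif part.is_multipart():
        for sub in part.get_payload():
            yield from leaf_parts(sub, depth)
    else:
        yield depth, part.get_content_type(), part.get_filename()


def attachment_types(raw_message: bytes):
    """Parse a raw message and list its leaf parts, including nested ones."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    return list(leaf_parts(msg))
```

With something like this, a .doc that is hiding inside a forwarded message would show up with its own MIME type and file name at depth 1, instead of being counted only as an opaque message/rfc822 part.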

Of attachments with MIME file names, the type breakdown is:

  1032 .doc
   623 .docx
   576 .html
   545 .htm
   308 .pdf
   253 .zip
   248 .xlsx
   170 .xls
   109 .jar
    90 .jpg
    83 .aspx
    61 .7z
    48 .ace
    26 .r11
    22 .xz
    20 .gz
    19 .tar
    15 .r00 .png
    14 .gif
    11 .rar
    10 .pdf.gz
     6 .iso .arj
     4 .pdf.z
     3 .txt .jpeg .chm
     1 .rtf .r01 .ppsx .lzh .bat
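
For what it's worth, a tally like this can be produced with very little code. The following is only a sketch under assumptions: it takes a plain list of MIME file names (our real log format is not shown here), and it guesses at a rule for compound extensions such as .pdf.gz by keeping the previous suffix whenever the last one is in a hypothetical set of compression suffixes.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumption for this sketch: suffixes that get reported together with the
# suffix in front of them (so 'x.pdf.gz' counts as '.pdf.gz', not '.gz').
COMPRESSION_SUFFIXES = {".gz", ".z", ".xz", ".bz2"}


def attachment_extension(filename):
    """Return the lowercased extension used for counting, or '' if none."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    if not suffixes:
        return ""
    if len(suffixes) >= 2 and suffixes[-1] in COMPRESSION_SUFFIXES:
        return "".join(suffixes[-2:])
    return suffixes[-1]


def tally(filenames):
    """Count attachment file names by extension."""
    return Counter(attachment_extension(name) for name in filenames)
```

Calling tally(...).most_common() then gives the counts in descending order, which is the ordering used in the table above.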

On the one hand, this is a broad assortment with a long tail. On the other hand, there are some very popular attachment types, especially Microsoft Word documents. I suspect that if I ground through our logs to cross-correlate them, I'd discover that a lot of these were seen as malware. Based on past discoveries, the .html and .htm are likely phish spam, perhaps with some malware mixed in.

(All of the .rars that we could successfully examine had .exes in them and got rejected on that basis.)

Those .zip archives break down as containing:

   114 .exe
    86 .zip
    26 .jar
    25 .vbs
     1 ".lnk .txt"
     1 .com

We rejected all the ZIP archives with .exe or .vbs payloads.

The inner .zip files contain:

    84 .doc
     1 .scr
     1 .js

It turns out that we rejected all of these. The .scr and .js files got rejected by a generic 'in-zip' case, which is perfectly happy to match nested zips as well as plain zips. The doubly nested .doc files have been rejected for some time.
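
To illustrate the general idea of a check that treats nested zips the same as plain zips, here is a minimal Python sketch. It is not our actual Exim rule set; the BLOCKED set, the depth limit, the size cap, and the function names are all assumptions made up for the example.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Illustrative only; not the real list of extensions we reject.
BLOCKED = {".exe", ".vbs", ".scr", ".js", ".com"}
MAX_INNER_SIZE = 10 * 1024 * 1024  # assumed cap on nested zips we unpack


def zip_member_extensions(data, max_depth=2, _depth=0):
    """List (depth, extension) for files in a zip, descending into inner zips."""
    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            ext = PurePosixPath(info.filename).suffix.lower()
            found.append((_depth, ext))
            nested = ext == ".zip" and _depth < max_depth
            if nested and info.file_size <= MAX_INNER_SIZE:
                try:
                    inner = zf.read(info)
                    found.extend(
                        zip_member_extensions(inner, max_depth, _depth + 1)
                    )
                except (zipfile.BadZipFile, RuntimeError):
                    # Corrupt or encrypted inner archive: leave it unexamined.
                    pass
    return found


def has_blocked_member(data):
    """True if any file at any examined depth has a blocked extension."""
    return any(ext in BLOCKED for _, ext in zip_member_extensions(data))
```

Because the same function handles both the outer archive and any inner ones, a .scr or .js file is caught whether it sits directly in the zip or one level down, which is the behaviour described above.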

(It turns out that the few nested ZIPs in yesterday's entry that weren't from Zen-listed IPs must have all been .doc files.)

PS: That I keep having to check what we're actually rejecting suggests that our attachment type rejection rules are now complicated enough that I should actually write them down. Right now they're only sort of implicitly documented in the Exim configuration, and I then have to try to remember them (it turns out that I got at least one case wrong in yesterday's entry). Possibly writing them down will cause us to regularize some of them. Probably we won't drop any.

Written on 19 December 2017.

Last modified: Tue Dec 19 00:52:18 2017