Wandering Thoughts archives

2019-03-21

What sorts of good email attachments our users get (March 2019 edition)

Yesterday I looked at the types of attachments we see in malware email. Of course if we're considering blocking some of them, it's not enough to consider just what types we see in malware; we also care about what types we see in legitimate email (or at least in email that is as close to legitimate as we can manage). I did some stats for this a year ago, in the April 2018 edition, but this time around I'm going to be doing the stats slightly differently since I want to compare relatively directly to yesterday's data. Like yesterday, this is over the previous ten weeks, but a slightly different ten weeks (the relevant systems roll their weekly logs at different times).

Over the past ten weeks, we had 54,076 file attachments in 39,607 email messages that were not from DNSBL-listed sources, not identified as spam or virus-laden, and not rejected for other reasons. This is about ten times as many as we had malware attachments, which is either good or bad depending on your perspective. 98.5% of them had MIME filename information, and out of those the most popular file extensions were:

 30462  .pdf
  4210  .jpg
  3688  .docx
  1939  .png
  1773  .ics
  1339  .xlsx
  1009  .txt
   725  .html
   682  .doc
   640  .zip
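A tally like the one above can be sketched in a few lines of Python. This is only an illustration, not the tooling actually used here; the function name `extension_counts` and the sample filenames are hypothetical, and it assumes the MIME filenames have already been extracted from the logs.

```python
from collections import Counter
import os

def extension_counts(filenames):
    """Count lowercased file extensions across MIME attachment filenames."""
    counts = Counter()
    for name in filenames:
        # os.path.splitext returns (root, ext); ext is '' when there is none.
        ext = os.path.splitext(name)[1].lower()
        if ext:
            counts[ext] += 1
    return counts

# Hypothetical sample data standing in for logged MIME filenames.
sample = ["report.pdf", "Photo.JPG", "notes.txt", "paper.pdf", "README"]
counts = extension_counts(sample)

# Print in the same "count  extension" layout as the tables in this entry.
for ext, n in counts.most_common(10):
    print(f"{n:6d}  {ext}")
```

Lowercasing the extension folds `.JPG` and `.jpg` together, which is probably what you want for this sort of report.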

If I reprocess the data to count how many messages had any particular type of file attachment, the data breaks down this way:

 23789  .pdf
  3177  .docx
  3075  .jpg
  1757  .ics
  1221  .png
  1172  .xlsx
   744  .txt
   690  .html
   629  .asc
   602  .zip
   595  .doc

It is probably not surprising that the image formats drop in this re-ranking, since it's likely common to attach several images to a single message. To my surprise, a number of messages had multiple .zip file attachments, which is why the .zip numbers drop. Multiple .doc and .docx attachments are relatively common.
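The difference between the two tables comes down to counting each extension once per message instead of once per attachment. A minimal sketch of that per-message count, assuming each message is represented as a list of its attachment filenames (the name `messages_per_extension` and the sample data are hypothetical):

```python
from collections import Counter
import os

def messages_per_extension(messages):
    """Count how many messages carry at least one attachment of each type."""
    counts = Counter()
    for filenames in messages:
        # A set collapses repeats, so three .jpg files in one message
        # contribute only one to the .jpg total.
        exts = {os.path.splitext(name)[1].lower() for name in filenames}
        exts.discard("")
        counts.update(exts)
    return counts

# Hypothetical sample: one message with several images, two with PDFs.
sample_messages = [
    ["a.jpg", "b.jpg", "c.jpg"],
    ["paper.pdf"],
    ["slides.pdf", "photo.jpg"],
]
per_message = messages_per_extension(sample_messages)
for ext, n in per_message.most_common():
    print(f"{n:6d}  {ext}")
```

In this sample there are four .jpg attachments but only two messages with a .jpg, which is the same effect that pushes the image formats down in the re-ranking.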

(In the 'things that make me raise my eyebrows now that I'm looking at them' category, there was one message with 24 .wmz attachments. It came from a 'marketing@<domain>' address, so maybe it was genuine and just, well, marketing.)

Basically all of these file types are unsurprising in our environment (academic computer science). All of the .asc files are PGP stuff (and have appropriate MIME types); I'm a bit surprised that we see so much of it in our email, but then some of this email is things like PGP-signed update notifications from Ubuntu and other sources. Use of .p7s is not too far below the use of .asc, at 588 attachments. I am a bit surprised to see so many .html attachments, but perhaps some of that is mail sending programs improperly marking HTML parts as attachments instead of inline content.

Nothing particularly stands out about the contents of .zip files and ZIP archives in general, so I'm going to skip any extensive analysis or discussion of them.

At this point it's useful to cross-compare some suspicious file types from yesterday that haven't already been mentioned to see how many legitimate versions of them we see:

   444  .xls
    18  .rar
     1  .iso
     1  .docm

We clearly can't reject .xls file attachments, but it seems likely we could reject .docm and .iso attachments. I was going to say that we could probably reject .rar file attachments as well, but then I took a second look at our data. We could read the RAR file list for all but four of those .rar attachments, and all of the file types in them look legitimate; on closer inspection (eg of source and destination information), even the remaining four look good. It looks like some people just prefer RAR to ZIP, which I can't blame them for.

(The good news version of this finding is that our commercial anti-spam system is apparently very good at finding bad stuff in .rars, since no bad ones seem to have slipped past it.)
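To put the cross-comparison in one place, here is a small sketch that lines up the malware counts from yesterday's entry against the legitimate counts above and computes what fraction of each type was malware. The counts are copied from the two tables; the two ten-week windows differ slightly, so treat the ratios as rough. The function name `malware_share` is just for illustration.

```python
# Counts copied from the tables in the two entries (named attachments only).
malware = {".xls": 36, ".rar": 246, ".iso": 245, ".docm": 60}
legit = {".xls": 444, ".rar": 18, ".iso": 1, ".docm": 1}

def malware_share(ext):
    """Fraction of all seen attachments of this type that were malware."""
    bad, good = malware[ext], legit[ext]
    return bad / (bad + good)

# List the types from most to least malware-dominated.
for ext in sorted(malware, key=malware_share, reverse=True):
    print(f"{ext:6s} malware={malware[ext]:4d} legit={legit[ext]:4d} "
          f"share={malware_share(ext):.1%}")
```

The ratio alone does not settle the question. By this measure .rar is overwhelmingly malware, yet the 18 legitimate .rar attachments all look like real email that people wanted, which is why the raw legitimate counts matter as much as the share.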

GoodAttachmentTypes-2019-03 written at 20:25:48

2019-03-20

The types of attachments we see in malware email (March 2019 edition)

Back in mid 2017 I wrote about the types of attachments we saw then in malware-laden email. Today, for reasons beyond the scope of this entry, I feel like looking at our current numbers on this, based on the previous ten weeks of activity. This does not include the slowly but steadily growing collection of attachment types we reject immediately, but it does include 'malware' that is a phish spam in an actual attachment, because that's what our commercial anti-spam system does. As we will see, this is actually a large category of what we detect as 'malware'.

Over 99% of the detected malware attachments had MIME filenames. Out of the 5622 attachments with filenames, the most common file extensions were:

  3008  .html
  1134  .doc
   536  .xlsx
   246  .rar
   245  .iso
    60  .docm
    58  .txt
    57  .docx
    44  .zip
    36  .xls

More than half of these attachments were in messages detected as phish (more or less 55%, as it turns out). However, not all of the phish spam used .html attachments, or at least not directly; instead, it breaks down like this:

  3008 MIME file ext: .html
    58 MIME file ext: .txt
    23 MIME file ext: .zip
     6 MIME file ext: .jpg
     3 MIME file ext: .png
     1 MIME file ext: .htm

All of those .zip attachments actually contain a single .html file. We've seen this sort of single-file ZIP smuggling before (in two earlier entries) and now reject it outright for certain file types. We probably don't want to extend that to .html files, but it's slightly tempting.
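For illustration, a check for this single-file pattern could be sketched with Python's standard `zipfile` module. This is an assumption-laden sketch, not how our mail system actually does it; the function name `single_file_zip_ext` is hypothetical, and it assumes the attachment body is already decoded and in memory as bytes.

```python
import io
import posixpath
import zipfile

def single_file_zip_ext(data):
    """Return the lowercased extension of the only file in a ZIP archive.

    Returns None if the data is not a readable ZIP, if the archive does not
    contain exactly one file, or if that file has no extension.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            # Ignore directory entries; only real files count.
            names = [i.filename for i in zf.infolist() if not i.is_dir()]
    except zipfile.BadZipFile:
        return None
    if len(names) != 1:
        return None
    # ZIP member names use forward slashes, so posixpath is the right splitter.
    ext = posixpath.splitext(names[0])[1].lower()
    return ext or None

# Build a tiny in-memory ZIP holding one .html file to exercise the check.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("invoice.html", "<html></html>")
print(single_file_zip_ext(buf.getvalue()))
```

A policy could then compare the returned extension against a list of types to reject. Whether .html belongs on that list is exactly the judgment call described above.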

Out of all of the various things that detect as ZIP archives (which is a lot more than .zip file attachments), there is no particularly dominating set of contents. We do see a certain number of ZIP archives that contain just a single .jar or a .jar plus a .txt, but the absolute numbers are too low to consider a 'reject on sight' policy for them (especially as our users may actually want to get .jars every so often).

My overall conclusion from this is that we don't really have any additional smoking gun file attachment types that we could argue for automatically rejecting on sight. We could make the argument for .rar and .iso, but they are each only 4% or so of the attachments in general. Anyway, this is only half the story; to really answer this question, we need to look at what sort of legitimate attachments our users get, and that's another entry.
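The percentages here are simple arithmetic on the top-ten table, which can be checked with a short sketch. The counts and the 5622 total are taken from this entry; the variable names are only illustrative.

```python
# Top-ten extension counts and the total of named malware attachments,
# both copied from this entry.
top_ten = {
    ".html": 3008, ".doc": 1134, ".xlsx": 536, ".rar": 246, ".iso": 245,
    ".docm": 60, ".txt": 58, ".docx": 57, ".zip": 44, ".xls": 36,
}
total_named = 5622

# Share of all named malware attachments for each extension.
shares = {ext: n / total_named for ext, n in top_ten.items()}
for ext, share in shares.items():
    print(f"{ext:6s} {share:6.1%}")

# .rar and .iso taken together.
combined = shares[".rar"] + shares[".iso"]
print(f"rar+iso {combined:.1%}")
```

On these numbers .rar and .iso each come out a bit over 4%, and a bit under 9% taken together, so rejecting both would only remove a modest slice of what gets detected.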

(Some but not very many messages detected with malware had multiple attachments. I'm not currently interested enough to do a breakdown of what types those messages use. For our purposes, any 'bad' file type that's commonly seen in malware-laden email is suspect regardless of whether or not it actually contained the malware.)

MalwareAttachmentTypes-2019-03 written at 19:47:06


