On classifying phish spam as malware, an update

October 24, 2016

A number of years ago I noted that our commercial anti-spam filter was counting some varieties of phish spam as 'viruses', and I wrote some thoughts on why this might make sense. I now think that I was partly wrong about why the filter was acting this way. Since then, we've started logging MIME attachment type information for incoming messages, which has given me the opportunity to see the structure of many of these messages.

Here is a typical rejection entry from our logs, with the information the anti-spam filter gave us:

rejected 1byktC-00047A-8j from 115.79.140.79/dofollow.backlinks@mg-dot.cn to <redacted>: identified virus: Mal/Phish-A

And here is the MIME attachment type information for the same message:

1byktC-00047A-8j attachment application/octet-stream; MIME file ext: .html

That's right: as far as I can tell, all of the phish spam being rejected this way has had a .html attachment. This sample was in a MIME multipart/mixed structure; the other part or parts of the structure were something we consider uninteresting and didn't log.
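As a rough illustration of how the two kinds of log lines can be matched up by message ID, here is a minimal Python sketch. The regular expressions are inferred only from the two sample lines above, so they are assumptions about the format and may well miss variations in the real logs; the function name `correlate` is made up for this example.

```python
import re
from collections import defaultdict

# Patterns inferred from the two sample log lines in this entry; the real
# log format may have variations that these do not cover.
REJECT_RE = re.compile(
    r"^rejected (?P<msgid>\S+) from (?P<ip>[^/\s]+)/(?P<sender>\S*) "
    r"to (?P<rcpt>.*?): identified virus: (?P<label>\S+)$"
)
ATTACH_RE = re.compile(
    r"^(?P<msgid>\S+) attachment (?P<ctype>[^;]+); MIME file ext: (?P<ext>\S+)$"
)


def correlate(lines):
    """Map each rejected message ID to (virus label, [(content type, ext), ...])."""
    labels = {}
    attachments = defaultdict(list)
    for line in lines:
        line = line.strip()
        m = REJECT_RE.match(line)
        if m:
            labels[m["msgid"]] = m["label"]
            continue
        m = ATTACH_RE.match(line)
        if m:
            attachments[m["msgid"]].append((m["ctype"], m["ext"]))
    # Only report messages that were actually rejected as a 'virus'.
    return {mid: (label, attachments.get(mid, [])) for mid, label in labels.items()}


sample = [
    "rejected 1byktC-00047A-8j from 115.79.140.79/dofollow.backlinks@mg-dot.cn "
    "to <redacted>: identified virus: Mal/Phish-A",
    "1byktC-00047A-8j attachment application/octet-stream; MIME file ext: .html",
]
result = correlate(sample)
```

Run over a full log, something like this would make it easy to tally which attachment extensions show up for each virus label.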

To me, this puts a somewhat different spin on our commercial anti-spam filter detecting phish spam. The entire purpose of its virus detecting side of things is to look at attachments and detect bad stuff (and then strip it out). Should it pass up detecting phish stuff in attachments, just because that's a different sort of bad stuff than it normally looks for?

(Since you can embed other sorts of malware in .html attachments (and people do), the virus detecting side already has to look at such attachments.)

There's still a conscious choice here to include phish as part of the 'malware' that the anti-virus detection looks for, but I think it's more natural to do this for attachments that the software is already scanning for other things. It's less of a special case both for detection and, presumably, for stripping out these attachments the way it does other virus-contaminated attachments.

PS: Sophos's detailed information page on this label does specifically mention that these web pages are often sent as (spam) attachments.


Comments on this page:

By Jukka at 2016-10-27 04:10:42:

But does it matter? I mean whether it is HTML with some usual web stuff or a traditional virus, you want it blocked, right?

This said, tracking MIME identifications sounds like a decent idea for gaining some insights. But then again, have there been vulnerabilities in file(1) etc.?

By cks at 2016-11-01 17:05:17:

Belatedly: it matters a bit for us for local reasons, and because of what our commercial anti-spam filter does with things that it specifically identifies as viruses. Users get to opt out of spam filtering but not out of virus filtering and rejection, and the spam filter always strips out things that it thinks are viruses. So if it ever winds up misclassifying something as a 'virus', that email is never getting through intact, whereas an email that's misclassified as spam is theoretically recoverable.

(In practice it's often not, since many users throw away messages scored as spam.)

As for risks of identifying MIME attachments, for the most part we're simply using the information that's already in the MIME headers instead of trying to parse the actual file contents. It's certainly possible that someone might find a way to target us through the program we're using here, but my personal view is that our commercial anti-spam filter actually presents bigger risks since it's far more widely used (and we know it has exciting issues, as apparently do many commercial anti-virus systems).
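To illustrate what 'using the information that's already in the MIME headers' can look like, here is a minimal sketch using Python's standard email package. This is not the program we actually use; the function name `attachment_info` and the sample message are made up for the example, and skipping parts that have no filename is an assumption about what counts as 'uninteresting'. It reads only the declared Content-Type and filename and never inspects the attachment contents.

```python
import os
from email import message_from_string


def attachment_info(raw_message):
    """Return (declared content type, filename extension) for each named part.

    Only the MIME headers are consulted; the part bodies are not examined.
    """
    msg = message_from_string(raw_message)
    info = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename is None:
            # Assumption: parts without a filename are not worth logging.
            continue
        ext = os.path.splitext(filename)[1].lower()
        info.append((part.get_content_type(), ext))
    return info


# A made-up message with the same shape as the sample discussed above.
raw = (
    "From: sender@example.org\n"
    "To: rcpt@example.org\n"
    "Subject: sample\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Please see the attached form.\n"
    "--XYZ\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="form.html"\n'
    "\n"
    "<html></html>\n"
    "--XYZ--\n"
)
parts = attachment_info(raw)
```

The tradeoff is that the sender controls these headers, so what gets logged is what the message claims to be, not what it actually contains.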

Written on 24 October 2016.

Last modified: Mon Oct 24 21:43:27 2016