2016-07-27
When 'simple' DNS blocklists work well for you
I've written about how we can divide DNS blocklists into 'simple' and 'complex' ones, where simple DNSBLs basically list things based on them sending spam or other bad stuff without trying to do more complex things like assess how much legitimate traffic also comes from the source. To put it one way, if a DNSBL lists one of GMail's outgoing SMTP servers because it sent some spam, it's almost certainly a simple one. I also said that rejecting email based on a simple DNSBL isn't necessarily a mistake, so it's time to explain that.
Suppose that you have a mail system that generally receives a low volume of legitimate email; for example, you might be operating a personal email server. Suppose that you also start getting spam. Spammers almost never go away, so your spam volume is very likely to trend up over time and reach a point where most of your incoming email is spam. In this environment, a listing in a simple DNSBL is a fairly strong confirmation signal that this new email is really spam. It's much more likely that you're getting spam email from an IP that's been detected as spamming than that an innocent person has chosen to send you legitimate email from an IP that also sent spam and got listed in the DNSBL. The latter could happen, but the odds are low.
We've sort of seen this before. If the legitimate email rate is low and the DNSBL's 'false positive' rate on it is also low, the odds that a positive signal from the DNSBL means that an email is spam are very high. You can make the odds even higher by whitelisting known good sources.
(Of course anti-spam precautions aren't evaluated purely on percentages; the absolute number of legitimate messages blocked matters. Here the low volume helps, as there just aren't that many legitimate emails to get blocked.)
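To make the base-rate argument concrete, here is a minimal sketch using Bayes' rule. All of the numbers are made up for illustration (they are not measurements from any real mail system or DNSBL), and the function and parameter names are my own.

```python
def spam_posterior(spam_fraction, hit_rate_spam, hit_rate_legit):
    """Return P(email is spam | its source IP is listed in the DNSBL).

    spam_fraction:  fraction of all incoming email that is spam
    hit_rate_spam:  fraction of spam that comes from listed IPs
    hit_rate_legit: fraction of legitimate email that comes from listed IPs
                    (the DNSBL's 'false positive' rate on your real email)
    """
    listed_spam = spam_fraction * hit_rate_spam
    listed_legit = (1.0 - spam_fraction) * hit_rate_legit
    return listed_spam / (listed_spam + listed_legit)


# Hypothetical figures: the DNSBL catches half of all spam and
# lists the source of 1% of legitimate email.
mostly_spam = spam_posterior(0.90, 0.50, 0.01)   # 90% of incoming email is spam
mostly_legit = spam_posterior(0.10, 0.50, 0.01)  # only 10% of incoming email is spam

print(f"mostly-spam stream:  {mostly_spam:.3f}")
print(f"mostly-legit stream: {mostly_legit:.3f}")
```

With these assumed rates, a listing is right about 99.8% of the time on the mostly-spam stream but only about 85% of the time on the mostly-legitimate one. The same DNSBL looks much better or worse depending purely on what your email mix is.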
Similar logic can be applied to a lot of anti-spam heuristics; many
things look good when they're dealing with a stream of email that's
mostly or almost entirely spam. Block on bad EHLO greetings? Sure,
why not, especially since GMail and the other big people do generally
get those things right.
(GMail will send you spam too, of course, but statistically a new legitimate sender is much more likely to be using GMail or one of the other big places than an email server in the middle of nowhere. And yes, there are downsides to too many people adopting this sort of attitude to both heuristics and new mail sending machines in surprising places; ask anyone trying to send personal email from a new small home mail server and get it accepted by places.)
2016-07-06
It turns out that viruses do try to conceal their ZIP files
One of the interesting things that happens when you start to log information about what types of files your users get in email is that you get to discover certain sorts of questionable things that people actually do ('people' in a loose sense). Here's one interesting MIME part, extracted from our logs:
attachment application/octet-stream; MIME file ext: .jpeg; zip exts: .js
The 'attachment' bit is the Content-Disposition and the nominal
MIME type comes from the Content-Type. The MIME filename (which
came either from Content-Type or Content-Disposition) had a .jpeg
extension; however, our logging program found that the attachment
actually was a ZIP file with a single .js file inside it, not a
JPG image. Our anti-spam software later
identified it as malware.
(I didn't set out to write an attachment type logging program that
did content sniffing, but the Python zipfile module has a very
convenient function for it and
it's much simpler to structure the code that way instead of trying
to maintain a list of file extensions and/or Content-Types that
correspond to ZIP files.)
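As a rough sketch of what content sniffing like this can look like (this is not our actual logging program), the following assumes the convenient function is zipfile.is_zipfile() and imitates the log line format shown above. The helper names zip_extensions and describe_part, and the demo file names, are hypothetical.

```python
import io
import os
import zipfile
from email.message import EmailMessage


def zip_extensions(payload):
    """Return the sorted file extensions inside payload if it is a ZIP
    archive, or None if it is not one (regardless of its claimed type)."""
    buf = io.BytesIO(payload)
    if not zipfile.is_zipfile(buf):
        return None
    try:
        with zipfile.ZipFile(buf) as zf:
            names = zf.namelist()
    except zipfile.BadZipFile:
        # Looked like a ZIP but could not actually be read as one.
        return None
    return sorted({os.path.splitext(n)[1].lower() for n in names})


def describe_part(part):
    """Build a log line for one MIME part, sniffing the content for ZIP."""
    disposition = part.get_content_disposition() or "inline"
    ctype = part.get_content_type()
    ext = os.path.splitext(part.get_filename() or "")[1].lower()
    payload = part.get_payload(decode=True) or b""
    line = f"{disposition} {ctype}; MIME file ext: {ext}"
    inner = zip_extensions(payload)
    if inner is not None:
        line += "; zip exts: " + " ".join(inner)
    return line


# Demo: a ZIP archive holding one .js file, mislabeled as a .jpeg attachment.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("invoice.js", "/* placeholder content */")

msg = EmailMessage()
msg.set_content("See attached picture.")
msg.add_attachment(zip_buf.getvalue(), maintype="application",
                   subtype="octet-stream", filename="photo.jpeg")

log_lines = [describe_part(p) for p in msg.iter_attachments()]
print(log_lines[0])
```

The point of structuring it this way is that the decision about whether something is a ZIP archive never looks at the file name or the Content-Type at all, so a lying extension makes no difference.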
I vaguely knew that any number of file formats were actually ZIP
files under the hood; there's .jar files, for example, and a
number of the modern '* office' suites and programs use ZIP as their
underlying format. Our file type logging program has peered inside
any number of those sorts of attachments (as well as inside regular
.zip attachments). I also knew that it was theoretically possible
for bad actors to try to smuggle ZIP files through as some other
file type. But I didn't expect to see it, especially so fast.
(To be fair, most malware does seem to stick to .zip files,
not infrequently even with real MIME Content-Types. I suspect
that malware wants to make it easy for people to open up the
bad stuff that it sends them.)
PS: Hopefully no real content filtering software is fooled by this sort of transparent ruse. It's not as if ZIP archives are hard to detect. Sadly, the fact that (some) malware does this kind of thing makes me suspect that some important software actually is defeated by it.
PPS: All of the cases seem to be from the same malware run, based on how they all happened today and have various other indicators in common.