What email messages to not send autoreplies to (late 2018 edition)
Our mail system is very old. Much of the current implementation
dates back about ten years, when we moved it to be based on Exim,
but the features and in some cases the programs involved go back
much further than that. One part of it is that we have a local
version of the venerable Unix
vacation program, and this local
version goes back a very long time (some comments say it is the
version, which would date it to 1990). By now our version is ancient
and creaky, and in general we're no longer enthused about maintaining
locally hacked versions of software, so we need to move to using
the standard Ubuntu version.
Unfortunately, our local version has some differences from the
standard one; it supports an additional command line option that's
used by an unknown number of people, and we long since made it not
autoreply to some additional things over what the standard
already ignored. To deal with both problems we're using the standard
computer science solution of adding another layer of indirection, in
the form of a cover script. One of the jobs of this cover script is
knowing what not to autoreply to (beyond extremely obvious things
like messages that we detect as spam).
When I started out writing the cover script, I thought this would be simple. This is not the case, as what not to autoreply to has gotten a little bit more complicated since 1990 or so; for instance, there is now an actual RFC for this, RFC 3834. Based on Internet searches and this very helpful Superuser answer, the current list appears start with:
Precedence:header value of 'bulk', 'list', or 'junk'; this is the old standard way.
Auto-submitted:header value of anything but 'no', which is the RFC 3834 standard way. In practice, this is effectively 'if there is an
Auto-submittedheader'; I searched through a multi-year collection of email and couldn't find anything that used it with a 'no' value.
X-Auto-Response-Suppress:header with effectively any value, although Microsoft's official documentation says that a value of 'none' means that you can auto-reply. In practice that multi-year collection of email contains no cases with the 'none' value.
(Energetic people can look for only 'All' or 'OOF', but matching this is annoying and, again, my mail collection shows no hits for anything without one or the other of those.)
- Any of the various headers that indicate a mailing list message,
List-Unsubscribe:. In a sane world you would only need to look for one of them, but this is not a sane world (especially once spammers get involved); I have seen at least one message with only a
- A null (envelope) sender address, although of course any autoreplies to that aren't going to get very far. Generally you'll want to not autoreply to postmaster@ or mailer-daemon@, although it's not clear how much stuff gets sent out with such envelope senders.
In theory you could stop here and be nominally correct, more or less. In practice it seems clear that you want to do some additional matching on the sender address, to not auto-reply to at least:
- Definitely various variations on 'noreply' and 'donotreply' sender
addresses. You might think that people sending emails with these
sender addresses would tag them in various ways to avoid auto-replies,
but it is not so; for example, just yesterday Flickr sent me a
notification email about some important upcoming changes that came from
'firstname.lastname@example.org' and had none of those 'please do not
reply' header markers.
- Probably anything that appears to be an address that exists to collect
bounces, especially tagged sender addresses.
There are a bunch of patterns for these, where they start with
'bounce-' or 'bounce.' or 'bounce+' or 'bounces+', or come from
a domain that is 'bounce.<something>' or 'bounces.<something>'.
Just to be different, Google uses '@<something>.bounces.google.com'.
Some of these 'bounces' addresses are also tagged with various 'do not autoreply' headers, but not all of them. Since tagged bounce addresses are always unique, they'll generally always bypass
vacation's attempts to only send an autoreply notification every so often, which is one reason I think one should suppress autoreplies to them.
- Perhaps all detectable tagged sender addresses, especially repeated sources of them. The one that we've already seen in our logs is AmazonSES ones, some of which don't have any 'don't autoreply' headers. Perhaps there are some AmazonSES senders who should get vacation autoreplies, but I suspect that there are not that many.
(I'm sure that there are some senders who would like to get vacation autoreplies so they know that their email is sort of getting through. It's less clear that our users want those senders to know that, given some of the uses of AmazonSES.)
Possibly you also want to not autoreply to sender addresses with various generic local parts, such as 'root', 'www-data', 'apache', and so on. Perhaps you also want to include 'info', but that feels more potentially questionable; there might actually be a human who reads replies to that and cares about out of office things and so on.
(In general my view is that it's only useful to send autoreplies to actual people, and in some cases sending autoreplies to non-people addresses is at least potentially harmful. If we can establish fairly confidently that a given sender address is not a person, not sending vacation and out of office and so on autoreplies to it is harmless and perhaps beneficial. At the same time it's important not to be too aggressive, because our users do count on their autoreplies reliably telling people about their status.)
PS: In an extremely cautious world, you would not autoreply to anything that hadn't passed either strict SPF checks or strict DMARC policies. You can use DKIM too, but I think only if you carefully check that you're verifying a DKIM signature for the sender domain, because only then have you verified attribution to the domain. I rather expect that this is too strict to make users happy today, because it would exclude too many real people that send them email and so should get their autoreply messages.
Sidebar: My guess about non-human email that lacks these markers
One might wonder why email notifications and other similar large scale messages don't have some version of 'please do not autoreply' tags. My suspicion is that people have found that email without such tags is more likely to appear in people's inboxes on large providers like GMail and so on, while email with those tags is more likely to get dumped into a less frequently examined location.
If you're someone like Flickr (well, SmugMug, who bought Flickr) and really do have an important message that many Flickr members need to read, this leaves you with an unfortunate dilemma. On the whole I can't blame SmugMug for making the email choice that they did; with data at future risk, it is better to err on the side of getting more autoreplies than having people not see your message.
(In this view, the 'donotreply' email sender address is mostly there in the hopes that actual people will not hit 'reply' and send email back, email that will not have the desired effect.)
DKIM provides sender attribution (for both spam and not necessarily spam)
The presence of a valid DKIM signature on incoming email doesn't mean anything much about whether or not it's spam, or even if it comes from dedicated spam senders. Spammers can and do add proper DKIM signatures to their messages, and many legitimate senders don't use DKIM or don't have valid DKIM signatures, as our recent DKIM stats demonstrate. For that matter, some spam comes from legitimate places which DKIM sign all of their outgoing email (such as GMail). However, it has recently struck me that what a valid DKIM signature does provide is attribution.
If we receive a piece of email with a valid DKIM signature, the DKIM signature means that we can confidently attribute it to the signing domain. Either it was really sent by that domain or that domain has lost control over either or both of their DNS and their DKIM signing keys, and one of these is far more likely than the other. With a valid DKIM signature, all arguments related to the real sender and backscatter and so on are swept away; it was real email from the sending domain, period. In fact the sending domain went out of its way to make their email attributable to them.
This doesn't mean that the sending domain will accept replies and bounces to that email; far from it. But it does mean that the sending domain can't argue that they didn't send out the email and so are not socially obliged to accept replies. They really sent that email, in a way that provides undeniable attribution. Any refusal to accept replies is just a middle finger extended to other mail systems on the Internet (a fairly common middle finger, of course, because a lot of the modern Internet is defined by not caring about other people).
PS: It strikes me that this attribution may be one reason that large email providers such as GMail increasingly want DKIM signatures these days, because once you have definite attribution for incoming email you can do a number of things based on that with much higher certainty. And people sure can't argue with you about email 'not really coming from them'; they signed it.
(This realization was sparked by a discussion with Aneurin Price in comments in this recent entry. In a sense it's an obvious one, since DKIM's entire purpose is to validate email as coming from a specific source and the flipside of such validation is necessarily attribution.)
Some DKIM usage statistics from our recent inbound email (October 2018 edition)
By this point in time, DKIM (Domain Keys Identified Mail) has been around for long enough and enough large providers like GMail have been pushing for it that it has a certain decent amount of usage. In particular, a surprising number of sources of undesirable email seem to have adopted DKIM, or at least they add DKIM headers to their email. Our Exim setup logs the DKIM status of incoming email on our external MX gateway and for reasons beyond the scope of today's entry I have become interested in gathering some statistics about what sort of DKIM usage we see, who from, and how many of those DKIM signatures actually verify.
All of the following statistics are from the past ten days of full logs. Over that time we received 105,000 messages, or about 10,000 messages a day, which is broadly typical volume for us from what I remember. Over this ten day period, we saw 69,400 DKIM signatures, of which 55 were so mangled that Exim only reported:
DKIM: Error while running this message through validation, disabling signature verification.
(Later versions of Exim appear to log details about what went wrong, but the Ubuntu 16.04 version we're currently using doesn't.)
Now things get interesting, because it turns out that a surprising
number of messages have more than one DKIM signature. Specifically,
roughly 7,600 have two or more (and the three grand champions have
six); in total we actually have only 61,000 unique messages with
DKIM signatures (which still means that more than half of our
incoming email had DKIM signatures). On top of that, 297 of those
messages were actually rejected at SMTP time during
it turns out that if you get as far as post-DATA checks, Exim is
happy to verify the DKIM signature before it rejects the message.
The DKIM signatures break down as follows (all figures rounded down):
|3340||verification failed - signature did not verify (headers probably modified in transit)|
|2660||invalid - public key record (currently?) unavailable|
|790||verification failed - body hash mismatch (body probably modified in transit)|
|310||invalid - syntax error in public key record|
Of the DKIM signatures on the messages we rejected at SMTP time, 250 had successful verification, 45 had no public key record available, 5 had probably modified headers, and two were mangled. The 250 DKIM verifications for messages rejected at SMTP time had signatures from around 100 different domains, but a number of them were major places:
41 d=yahoo.com 18 d=facebookmail.com 13 d=gmail.com
(I see that Yahoo is not quite dead yet.)
There were 5,090 different domains with successful DKIM verifications, of which 2,170 had only one DKIM signature and 990 had two. The top eight domains each had at least 1,000 DKIM signatures, and the very top one had over 6,100. That very top one is part of the university, so it's not really surprising that it sent us a lot of signed email.
Overall, between duplicate signatures and whatnot, 55,780 or so of the incoming email messages that we accepted at SMTP time had verified DKIM signatures, or just over half of them. On the one hand, that's a lot more than I expected. On the other hand, that strongly suggests that no one should expect to be able to insist on valid DKIM signatures any time soon; there are clearly a lot of mail senders that either don't do DKIM at all, don't have it set up right, or are having their messages mangled in transit (perhaps by mailing list software).
Among valid signatures, 46,270 were rsa-sha256 and 15,960 were rsa-sha1.
The DKIM canonicalization (the '
c=' value reported by Exim) breaks down
51470 c=relaxed/relaxed 9440 c=relaxed/simple 1290 c=simple/simple 20 c=simple/relaxed
I don't know if this means anything, but I figured I might as well note it. Simple/simple is apparently the default.
The external delivery delays we see on our central mail machine
These days, our central mail machine almost always has a queue of delayed email that it's trying to deliver to the outside world. Sometimes this is legitimate email and it's being delayed because of some issue on the remote end, ranging from the remote end being down or unreachable to the destination user being over quota. But quite a lot of time it is for some variety of email that we didn't successfully recognize as spam and are either forwarding to some place that does (but that hasn't outright rejected the email yet) or we've had a bounce (or an autoreply) and we're trying to deliver that message to the envelope sender address, except that the sender in question doesn't have anything there to accept (or reject) it (which is very common behavior, for all that I object strongly to it).
(Things that we recognize as spam either don't get forwarded at all or go out through a separate machine, where bounces are swallowed. At the moment users can autoreply to spam messages if they work at it, although we try to avoid it by default and we're going to do better at that in the near future.)
Our central mail machine has pretty generous timeouts for delivering external email. For regular email, most destinations get six days, and for bounces or autoreplies most destinations get three days. These durations are somewhat arbitrary, so today I found myself wondering how long our successful external deliveries took and what the longest delays for successful deliveries actually were. The results surprised me.
(By 'external deliveries' I mean deliveries to all other mail systems, both inside and outside the university. I suppose I will call these 'SMTP deliveries' now.)
Most of my analysis was done on the last roughly 30 full days of SMTP deliveries. Over this time, we did about 136,000 successful SMTP deliveries to other systems. Of these, only 31,000 took longer than one second to be delivered (from receiving the message to the remote end accepting the SMTP transaction). That's still about 23% of our messages, but it's still impressive that more than 75% of the messages were sent onward within a second. A further 15,800 completed in two seconds, while only 5,780 took longer than ten seconds; of those, 3,120 were delivered in 60 seconds or less.
Our Exim queue running time is every five minutes, which means that a message that fails its first delivery or is queued for various reasons will normally see its first re-delivery attempt within five minutes or so. Actually there's a little bit of extra time because the queue runner may have to try other messages before it gets to you, so let's round this up to six minutes. Only 368 successfully delivered messages took longer than six minutes to be delivered, which suggests that almost everything is delivered or re-delivered in the first queue run in which it's in the queue. At this point I'm going to summarize:
- 63 messages delivered in between 6 minutes and 11 minutes.
- 252 messages delivered in between 11 minutes and an hour.
- 24 messages delivered in between one and two hours.
- 29 messages delivered in over two hours, with the longest five being delivered in 2d 3h 22m, 2d 3h 14m, 2d 0h 11m, 1d 5h 20m, and 1d 2h 39m respectively. Those are the only five that took over a day to be delivered.
We have mail logs going back 400 days, and over that entire time only 45 messages were successfully delivered with longer queue times than our 2d 3h 22m champion from the past 30 days. On the other hand, our long timeouts are actually sort of working; 12 of those 45 messages took at least five days to be delivered. One lucky message was delivered in 6d 0h 2m, which means that it was undergoing one last delivery attempt before being expired.
Despite how little good our relatively long expiry times are doing for successfully delivered messages, we probably won't change them. They seem to do a little bit of good every so often, and our queues aren't particularly large even when we have messages camped out in them going nowhere. But if we did get a sudden growth of queued messages that were going nowhere, it's reassuring to know that we could probably cut down our expire times quite significantly without really losing anything.
An extravagant and dense piece of malware-laden email
In the process of looking through our mail system logs for my entry on phish spam with multiple tries, I stumbled over the following extravagant and apparently densely packed email message:
<ID> attachment application/zip; MIME file ext: .zip; zip exts: .jar; inner zip exts: .class .mf none; email is in-dnsbl rejected <ID> from email@example.com to <redacted>: identified virus: CXmail/JarZip-A, Mal/Jacksbot-A, Mal/Jeetrat-A, Troj/JavaBz-WB detail <id> Subject: [PMX:SPAM] [PMX:VIRUS] Kindly Quota
That's a single attachment with a relatively ordinary looking '.jar in .zip' (well, for malware), where Sophos identified no less than four different sorts of bad things. I wonder if every single .class file in that JAR had a different piece of malware in it.
(It also appears possible that Sophos identified the JAR file as a whole as being one sort of malware and then some pieces inside it as being additional sorts of malware. But all of this is opaque.)
At the time of this message, the IP address was in zen.spamhaus.org and cimexcuritibas.com was in dbl.spamhaus.org. Neither are true any more, so someone cleaned up something. We logged the message headers, but none of them have anything interesting except that the message was DKIM-signed by cimexcuritibas.com with a valid signature.
(This goes to show that valid DKIM signatures mean absolutely nothing about the quality of the email itself. I'm sure we all knew this already, but I like to provide examples every so often.)
PS: It appears that we don't receive any valid, accepted email that has a single .jar in a .zip by itself. It's possible that this is a case like singleton nested zipfiles, where we should just block all of these out of hand. On the other hand, if we did this I wouldn't get lovely log reports like this (we reject bad attachment types before we run them past Sophos PureMessage).
If one phish spam doesn't succeed, maybe another will
Here is another entry in the annals of spammers trying extra hard, to go with an earlier one. Recently, our mail system logged the following two interesting cases. Let's start with the first one:
<ID> attachment application/octet-stream; MIME file ext: .pdf <ID> attachment application/octet-stream; MIME file ext: .htm rejected <ID> from firstname.lastname@example.org to <redacted>: identified virus: Mal/Phish-A, Troj/PDFUri-FUP detail <ID> Subject: [PMX:SPAM] [PMX:VIRUS] Re: Document for Last Shipments
It seems our spammer is trying to get people two ways, trying both some malware and also a phish spam as a HTML attachment. At one level this feels like a sensible approach; if the recipient's system blocks the malware attack, maybe the recipient will still respond manually to the phish spam. But that seems to assume a world where people can have malware not work and be blocked without having big red alerts show up all over the entire email. Perhaps that is the case; if so, that's depressing.
Then there's the second and perhaps more interesting case:
<ID> attachment application/octet-stream; MIME file ext: .html <ID> attachment application/octet-stream; MIME file ext: .html rejected <ID> from 22.214.171.124/TAX@GOV.COM to <redacted>: identified virus: Troj/Phish-DFM, Troj/Phish-DGF detail <ID> Subject: [PMX:SPAM] [PMX:VIRUS] Tax Clearance Certificate
(This 'TAX@GOV.COM' spammer turns out to have been trying for a while from several sources.)
This prompted me to go looking through our logs in search of messages that were identified by Sophos as having multiple bad things in them in the hopes that I'd find a double-phish case. Sadly I wasn't able to find any, and most of them were less interesting than these ones. There was one exception, but that's going to be another entry.
Some malware apparently believes in covering its bases
Today our system for logging email attachment type information caught something interesting. Here's the important log messages:
<MSGID> attachment application/rtf; MIME file ext: .rtf <MSGID> attachment application/zip; MIME file ext: .zip; zip exts: .pdf .rtf .xlsx rejected <MSGID> from email@example.com to <redacted>: identified virus: CXmail/Rtf-E, Exp/20180802-B
Exp/20180802-B is apparently an OLE2 based exploit using CVE-2017-11882, which appears to often be RTF-based (cf). This opens up the interesting and amusing possibility that both attachments are RTF based attacks (with the .pdf and .xlsx included in the .zip as either cover or supporting elements), and perhaps that they're the same RTF file. At the very least, this malware seems to believe in covering its bases; maybe you'll open a direct RTF attachment, or maybe you'll unzip the ZIP archive and use something in that.
We actually got several copies of this to various different local addresses, all apparently coming directly from this IP address (ie with no additional Received: headers) and all with the same 'Subject: Payment Advice'. The IP address in question isn't currently in the CBL or in Spamhaus ZEN, although it is in b.barracudacentral.org.
In a further interesting development, looking at our logs in more detail showed that there's actually a second run from the same IP an hour or so earlier, with a HELO of '163.com', a MAIL FROM of 'firstname.lastname@example.org', and a Subject of 'Purchase Inquiry RG LLC'. This run was detected as the same two types of malware, but it has a different mix of attachment types:
attachment application/pdf; MIME file ext: .pdf attachment application/octet-stream; MIME file ext: .xlsx; zip exts: .bin .png .rels .vml .xml none
This may mean that the first attachment is basically a cover letter and it's the second attachment where all the malware lurks.
Sidebar: More spammers covering their bases
In the past nine days or so, we've also seen:
attachment application/msword; MIME file ext: .doc; zip exts: .rels .xml none attachment application/vnd.ms-excel; MIME file ext: .xls; zip exts: .rels .xml none rejected [...] identified virus: CXmail/OleDl-AD, CXmail/OleDl-AQ
(with the Subject of 'Re: August PO #20180911000'.)
The idea of putting together two different OLE-based attacks in two different documents amuses me. It's kind of brute force, and also optimistic (since you're hoping that neither is recognized and thus blocks your email).
attachment application/msword; MIME file ext: .doc attachment application/pdf; MIME file ext: .pdf rejected [...] identified virus: CXmail/RTF-F, Troj/20170199-P
And then there's what is probably a case of 'let's throw two phish attempts into one email':
attachment text/html; MIME file ext: .html attachment text/html; MIME file ext: .html rejected [...] identified virus: Troj/Phish-CZV, Troj/Phish-DAG
As I discovered once we started logging attachment types, our commercial anti-spam system identifying something as having phish 'malware' probably means it's in the attachments. This one had a Subject of 'Details Attached'. I bet they were.
A recent spate of ZIP attachments with everything
Our program for logging email attachment type information looks inside
.jar archives, including one level of nesting. Often what we see in this is routine, with
basically the sort of content you'd expect from either routine stuff
or malware, but recently we've been seeing zip archives that are
just stuffed with at least one of almost any file extension you can
think of. A few days ago we logged an extreme example:
1fnnAC-0003dZ-EP attachment application/zip; MIME file ext: .zip; zip exts: .jar; inner zip exts: .abc .abl .acc .ach .adc .adz .afd .age .ago .agy .aht .ake .ala .alp .and .ans .aob .aor .app .apt .ara .ary .aud .aus .ave .axe .baa .bag .bap .bat .bde .bet .bin .bis .bkg .boe .bra .bsh .buz .bye .cai .cal .cat .caw .cdg .chm .cit .class .cli .clo .col .cop .cpl .crc .crs .cst .ctg .cto .cup .cwt .dad .dbl .dcb .der .det .dew .dey .dig .dil .dks .dur .dwt .dye .eft .ego .elb .elm .els .emf .emm .emu .err .esd .esq .ext .eyn .fax .fbi .fcs .fee .fei .fem .ffa .fgn .fig .flb .fly .foe .fog .fud .gab .gae .gal .gas .geb .gig .gin .gio .goa .gob .god .gon .goo .gox .gtc .gun .had .hah .hak .hao .hat .hau .hcb .hcl .hed .heh .hen .hes .hia .hip .hir .hld .hoc .hoe .hts .hug .hye .ibo .ide .ihp .ijo .ilk .imu .ing .ipr .iqs .ire .iwa .iyo .jah .jap .jay .jct .jem .jud .jur .kat .kaw .kay .key .khi .kop .kor .kos .kph .kyl .lab .lap .lcm .lea .lek .les .lib .lid .lit .llb .lou .lub .lxx .mao .map .maw .meu .mf .mix .mks .mog .mor .mot .mph .mus .nee .nef .nei .nep .nut .oak .obb .ofo .oki .one .oni .ops .ora .our .pan .pap .par .paw .pax .pay .pdq .peh .pep .pia .pie .pig .pit .pks .poh .pos .pot .ppa .pps .pre .pry .psi .pwr .pyr .rab .ram .rat .raw .rct .ref .reg .res .rfs .rig .rim .rix .rld .roc .roi .rpm .rut .rux .rwd .rwy .rye .sab .sau .sds .sed .sei .sel .sew .she .shr .sie .sil .sim .sip .six .sny .soe .sou .soy .sqq .stg .sum .sur .syd .tar .tat .tay .ted .tef .tem .tng .ton .tou .twa .udo .uns .urb .urn .uti .vac .vil .von .vum .wab .wae .wea .wop .wot .wro .wud .xii .xiv .xxi .xxv .xxx .yam .yay .yea .yeo .yer .yez .yoe .yrs .yun .zat .zen .zho .zig .zip .zod
(We deliberately log file extensions inside zip archives in alphabetical order, so it may well have had a much different order originally.)
This particular message was detected by Sophos PureMessage as 'Mal/DrodZp-A', which may be a relatively generic name. The Subject: of the message was the relatively generic 'Re: Invoice/Receipt', and I don't know what the overall MIME filename of the .zip was claimed to be. We've received a bunch of very similar attachments that were just .jars (not .zip in .jar) with giant lists of extensions. Many of them have been rejected for containing (nominal) bad file types, and their MIME filenames have been things like 'ORIGIAL SHIPPING DOCUMENTS.qrypted.jar' and "0042133704 _ PDF.jar".
(It's possible that these direct .jars would also be detected as Mal/DrodZp-A, but we reject for bad file types before we check for known viruses.)
I doubt that the attachment had genuine examples of these file
types, especially things like
.rpm (RPM packages) and
(Nikon camera RAWs, which are invariably anywhere from several
megabytes to tens of megabytes for the latest high-resolution Nikon
DSLRs). I'm sure that the malware has some reason for doing this
spray of files and file extensions, but I have no idea what it might
be. If there are some anti-virus products that give up if a .jar
has enough different file extensions in it, that's kind of sad
(among other things).
Sadly for any additional filtering we might considering doing, I suspect that the dangerous parts of this were in the actual Java stuff (eg the .class files) and everything else is distraction. It'd be somewhat interesting to pick through a captured sample, because I am curious about what's in all of those files (or if they're just zero-length ones put in to pad things out) and also what file names they have. Did the malware make up some jumble of random file names, or is it embedded a message in them or something clever? I'll never know, because it's not important enough to bother doing anything special for.
Today I saw a spammer parasite itself on another spammer (probably)
We have an account request management system that sends out email to people as part of its activities, using an administrative email address as the sender address. We don't directly expose the address anywhere on our web pages, but it winds up in people's email address books when they get email from it and so years ago it leaked into the hands of spammers and we started to get occasional spam to it. Today it got two such pieces of email, both from and through Mailchimp and both theoretically sent by 'naturaful.com'.
The first one went like this:
From: Support Naturaful <email@example.com>
Subject: INV04732 from Naturaful Support
Date: Thu, 26 Jul 2018 14:49:59 +0000
View invoice ([an-URL])
$ 1750.00 due 30 July
The URL went off to a random and likely hijacked URL on a random website, or at least it tried to; it was probably broken (one part of it was a literal '[UNIQID]' as a query parameter). This was clearly basically a phish spam, and it appears to have tried to redirect from the initial URL to an invoice page on 'xerotransfers.com', where it would presumably have tried to extract some sort of payment from visitors.
It was followed less than two hours later by a second email message, a rather flustered one:
From: Support Naturaful <firstname.lastname@example.org>
Subject: We're Sorry - Please Ignore Email About Invoice
Date: Thu, 26 Jul 2018 16:35:10 +0000
Please ignore the last email about a large invoice amount. .
Please do not click on the button or pay any money.
Any links that do not have [list of domains] is not our website. Any sales of Naturaful products are paid on our website and you don't owe anything after.
Please ignore the last email, we're currently cleaning up our database and ensuring this does not happen again.
Security is our primary concern.
What appears to have happened here is that our administrative address was bought by naturaful.com and added to a mailing list that they were going to use to send out spam (through Mailchimp). Before they could use their shiny new mailing list to send out their own spam, another spammer came by and exploited a security vulnerability of some sort to hijack naturaful.com's mailing list and Mailchimp account (and 'good' name) to send out their own spam.
As a bonus prize, naturaful.com claims to be in Canada, which makes what they're doing completely unambiguously illegal under our anti-spam law. The odds are that the government will never get around to doing anything to them, but one can always hope. In the mean time, neither these people nor Mailchimp are going to be successfully sending email to this particular administrative address.
(This elaborates on my tweet.)
The irritatingly many executable formats of Windows
So I tweeted:
It's impressive how many different executable file formats Windows has.
(I care because our email system wants to reject top-level attachments that are Windows 'executables' and boy is the list getting long.)
I put 'executables' into quotes in this tweet because many of these file formats (or more exactly file types) are not binaries; instead they're text files that Windows will feed to various things that will interpret them in ways that you don't want. Typical extensions that we see as top level attachments (and reject at SMTP time) include .lnk, .js, .bat, .com, .exe, .vbs, and .vbe. Some of these are encoded binaries, while others are text.
We mostly do this checking and rejection based on MIME file extensions, partly because it's easiest. Also, for the ones that are text (and at least some of the ones that are encoded binaries), my understanding is that what makes them dangerous on a Windows machine is their file extension. A suitable text file with the extension ".txt" will be opened harmlessly in some editor, while the same file with the extension ".js" will generally be run if you try to open it.
(We do some file content sniffing to look for and reject unlabeled Windows executables, ie things which libmagic will report as type 'application/x-dosexec'. As you can see here, there are actually a lot of (sub)formats that map to this.)
We've historically added extensions one at a time as we run into them, usually when our commercial anti-spam system rejects one of them as being a virus (this time, several .pif files being rejected as 'W32/Mytob-C'). Possibly this is the wrong approach and we should find a master list somewhere to get almost all of this over with at once (perhaps starting from GMail's list of blocked file types). On the other hand, there's some benefit to passing up rejections, especially if you don't actually seem to need them. If we never see file types, well, why block them?
(I'm not completely convinced by this logic, by the way. But I'm lazy and also very aware that I could spend all my time building intricate anti-spam precautions of dubious actual benefit.)