Some people feel that all permanent SMTP failures are actually temporary
It all started with a routine delivery attempt to my sinkhole SMTP server that I use as a spamtrap:
remote 22.214.171.124:36462 at 2017-07-03 13:30:28
220 This server does not deliver email.
EHLO mail.travelshopnews7.com
[...]
MAIL FROM:<email@example.com>
250 Okay, I'll believe you for now
RCPT TO:<redacted@redacted>
250 Okay, I'll believe you for now
DATA
354 Send away
[...]
. <end of data>
554 Rejected with ID 433b5458d9d3e8a93020aca44406d2ec1d8ba82a
QUIT
221 Goodbye
That ID is the hash of the whole message and its important envelope information (including the sending IP). So far, so normal, and these people stood out in a good way by actually QUITing instead of just dropping the connection. But then:
remote 126.96.36.199:39084 at 2017-07-03 13:37:10
220 This server does not deliver email.
EHLO mail.travelshopnews7.com
[...]
554 Rejected with ID 433b5458d9d3e8a93020aca44406d2ec1d8ba82a
They re-delivered the exact same message again. And again. And
again. In less than 24 hours (up to July 4th at 9:28 am) they did
21 deliveries, despite getting a permanent refusal after each one.
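
Spotting these repeats is easy precisely because the rejection ID is a hash over the message plus its envelope. A minimal sketch of the idea follows. The IDs in the logs are 40 hex digits, which is consistent with SHA-1, but the actual hash function and the way the envelope fields are combined are not stated here, so both are assumptions, as are the function names.

```python
import hashlib
from collections import Counter


def rejection_id(sender_ip: str, mail_from: str, rcpt_to: list[str], message: bytes) -> str:
    """Hash a message together with its envelope (hypothetical field layout)."""
    h = hashlib.sha1()  # assumption: SHA-1, chosen only because it yields 40 hex digits
    # Length-prefix each envelope field so that different envelopes cannot
    # produce the same byte stream when their parts are concatenated.
    for field in [sender_ip, mail_from, *sorted(rcpt_to)]:
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    h.update(message)
    return h.hexdigest()


def repeated_deliveries(ids: list[str]) -> dict[str, int]:
    """Return only the IDs that were logged more than once, with their counts."""
    counts = Counter(ids)
    return {msg_id: n for msg_id, n in counts.items() if n > 1}
```

Feeding every logged ID into `repeated_deliveries()` gives a quick list of messages that someone keeps re-sending after a permanent rejection.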
At that point I got tired of logging repeated deliveries for the
same message and put them in a category of earlier blocks:
remote 188.8.131.52:50676 at 2017-07-04 10:38:10
220 This server does not deliver email.
EHLO mail.travelshopnews7.com
[...]
MAIL FROM:<firstname.lastname@example.org>
550 Bad address
RCPT TO:<redacted@redacted>
503 Out of sequence command
[...]
You can guess what happened next:
remote 184.108.40.206:34094 at 2017-07-04 11:48:36
220 This server does not deliver email.
EHLO mail.travelshopnews7.com
[...]
MAIL FROM:<email@example.com>
550 Bad address
RCPT TO:<redacted@redacted>
503 Out of sequence command
[...]
They didn't stop there, of course.
remote 220.127.116.11:53824 at 2017-07-13 15:37:31
220 This server does not deliver email.
EHLO mail.travelshopnews7.com
[...]
MAIL FROM:<firstname.lastname@example.org>
550 Bad address
RCPT TO:<redacted@redacted>
503 Out of sequence command
[...]
Out of curiosity I switched things over so that I'd capture their message again and it turns out that they're still sending, although they've now switched over to trying to deliver a different message. Apparently they do have some sort of delivery expiry, presumably based purely on the message's age and totally ignoring SMTP status codes.
(As before they're still re-delivering their new message despite the
DATA permanent rejection; so far, it's been two more deliveries
of the exact same message.)
These people are not completely ignoring SMTP status codes, because they know that they didn't deliver the message so they'll try again. Well, I suppose they could be slamming everyone with dozens or hundreds of copies of every message even when the first copy was successfully delivered, but I don't believe they'd be that bad. This may be an optimistic assumption.
(Based on what shows up on www.<domain>, they appear to be running something called 'nuevoMailer v.6.5'. The program's website claims that it's 'a self-hosted email marketing software for managing mailing lists, sending email campaigns and following up with autoresponders and triggers'. I expect that their view of 'managing mailing lists' does not include 'respecting SMTP permanent failures' and is more about, say, conveniently importing massive lists of email addresses through a nice web GUI.)
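
For contrast, here is a rough sketch of what a well-behaved sender does with SMTP reply codes: 2xx means the message was accepted, 4xx is a temporary failure worth retrying, and 5xx is a permanent failure that should end delivery attempts. The function name and the five-day retry window are my assumptions for illustration, not anything taken from the software involved here.

```python
from datetime import datetime, timedelta


def next_action(reply_code: int, first_attempt: datetime, now: datetime,
                max_age: timedelta = timedelta(days=5)) -> str:
    """Decide what to do after the final SMTP reply to a delivery attempt.

    Returns 'delivered', 'retry', or 'bounce'.
    """
    if 200 <= reply_code < 300:
        return "delivered"
    if 400 <= reply_code < 500:
        # Temporary failure: retry, but only until the message is too old.
        return "retry" if now - first_attempt < max_age else "bounce"
    # 5xx is a permanent failure (and anything unexpected is treated the
    # same way): stop trying, no matter how young the message is.
    return "bounce"
```

The sender in the logs above behaves as if only the age check existed; a 554 reply at the end of DATA should have produced 'bounce' on the very first attempt.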
LinkedIn is still trying to send me email despite years of rejections
Back in 2014 I wrote about LinkedIn sending me invitation spam emails and how they wanted me to agree to their terms of service (and join LinkedIn) in order to 'unsubscribe' from them. Of course I didn't do that; instead, as usual, I arranged to have all future email from LinkedIn to me to be rejected during the SMTP conversation on our external MX gateway (using one of our anti-spam features). Then I put the whole thing out of my mind.
You can probably guess what has happened since then. It's now closing in on three years that I've been rejecting all such LinkedIn email, and LinkedIn still attempts to send me some on a semi-regular basis. I have no idea what's actually in the email, since the external MX gateway rejects it at RCPT TO time (and LinkedIn uses completely anonymous MAIL FROM addresses), but I suspect that it's more invitations.
Persistently sending email to addresses that fail at RCPT TO time makes LinkedIn's behavior functionally indistinguishable from spammers. Spammers ignore RCPT TO and other mail failures; so does LinkedIn. Spammers will send to dead addresses for years. LinkedIn? Check. I am sure that LinkedIn will claim that it has good reasons for its behavior, and perhaps it will even allege that it is merely doing the will of its users. It doesn't really matter. When you walk like a duck and quack like a duck, people who don't want ducks don't really care what you actually are (cf).
I believe that LinkedIn's behavior is illegal in Canada under our anti-spam legislation. I was going to say that this exposes LinkedIn to potential legal risks now that it's 2017 and the legislation is fully in force, but it turns out that the government suspended the right of private action recently. Since Canada is a loser-pays country for civil lawsuits, suing LinkedIn over this would always be risky, but now only the government can take them to court and I don't think that that's very likely.
(On the other hand, according to the website the government apparently has taken action against some big Canadian corporations over their spam, oops, 'marketing email'. So who knows.)
PS: These days there appears to be a LinkedIn unsubscribe page that doesn't immediately demand that you log in to LinkedIn. I haven't tried it; to put it one way, I don't particularly believe that leopards actually change their spots. I have no trust for LinkedIn at this point and thus no desire to actively provide them with any email addresses.
The TLDs of sender addresses for a week of our spam (June 2017 edition)
Once upon a time the Internet only had a few non-country top level domain names. Then that changed. Mostly these new TLDs get used for websites, but every so often people use them for email. Generally the stereotype is that it's mostly spammers using these new TLDs, so I thought it would be interesting to look at eight days worth of logs from our commercial anti-spam system to see what the TLDs of sender addresses looked like for messages that were scored as spam and messages that weren't.
So here are the top ten TLDs from email scored as spam, with the percentage of our spam-scored email that had a sender address in that TLD and what percent of the TLD's overall email the spam represents.
| TLD | % of total spam | spam as % of TLD |
We can immediately see that .bid does terribly and .us is not doing so well. The .bid spam comes from multiple domains and probably multiple spammers (there are at least two or three patterns in how the sender addresses are formed). .info is close to as bad as .us, but it's a much smaller percentage of the email. The .us spam seems to be a mix of compromised random domains, and active spammer domains. The .info spam is multiple domains but might be mostly one spammer.

The high popularity of .com in spam sender addresses surprises me, as does how much of .com email is spam. Bear in mind that we're a university department (and in Canada), so we probably exchange much less normal email with .com places than most organizations do.
However, the new TLDs are not particularly popular with spammers. Even if I look all the way down in the data, it's dominated by country codes with only a few new TLDs in small quantity:
| new TLD | % of spam |
You get the idea. I haven't shown 'spam as a percentage of the TLD's email' here because it's mostly 100% and the times when it's not, it may be because of mis-scoring (the absolute numbers are very small, so it doesn't need much mis-scoring to show up as an appreciable percentage; .party is under a hundred messages over the eight days of logs). .biz sender addresses are only 79% spam as scored by our system.
Pleasingly, there were exactly 200 different TLDs used in the logs (or 199 if you exclude the null sender, which was 0.3% of the spam and 56% spam).
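
For anyone who wants to run the same numbers on their own logs, the two percentages used above can be computed roughly as follows. This is only a sketch: it assumes you have already reduced your logs to (sender address, scored-as-spam) pairs, and the function names are mine.

```python
from collections import Counter


def sender_tld(address: str) -> str:
    """Return the TLD of a sender address; '<>' stands for the null sender."""
    if address in ("", "<>"):
        return "<>"
    domain = address.rsplit("@", 1)[-1]
    return "." + domain.rstrip(".").rsplit(".", 1)[-1].lower()


def tld_stats(messages):
    """messages: iterable of (sender_address, is_spam) pairs.

    Returns {tld: (percent_of_total_spam, spam_as_percent_of_tld)}.
    """
    total = Counter()
    spam = Counter()
    for address, is_spam in messages:
        tld = sender_tld(address)
        total[tld] += 1
        if is_spam:
            spam[tld] += 1
    all_spam = sum(spam.values())
    return {
        tld: (100.0 * spam[tld] / all_spam if all_spam else 0.0,
              100.0 * spam[tld] / total[tld])
        for tld in total
    }
```

The first number says how much of your spam a TLD accounts for; the second says how bad the TLD itself is, which is the one that gets noisy when the absolute counts are small.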
Plan for manual emergency blocks for your overall mail system
Last year, I wrote about how your overall anti-spam system should have manual emergency blocks. At the time I was only thinking about incoming spam, but after some recent experiences here, let me extend that and say that all entry points into your overall mail system should have emergency manual blocks. This isn't just about spam or bad mail from the outside, or preventing outgoing spam, although those are important things. It's also because sometimes systems just freak out and explode, and when this happens your mail system can get deluged as a result. Perhaps a monitoring system starts screaming in email, sending thousands of messages over a short span of time. Perhaps someone notices that a server isn't running a mailer and starts it, only to unleash several months worth of queued email alerts from said server. Perhaps some outside website's notification system malfunctions and deluges some of your users (or many of them) with thousands of messages.
(There are even innocent cases. Email between some active upstream email source (GMail, your organization's central email system, etc) and your systems might have clogged up, and now that the clog has been cleared the upstream is trying to unload all of that queued email on you as fast as it can. You may want some mechanisms in place to let you slow down that incoming flood once you notice it.)
We now have an initial set of blocks, but I'm not convinced that they're exactly what you should have; our current blocks are partly a reaction to the specific incidents that happened to us and partly guesswork about what we might want in the future. Since anticipating the exact form of future explosions is somewhat challenging, our guesswork is probably going to be incomplete and imperfect. Still, it beats nothing and there's value in being able to stop a repeat incident.
(Our view is that what we've built are some reasonably workable but crude tools for emergency use, tools that will probably require on the spot adjustment if and when we have to turn them on. We haven't tried to build reliable, always-on mechanisms similar to our anti-spam internal ratelimits.)
We have a reasonably complicated mail system with multiple machines running MTAs; there's our inbound MX gateway, two submission servers, a central mail processing server, and some other bits and pieces. One of the non-technical things we've done in the fallout from the recent incidents is to collect in one spot the information about what you can do on each of them to block email in various ways. Hopefully we will keep this document updated in the future, too.
(You may laugh, but previously the information was so dispersed that I actually forgot that one blocking mechanism already existed until I started doing the research to write all of this up in one place. This can happen naturally if you develop things piecemeal over time, as we did.)
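
To make 'manual emergency block' a bit more concrete, here is a deliberately crude sketch of the kind of check an entry point could make. It is not how our blocks are implemented (those live in each machine's MTA configuration); the class, its field names, and the idea of blocking by sender address, sender domain, or source host are all hypothetical illustrations.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EmergencyBlocks:
    """Manually maintained blocks; every field name here is hypothetical."""
    sender_addresses: set = field(default_factory=set)
    sender_domains: set = field(default_factory=set)
    source_hosts: set = field(default_factory=set)

    def reason(self, mail_from: str, source_host: str) -> Optional[str]:
        """Return why a message is blocked, or None if it may proceed."""
        sender = mail_from.lower()
        domain = sender.rsplit("@", 1)[-1]
        if source_host.lower() in self.source_hosts:
            return f"source host {source_host} is manually blocked"
        if sender in self.sender_addresses:
            return f"sender {mail_from} is manually blocked"
        if domain in self.sender_domains:
            return f"sender domain {domain} is manually blocked"
        return None
```

For the 'server unleashes months of queued alerts' case you would add that server to `source_hosts`; returning the reason means it can go straight into the rejection message or the logs.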
Sidebar: Emergency tools versus routine mechanisms
Some people would take this as a sign that we should have always-on mechanisms (such as ratelimits) that are designed to automatically keep our mail system from being overwhelmed no matter what happens. My own view is that designing such mechanisms can be pretty hard unless you're willing to accept ones that are set so low that they have a real impact in normal operation if you experience a temporary surge.
Actually, not necessarily (now that I really think about it). It is in our environment, but that's due to the multi-machine nature of our environment combined with some of our design priorities and probably some missing features in Exim. But that's another entry.
A 'null MX' is also useful for blocking forged senders from non-email domains
When I first considered the use of a 'null MX', I was only thinking of it as a way of blocking email to hosts that don't get email (I had a special case that made some dedicated spammer behavior unusually irritating). However, there is another useful case, and that's domains that don't send email but do get forged on spam.
A while back I wrote about a persistent phish spammer that consistently sends email using the forged sender email address of 'email@example.com'. As it happens approject.com seems to be a parked domain, with a 'this domain may be for sale' website and nothing else visible. If this is true, the owner of approject.com could cut off much of this forgery by publishing a suitable 'null MX' record in their DNS (especially now that it's an official standard, as I found out when doing research for this entry). Other owners of other parked domains could similarly cut off spam being forged in their names, and frankly there's a lot of it; spammers seem to love forging email as from domains like 'confirmation.com', 'verification.net', 'system.com', and so on.
(Some of those are not parked domains, mind you.)
Even without the new(-ish) null MX RFC, you can sort of get there today for some sites through a suitable DMARC policy and SPF records, but I think that probably requires more DNS fiddling than a simple 'MX .' entry. Plus, it only applies to people who actually use DMARC or SPF to reject messages, which is not that many people right now (partly because turning on DMARC or especially SPF rejection has various often unpleasant side effects). The good news is that using DMARC probably will ensure that GMail and a few other big places will reject the spammer email that is claiming to be from you.
(The more DNS fiddling is required, especially the more fiddling that must contain the domain name or the like, the less likely it is that owners of parked domains and similar things will go to the bother. One attraction of 'MX .' is that it's completely generic.)
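
To illustrate the difference in DNS fiddling, here is a small sketch that generates the records a parked, no-email domain might publish. The record contents are the standard forms (an RFC 7505 null MX, an SPF record that authorizes nothing, and a DMARC reject policy); the function name and the zone-file style output are just my choices for illustration.

```python
def no_mail_records(domain: str) -> list[str]:
    """Zone-file style records for a domain that neither receives nor sends email."""
    name = domain.rstrip(".") + "."
    return [
        # Null MX: the data 'MX 0 .' is identical for every domain.
        f"{name} IN MX 0 .",
        # SPF: also generic data, saying that no host may send as this domain.
        f'{name} IN TXT "v=spf1 -all"',
        # DMARC: the policy lives at a separate _dmarc.<domain> name.
        f'_dmarc.{name} IN TXT "v=DMARC1; p=reject"',
    ]
```

The null MX and SPF data never mention the domain, but the DMARC policy needs its own extra name under the domain, which is one more thing for the owner of a parked domain to set up.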
I don't know why this use for a null MX standard didn't occur to me back then. Probably I was too close to my specific little issue and not thinking generally. Spammers have certainly been abusing generic-word domains for advance fee fraud and phish spams for years.
We now have an officially standardized 'null MX' record
Years ago, I wrote about how I wished for an official 'null MX' standard so that I could clearly advertise that some of my hosts should never be sent email although they had an A record and there was a mailer listening on that IP address. In the process of writing another entry on this, I decided to look up the current state of the draft RFC from 2013. Imagine my pleased surprise to find RFC 7505: A "Null MX" No Service Resource Record for Domains That Accept No Mail.
RFC 7505 was issued June 2015, which gives me some time to have not noticed it. The official standard is a 0-preference MX to '.' (the zero-length DNS label), which is probably slightly stricter than previous interpretations but is also probably what people have been doing anyways. Since this essentially standardizes existing practice, at least some mailers have been implementing RFC 7505 since the moment it was published; others undoubtedly don't support it yet and will either fail the mail message with an unclear error, ignore the MX entry, or consider it a temporary DNS error.
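
As a sketch of what supporting this looks like on the sending side, the following takes an already-resolved list of MX records and decides where (if anywhere) to try delivery. The DNS lookup itself is left out, and the function names are mine rather than any particular mailer's.

```python
def is_null_mx(mx_records: list[tuple[int, str]]) -> bool:
    """True if the MX set is a null MX: a single record whose target is '.'.

    RFC 7505 specifies preference 0; this sketch does not insist on it.
    """
    return len(mx_records) == 1 and mx_records[0][1] == "."


def delivery_targets(domain: str, mx_records: list[tuple[int, str]]) -> list[str]:
    """Return the hosts to try, in order; an empty list means 'accepts no mail'."""
    if not mx_records:
        # No MX at all: fall back to the domain's own address records.
        return [domain]
    if is_null_mx(mx_records):
        # The domain has said it never accepts email; fail right away
        # instead of falling back to the A record.
        return []
    return [host for _, host in sorted(mx_records)]
```

The interesting case is the middle one: a mailer without null MX support has nothing telling it that '.' is special, which is where the unclear errors and temporary failures come from.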
Postfix apparently picked up official support for RFC 7505 in version 3.0 (released February 2015, while the RFC was in draft). I can't find any particular indication if other mailers have picked up explicit support for it (somewhat to my surprise); perhaps the authors of them are as unaware of RFC 7505 as I've been. Alternately, they were already rejecting email for things when there was only a 'MX .' MX entry, so they don't really have anything to do.
(And of course there are plenty of really old mailers out there on the Internet that will probably skip over a 'MX .' entry as clearly malformed and carry on to try to deliver the message to the IP in the A record, just the same as before.)
Since this is now an official RFC, I'm actively tempted to publish 'MX .' entries for some hosts and see if anything happens as a result. It would be nice to think that spam senders will notice and I'll see a drop off of delivery attempts, but I'm not really counting on it.
A single .jar recognized as several types of malware at once
In the spirit of the single email message with a lot of malware, I'll once again show you the log messages first:
1cwivp-0006vh-1M attachment application/zip; MIME file ext: .zip; zip exts: .jar; inner zip exts: .ai .b .box .class .download .drive .mf .ph
rejected 1cwivp-0006vh-1M from firstname.lastname@example.org to <redacted>: identified virus: CXmail/JarZip-A, CXmail/Java-A, Java/Adwind-KU
Here we have a .jar inside a .zip (which is somewhat but not totally suspicious), and from this single incoming email our system felt it found three bad things.
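
If you want to pull this sort of information out of logs in bulk, the two kinds of log messages above are regular enough to pick apart with a couple of regular expressions. This is a sketch based purely on the two messages shown here; the patterns and function names are mine, and other messages in the same logs may well not fit them.

```python
import re

ATTACH_RE = re.compile(r"^(?P<id>\S+) attachment (?P<mime>[^;]+);(?P<rest>.*)$")
VIRUS_RE = re.compile(r"^rejected (?P<id>\S+) from .*: identified virus: (?P<names>.+)$")


def parse_attachment(line: str):
    """Split an 'attachment' log message into its ID, MIME type, and extension lists."""
    m = ATTACH_RE.match(line)
    if not m:
        return None
    fields = {}
    # The rest is 'label: .ext .ext' groups separated by semicolons.
    for part in m.group("rest").split(";"):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.split()
    return {"id": m.group("id"), "mime": m.group("mime").strip(), **fields}


def parse_viruses(line: str) -> list[str]:
    """Return the malware names from a 'rejected ... identified virus' log message."""
    m = VIRUS_RE.match(line)
    return [name.strip() for name in m.group("names").split(",")] if m else []
```

Counting how often each name in `parse_viruses()` appears together with others is one way to see whether multi-name detections like this one are common or unusual.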
Sophos's detailed information for CXmail/JarZip-A is not really detailed. It's possible that this is simply their name for some apparently recognizable family of .jar-in-.zip malware; as I'd hope, some testing has shown that it's not as comprehensive as 'all .jars inside .zips'. CXmail/Java-A has similarly generic information available. Java/Adwind-KU is apparently the more well known thing, and has apparently been around for some time.
It turns out that we've seen Java/Adwind-KU before, and in the recent past cases our Sophos PureMessage reported it as 'CXmail/JarAd-G, Java/Adwind-KU'. These cases appear to have been straightforward .jar attachments. We have some earlier hits that were reported as Java/Adwind-KU alone, and back then they were .jar-in-.zips again. All of which goes to show that this sort of stuff evolves, both in form and in recognition.
When I started writing up this case I wondered if I had a situation where several pieces of malware had all rolled themselves into a single .jar file. Now that I've looked at this it appears that this is instead a single piece of malware that triggers multiple detection signatures inside Sophos PureMessage, presumably based on how it's decided to pack itself up.
The message was sent early Saturday morning from 18.104.22.168, which isn't listed in any major DNS blocklist as I write this (it's in Barracuda's blocklist, but that's still a relatively hair-trigger one). Given its To:, it's obviously bad, although it didn't seem to score as spam as strongly as you might expect.
(As a hint for anyone writing virus messages, if you give a message the subject of 'URGENT NEW ORDER PO1605MP1-00077' and then have the To: be the same as the From:, things are going to look more than a little bit suspicious to anyone who actually reads the message.)
PS: I don't know what .drive extensions are likely to be in .jars, but they at least sound a bit suspicious. On the other hand they could be used for something completely different in real JARs; I have very little idea what Java file extensions are normally found in them. Perhaps we should figure that out so we can identify highly suspicious extensions, but that's too much work for now.
(One of the rules of anti-spam work is that there's always something more you could be doing, and thus you always have to draw the line somewhere and say 'we could do that, but let's not'.)
Spammers probably aren't paying any particular attention to you
As I sort of mentioned in yesterday's entry, I have historically written SMTP time rejection messages and other things with an eye towards denying spammers information about exactly why their attempts were rejected. This certainly looks like a perfectly rational decision; if we leak (detailed) information about rejection reasons, we give spammers a head start on working out what about their attempts needs to change in order to get their spam through. And indeed you can find plenty of large sites, like GMail and Yahoo, that absolutely refuse to give out any detailed information about rejections for this stated reason.
There's a difference here, though; we're not Yahoo or GMail. We don't have millions of users that spammers really want to send spam to; we have a thousand or so. The payoff for working around GMail's spam filtering is very high; the payoff for working around ours is extremely low. As a result, the odds that any spammers are actually paying attention to our SMTP rejection messages are, well, very low. In practice it's extremely likely that most spammers never even see them and have no interest in attempting to work around our specific tricks.
(I suspect that there are still some spammers who are paying more attention, such as people doing targeted phish spam runs and the conference spammers. Both of these groups are definitely at least somewhat targeted, and the most precise and alarming of the phish spammers are at least doing a reasonable amount of specific research on us.)
Given this realization, I've come around to feeling that your spam rejection messages might as well be reasonably informative (unless you're a big target for some reason). Maybe once in a while a spammer will read one and get a leg up, but in practice they're far more likely to be read by someone dealing with a false positive or some other similar problem (such as trying to send an attachment type that we block). We might as well be reasonably helpful to those people, especially since some of the time they may be us (as we try to diagnose why a rejection happened).
This has probably always been the case, but I also think that when you're actively trying to block spam it's easy to get into a mindset where, to put it one way, everything is personal. Clearly the spammers are out to get their spam past you in particular and so you'd better be careful, just like the big people are. It's humbling to think that our small mail environment is generally insignificant from the spammers' perspective.
Making your SMTP rejection messages be useful for you
Our external mail gateway will reject (some) incoming messages during the SMTP conversation if our anti-spam system thinks they have too high a spam score. Until today, they were rejected with a deliberately bland and uninformative SMTP error message:
550 Rejected: this message looks too much like spam
When I designed this message, I wrote a comment about it saying 'rejections for spam deliberately give the sender an uninformative message because I don't feel like giving spammers clues'. Then today we got called in to help troubleshoot an issue where a (valid) email message from outside had bounced, and all we had to go on was this message.
Well, you know what: spammers probably aren't reading our SMTP rejection messages anyways, but we certainly do every so often. If we're reading the message this version is exceedingly unhelpful; in fact it's so generic that it's not immediately clear if it's from our system or some other system. So now our SMTP time rejection message for spam says this:
550 Rejected: CSLab PMX spam score too high (milter id <something>)
This new form does several things. First, it clearly identifies to us that the message comes from our external mail gateway. Then, between the 'milter id' and the 'PMX spam score' wording, it tells us which SMTP-time rejection is being triggered here; it's our milter-based system. Finally, the <something> is the Exim (log) ID that was assigned to the proto-message as it was being received. Using this ID we can efficiently retrieve all of the other information about the message from our logs, including the specifics of its spam score (such as they are, given that Sophos PureMessage's spam scoring is basically a black box).
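
The practical payoff is that the ID can be recovered mechanically from whatever a user forwards to us. Here is a sketch of both directions, building the message and pulling the ID back out of a bounce; the real message is generated in our Exim configuration, so the Python functions and their names are purely illustrative.

```python
import re
from typing import Optional

# Not anchored to the start of a line, because bounce messages usually quote
# the rejection with their own prefixes and indentation.
REJECT_RE = re.compile(r"CSLab PMX spam score too high \(milter id (?P<id>[^)\s]+)\)")


def spam_rejection(exim_id: str) -> str:
    """Build the SMTP rejection text for a message with the given Exim log ID."""
    return f"550 Rejected: CSLab PMX spam score too high (milter id {exim_id})"


def extract_exim_id(bounce_text: str) -> Optional[str]:
    """Find our rejection in a forwarded bounce and return the Exim log ID, if any."""
    m = REJECT_RE.search(bounce_text)
    return m.group("id") if m else None
```

Because the distinctive 'CSLab PMX' wording is what gets matched, a bounce caused by some other system's rejection simply returns None instead of a misleading ID.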
Having done this exercise for one SMTP rejection message, I'm sort of tempted to do it for others. If I start from the premise that someday a user will turn up saying 'someone trying to mail me got this message', what do I want to see in the message so we can explain the situation to people?
(The good news is that I took a quick look and almost all of our other SMTP rejection messages seem to include the crucial information. For example, our 'rejected because the sending IP is in Spamhaus' SMTP rejection message actually includes the IP address, so we don't have to try to correlate logs with whatever vague information we have about the rough time the message was sent to the particular user in order to find it.)
By the way, one consideration here is that you don't necessarily want these messages to be too long, because some SMTP senders will truncate your rejection message when they report it to users (or at least they used to). I believe I've seen ones that only report the first line, for example. This is why our current rejection message is going to be relatively cryptic to anyone but us; I cautiously squeezed it down to something that I felt had a relatively high chance of making it back to us intact.
I don't get many bounce messages these days, so it's possible that modern mail systems no longer suffer from this issue. Certainly mail providers like Google and Yahoo generate quite long and verbose multi-line SMTP rejections and temporary failures. Perhaps I should add a second line with a clear, normal person focused explanation for anyone who trips over this as a legitimate false positive.
Some DNSBL developments I've just heard about
I mentioned recently that choosing DNS blocklists isn't necessarily a one-time thing that you set and forget. I always knew this in a vague and general way, but I had mostly ignored it until recently. More specifically, until I was writing that entry and wound up looking at the CBL front page, which had a March 24th announcement of news about the PSKY DNS blocklist. To wit, that PSKY had apparently been 'borrowing' Spamhaus data without authorization, that this has been stopped, and that it wasn't clear if they listed anything much any more. We've never deployed PSKY on our main mail server, but I had deployed it on my personal sinkhole spamtrap and it had been having a pretty good hit ratio. 'Had' being the operative word, because starting around the appropriate time I'd not really logged any hits against it.
All of this sent me reading through the rest of the 'Other DNSBLs' portion of the CBL's FAQ. Some of their current opinions match mine (such as Barracuda's public DNSBL being quite aggressive), but others were a surprise to me. Most prominently, the CBL people feel that the current Spamcop BL is now sufficiently safe to use as a general DNS blocklist, where my past experience with it (from several years ago) was that it was too hair-trigger. The rest of the FAQ is interesting in its own way, mostly in that it seems to confirm that there aren't really very many effective DNSBLs any more. Or at least not very many that the CBL feels that they need to talk about.
All we use in our spam filtering is Spamhaus, and I don't think there's much chance that we'll change that. The Spamhaus ZEN is as close as we can get to a high trust, fire and forget DNS blocklist, and even then our users have to opt in to it. But it doesn't hurt to keep an eye on the DNS blocklist landscape every so often (even if there seems to be less landscape than there used to be).
(That diminishing landscape is one reason I'm saddened by the news about PSKY's blocklist. When I first heard of them, they were the first new and effective DNSBL for some time, and frankly we can always do with more good spam-blocking.)
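
As a reminder of how little machinery a DNS blocklist check involves, here is a sketch of how the query name is formed for an IPv4 address: the octets are reversed and the blocklist's zone is appended. The function name is mine, and the sketch stops short of doing the actual DNS lookup.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Return the DNS name to look up to check an IPv4 address against a DNSBL."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"
```

If looking up an A record for the resulting name returns an answer (conventionally in 127.0.0.0/8), the IP is listed; if the name doesn't exist, it isn't. Trying a different blocklist is just a matter of changing `zone`, which is part of why it's tempting to set this once and then forget about it.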