Some thoughts on new top-level domains being used for spam
Over on Twitter, I had a little exchange:
@thatcks: Another day, another new vanity TLD that I'm never accepting email from (because of spam, of course; the dominant use of vanity TLDs in email senders is for spam).
@MrDOS: This is a self-fulfilling prophecy, though: by denying legitimate mail from these TLDs, you're guaranteeing that no one will ever be able use these TLDs for legitimate mail.
@thatcks: When the spammers get there first, the well is poisoned. Un-poisoning the well is not my (or anyone's) problem; we just not want to be fed poisoned water.
On the one hand, I think that my reaction and final tweet are not wrong. Potential receivers of email are under no obligation to help senders get it delivered, and if something only or mostly sends you spam, well, you can sensibly block it and many people will. As a result, spammers can and do poison certain things, including new top level domains (mostly generic TLDs, but sometimes country ones as well).
(Although I can't find a link to it, I believe I once saw a summary of a study on how many new gTLD domains were canceled or removed almost immediately after creation. For many active gTLDs, a surprisingly large number of new domains went away very rapidly. The study didn't conclusively say they were used primarily for spam and other bad purposes, but that was the obvious speculation.)
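To make the mechanics concrete, here is a minimal sketch of what a per-TLD sender block amounts to. It is an illustration, not the configuration actually in use here; the function names are hypothetical, and the blocklist uses the reserved TLDs 'example' and 'invalid' as placeholders for whatever TLDs your own logs say are (almost) all spam.

```python
from email.utils import parseaddr

# Placeholder blocklist: "example" and "invalid" are reserved TLDs standing in
# for whatever TLDs your own logs say are (almost) all spam.
BLOCKED_TLDS = frozenset({"example", "invalid"})


def sender_tld(address: str) -> str:
    """Return the lowercased TLD of an email address, or '' if there is none."""
    _, addr = parseaddr(address)
    if "@" not in addr:
        return ""
    domain = addr.rpartition("@")[2].rstrip(".")
    return domain.rpartition(".")[2].lower()


def should_reject(address: str, blocked=BLOCKED_TLDS) -> bool:
    """True if the sender's TLD is on the blocklist."""
    return sender_tld(address) in blocked
```

The check deliberately looks at nothing but the last label of the sender's domain, which is exactly why it is both cheap and blunt.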
On the other hand, this feels uncomfortably close to pushing email further toward a closed system in practice, where only large existing senders of email can get their email accepted and other people are frozen out. Setting up a broad-based block of any sort (whether a gTLD or a large network (IP) area) makes it incrementally harder for people to send email from new, not well established hosts, and anecdotally that's already hard.
On the third hand, my personal email box is a much different thing than a large mail provider. Decisions made by Google, Microsoft, and so on about who they will accept email from (and what they will require from that email) have far bigger effects than my decisions do. It also feels like the central decisions of Google and so on are fundamentally different (and more dangerous) than the aggregated distributed decisions of a large number of people, even if they come to roughly the same end result.
I don't have any firm answers, especially universal ones, but I'm not likely to change my own personal blocks. Sorry, gTLDs and people using them, but not really. In the end I care more about my mailbox than anything else, because I've just become too tired of the state of modern email.
(I have mixed views on new TLDs in general, but that's somewhat separate from their use in email.)
Errors during SMTP conversations aren't trustworthy, illustrated
Recently we had a mail problem where we could not deliver email to a particular remote destination for a while. A major Australian ISP spent six days telling us:
421 4.7.25 Temporarily rejected. Reverse DNS for <our-IP> failed. IB108
(Based on Exim log messages, this happened during the initial SMTP connection, before we even EHLO'd.)
Then later the ISP was fine again, sadly after the person trying to send mail had their attempts time out and contacted us to see if we could do anything about it. The ISP was fine before this incident, and they've been fine ever since, and no other destination reported anything like this message to us.
We did not have malfunctioning nameservers or missing reverse DNS for six days. We did not, as far as we can tell, have DNS servers that the outside world had problems reaching for six days. I suppose it's possible that this large ISP had some internal problem that prevented their DNS servers from talking to our DNS servers for six days, but not so big that they noticed it and dealt with it right away. Alternately, perhaps this ISP was not being honest with us about why they decided not to accept connections from our outgoing email server. We can't tell.
(During the six day problem period, our user was able to reach their recipient on this ISP from some other places, both of which are big email heavyweights, so it was not an issue with the recipient or with the ISP's mail system in general.)
It's not really news or a new thing that the messages you get from other people's mail servers are not necessarily telling you the real reason that your messages aren't being accepted. Many of the major mail providers seem to do it; it's been a long time since I really believed GMail's SMTP time messages, for example. We have many cases where GMail will give temporary 4xx SMTP error codes for an email for a while with various claims in the SMTP error messages, then wind up accepting it. In other cases the 'temporary' 4xx error codes stick for as long as we want to keep retrying and we eventually time out the message.
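If the text of a reply can't be trusted, about the only parts worth acting on mechanically are the numeric reply code and, with more caution, the enhanced status code. The following is a minimal sketch of pulling those apart from a single reply line such as the one quoted above. The names are hypothetical and it handles only single-line replies, so treat it as an illustration rather than a full SMTP reply parser.

```python
import re
from typing import NamedTuple, Optional


class SmtpReply(NamedTuple):
    code: int                 # three-digit SMTP reply code, e.g. 421
    enhanced: Optional[str]   # enhanced status code such as "4.7.25", if present
    text: str                 # free-form explanation; do not rely on it

    @property
    def temporary(self) -> bool:
        return 400 <= self.code <= 499

    @property
    def permanent(self) -> bool:
        return 500 <= self.code <= 599


_REPLY_RE = re.compile(
    r"^(\d{3})(?:[ -](?:([245]\.\d{1,3}\.\d{1,3})(?:\s+|$))?(.*))?$"
)


def parse_reply(line: str) -> SmtpReply:
    """Split one SMTP reply line into code, enhanced status code, and text."""
    m = _REPLY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an SMTP reply line: {line!r}")
    return SmtpReply(int(m.group(1)), m.group(2), m.group(3) or "")
```

Even the temporary versus permanent distinction is only what the remote end claims, as the 4xx replies that never clear up demonstrate.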
(My personal lesson learned from this incident was that I should pay more attention to our queued email, then look into things that seemed odd. At the very least I might have been able to reproduce this outside of Exim, and test it from other IPs on the same subnet and elsewhere within the university.)
There are limitations to what expendable addresses can help with
I'm a long time advocate of using expendable addresses for as many things as possible (and then making sure you can turn them off). However, yesterday's incident of junk email as a cover for worse also shows some of the limitations of using expendable addresses, because they wouldn't really have avoided this situation.
The first way they wouldn't have avoided the situation (of having a flood of junk email sent to someone to distract them) is that generally expendable addresses in all of their forms still funnel into your actual mailbox. Some people sort some expendable addresses into low-priority places, but you're unlikely to do this with the email address you use for things like notifications from your financial institutions. You usually want to see those right away, not have them hidden away.
The second way they wouldn't have avoided the situation is that if someone wants to unleash a flood of email onto you to distract you, it doesn't necessarily matter what exact email address they get their hands on. All they need is some email address that goes into some mailbox that you look at regularly. It would be better to get the actual email address you use with your financial institution, but for drowning a bit of signal in a lot of noise, often many email addresses will do about as well. It doesn't even have to go to the right mailbox, just one that will cause you to drown in the volume.
(Certainly this would be the case for me. I would have an easier time of sorting things later and perhaps not missing signal amidst noise with my extensive collection of expendable addresses, but in the heat of the moment, if you clog up my inbox it doesn't really matter how.)
The one part of this sort of flood that expendable addresses will help with is the longer term aftermath. One of the iron rules of email addresses is that once some people have their hands on some email address, they will never stop emailing it. After a flood, obviously a lot of people have some email address of yours and a certain percentage of them will keep emailing that address forever. If the address they have is an expendable address that you can turn off, you can at least make them go away.
Junk email as a cover for more nefarious things
This morning, we got a call (through a Point of Contact) that one of the people here was being absolutely flooded by incoming spam and junk email. It was a real flood, too; in total they received over 1,200 email messages that made it past our anti-spam defenses, most of them over about an hour and a half (I'll let you do the math on the messages per minute rate, and then think about trying to do anything about it in a mail client). This person wound up having to basically turn off receiving external email.
Unfortunately, this wasn't the only thing going on in that person's life this morning, because they also discovered an unauthorized financial transaction (I don't know if they found it before or after the flood started, but I suspect before). The obvious theory is that this sudden, exceptional flood of junk email is not at all a coincidence, and was instead intended to cover up a transaction notification from the financial institution involved. To abuse a phrase, if you can't stop a tree from falling, perhaps you can obscure it by clear-cutting the entire forest around it.
We rejected some of the incoming email at SMTP DATA time, which causes Exim to log some message headers. Based on these rejections and also various of the sending addresses, some of the incoming email appears to have been 'congratulations on signing up for our mailing list', 'thank you for contacting us', and so on email that could be deliberately induced by a third party who wanted to flood someone's mailbox. Other messages seem to have been genuine spam, or very likely genuine spam.
(I am sure you will be shocked to hear that Sendgrid features high up in the list of sending sources, and also the list of sources blocked because of SBL listings.)
One of the unnerving things about this incident is that the attacker clearly was highly prepared. They had at least a thousand potential sources of junk and spam email identified and lined up, ready to trigger. And it's pretty clear that the triggering was automated. Since the sources of the junk email come from all over, it seems likely that the attacker wasn't exploiting a single piece of (web) software to stuff in addresses. They probably had an entire suite of attacks against various different 'contact us' and 'subscribe me' and so on forms ready to go.
(I have no theories for how the attacker got spammers to start emailing this address so fast. Maybe there is a market for 'hot email addresses, mail them now while they last' where the purchased addresses get used basically immediately.)
Real email has MIME attachments that are HTML
One of the things that MIME parts in email have (or can have) is a content disposition, which theoretically tells your mail client whether the MIME part should be displayed as part of the message (a content disposition of inline) or it should be not displayed by the client and you'd be offered the option to save it, view it with something, and so on (a content disposition of attachment).
(HTTP reuses this idea in the Content-Disposition header, which tells the browser if it should try to display the response or jump straight to forcing you to download it or hand it to some external program.)
In most email, HTML MIME parts have an inline content disposition, because this is how the sender (or their mail software) arranges for them to be visible to the receiver. This is true both for a message that is HTML only or for a 'multipart/alternative' message with (theoretically) equivalent plain text and HTML versions.
For a long time, I've known that our commercial anti-spam filter was counting some varieties of phish spam as 'viruses'. When we first started logging MIME part type information, I discovered that a lot of these rejections were for HTML MIME parts that had an 'attachment' content disposition. This led me to assume that essentially all legitimate real mail with HTML MIME parts had them with an inline content disposition, and only suspicious and probably bad email had 'attachment' HTML MIME parts.
Recently I had reasons to specifically look at our MIME part type logs for email that we can be reasonably confident is good, and I got a surprise. We definitely see legitimate email with HTML MIME parts that have a content disposition of 'attachment'. Apparently this is even the standard and normal behavior of some email clients in some situations, especially when forwarding email.
Beyond the specific fixing of my ignorance and assumption here, in general this has been a useful reminder to me that I don't actually know as much about modern email as I usually think I do. Before I confidently assume something like 'HTML MIME parts that are attachments are suspicious', I should at least go check our logs to see what they say. After all, that's the largest reason we collect this information; we realized that we didn't actually know what sorts of MIME parts our users received and we should.
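For anyone who wants to check their own mail rather than assume, here is a minimal sketch using Python's standard email package to report the content disposition of every HTML part in a message. It is not the logging we actually do; the function names and the sample message are made up for illustration.

```python
from email.message import EmailMessage


def html_part_dispositions(msg):
    """Return the content disposition of every text/html part, in order."""
    found = []
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # get_content_disposition() returns None when the part has no
            # Content-Disposition header at all.
            found.append(part.get_content_disposition() or "unspecified")
    return found


def build_sample() -> EmailMessage:
    """A made-up message: text + HTML alternatives, plus an attached HTML file."""
    msg = EmailMessage()
    msg["Subject"] = "Fwd: sample"
    msg.set_content("plain text body")
    msg.add_alternative("<p>HTML body</p>", subtype="html")
    msg.add_attachment("<p>forwarded page</p>", subtype="html",
                       filename="forwarded.html")
    return msg
```

Note that a part can have no Content-Disposition header at all, which mail clients generally treat much like inline, so a real survey probably wants to count three cases and not two.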
In modern email, it's easy for plaintext and HTML parts to drift apart
I recently read When The Text And Html Disagree, which is about an instance where an email message had an important disagreement between the plaintext part and the HTML part. In this case it was fortunately obvious that something was wrong, but I'm sure there have been less obvious instances.
I believe that one reason this drift happens comes down to that old aphorism that if you don't test it, it's broken. For email with alternate parts, the revised aphorism can be said as “if you don't see it, it's broken”. Modern email clients normally show you the HTML part to start with, and then most make a generally rational decision to make it at least hard to see the plaintext one. So when people look at test versions (or real versions) of such email messages, only the HTML part has to look good in order for the whole thing to seem fine. The unseen text part can quietly rot away, noticed only by unusual people like me who look at the plaintext version.
(You would think that mass email authoring environments would raise an alert if you only edit the HTML portion of a standing mixed-part email, but apparently not.)
I've seen this sort of thing for spam, but When The Text And Html Disagree makes a nice illustration that it's not just spam that suffers from the issue. In the end we probably shouldn't be too surprised about any of this, because keeping multiple things in synchronization is pretty much a hard problem all over. If you want it to work reliably you need to automate it, and automating this sort of update isn't easy.
(Keeping things in sync by hand is extra work, and sooner or later extra work doesn't get done or doesn't get done right. People forget, people make mistakes, people will get to it tomorrow because there's an urgent thing right now, and so on and so forth.)
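Fully automating the synchronization is hard, but automating a crude 'have these drifted apart?' alarm is easier. Here is a minimal sketch that strips the tags from the HTML part and compares its words to the plaintext part. The function names and the 0.9 threshold are arbitrary assumptions, and the tag stripping is naive (it would also pick up the contents of style and script elements), so it can only flag candidates for a human to look at.

```python
import difflib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_words(html: str) -> list:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks).lower().split()


def similarity(plain: str, html: str) -> float:
    """Word-level similarity between the two parts, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(
        None, plain.lower().split(), html_to_words(html)
    ).ratio()


def has_drifted(plain: str, html: str, threshold: float = 0.9) -> bool:
    """True if the parts look different enough to deserve a human look."""
    return similarity(plain, html) < threshold
```

A check like this would most plausibly run in whatever tool generates the email, before sending, since that is the one place where someone can still fix the plaintext part.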
PS: Given this, the most likely answer to the question in When The Text And Html Disagree is that if there's a disagreement and it's not clear, the HTML part is right and the plaintext one is wrong. It could be that you have a rare email where someone has updated the plaintext part but not the HTML part, but the odds are very good that it's the other way around. The exception to this is if you're in a very unusual environment where most people see the plaintext part instead of the HTML part.
Mailing lists and bounce handling (or not handling bounces) today
Traditionally, if you ran a proper mailing list you were supposed to remove addresses if they started bouncing (or being rejected). The need to automate this was the major reason behind email tricks like Variable envelope return path (VERP), where email to every separate address is sent with a unique envelope sender (and so has to be sent in a separate transaction, instead of letting you batch together multiple 'RCPT TO' addresses at a single destination). There's always been a little bit of an open question about how rapidly you should remove failing addresses, because things happen, but the traditional view (one I've echoed myself) is that you shouldn't let bouncing addresses linger for long.
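As an illustration of the VERP idea, here is a minimal sketch of encoding the recipient into the envelope sender and recovering it from the address a bounce comes back to. The 'local+user=domain@listhost' layout is one common convention, not the only one; the function names and example addresses are made up, and the sketch assumes the list's own bounce local part contains no '+'.

```python
from typing import Optional


def verp_sender(bounce_local: str, list_domain: str, recipient: str) -> str:
    """Build a per-recipient envelope sender for one list member."""
    user, _, domain = recipient.rpartition("@")
    return f"{bounce_local}+{user}={domain}@{list_domain}"


def verp_recipient(envelope_sender: str) -> Optional[str]:
    """Recover the list member's address from a VERP envelope sender."""
    local_part = envelope_sender.rpartition("@")[0]
    _, plus, encoded = local_part.partition("+")
    if not plus:
        return None
    # Domains cannot contain '=', so the last '=' separates user and domain.
    user, eq, domain = encoded.rpartition("=")
    if not eq or not user or not domain:
        return None
    return f"{user}@{domain}"
```

Because every recipient gets a different envelope sender, any bounce identifies the failing address by where it was sent, with no need to parse the bounce message itself.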
Then we started having incidents like GMail's major failure yesterday:
It appears that a whole bunch of people are about to discover the downside of various GMail failure modes, and also a bunch of other people are going to stop treating SMTP 5xx bounces as reasons to remove people from mailing lists and so on.
(Since ~4pm Eastern we're seeing GMail '550 no such account' on well established GMail addresses for some but not all deliveries.)
If GMail is going to randomly reject well established email addresses for a third of a day or so every so often, why should you be in any rush to remove failing GMail addresses from your mailing lists? This is even (and especially) the case if the bounces claim that the address doesn't exist; as we've just seen, this rejection reason itself is unreliable from one of the biggest email providers on the planet.
Today, the only thing an email bounce means is 'I'm not accepting this particular piece of email from you right now'. Probably the receiving email system will continue to reject that particular email if it's resubmitted in the future, but not always. There is certainly no strong reason to believe that different email in the future won't be accepted.
If bounces are mysterious, random, and not necessarily predictive of future results, it's pretty reasonable to not use them for anything. It would be nice if people attempted to draw inferences from patterns in bounces (such as 'this email address has never accepted any of our email'), but that takes much more work, tracking of information, and tuning than pretty much ignoring bounces.
(Of course if places like GMail start scoring your email badly if you keep repeatedly sending email that they bounce, then people will go to the work. For GMail. Not necessarily for anyone else.)
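To give a sense of what even the simplest pattern tracking involves, here is a minimal in-memory sketch of the 'this address has never accepted any of our email' inference. The class name, the minimum attempt count, and the choice to lowercase whole addresses are all assumptions for illustration; a real version would need persistent storage and probably some notion of time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DeliveryHistory:
    accepted: int = 0
    rejected: int = 0


class BounceTracker:
    """Track per-address delivery outcomes instead of acting on single bounces."""

    def __init__(self, min_attempts: int = 5):
        # How many rejections we want to see before drawing any conclusion.
        self.min_attempts = min_attempts
        self._history = defaultdict(DeliveryHistory)

    def record(self, address: str, accepted: bool) -> None:
        history = self._history[address.lower()]
        if accepted:
            history.accepted += 1
        else:
            history.rejected += 1

    def never_accepted(self) -> list:
        """Addresses that were tried often enough and never took any email."""
        return sorted(
            addr for addr, h in self._history.items()
            if h.accepted == 0 and h.rejected >= self.min_attempts
        )
```

An address that was accepted even once never shows up in the result here, which is the point: a burst of spurious '550 no such account' rejections on a well established address does not get it removed.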
I'm not really fond of this result. I would like bounces to be a reliable sign that addresses should be removed from lists, and I would like mailing lists (and databases of email addresses) to remove addresses that bounce. But I can't say that bounces are such a clear reliable sign any more.
The death and life of postmaster@anywhere
A decade ago I wrote The quiet death of postmaster@anywhere, where I proclaimed that the postmaster@ address was dead, killed by steadily increasing spam volume (and decreasing actual use). I was recently reminded of that entry and, a decade later, I have to sort of take back what I wrote. It's true that postmaster addresses are increasingly dead, but when they are it's not for the reasons that I thought back then.
Perhaps ten years ago our postmaster account got a bunch of spam and bounces (I can no longer remember, but what I wrote in my old entry suggests that it may have at the time). These days it doesn't; we get basically no bounces that I can see, and very little spam (and almost all of the spam is recognized and rejected immediately). It's possible that this experience is far from universal, but I can think of some reasons why it might be. And in more anecdotal data, my spam-trap SMTP server has seen only one email to its postmaster address from 2017 onward.
(One potential reason for less spam to postmaster@ addresses is an increased difficulty in sending spam and along with it, an increased professionalization of spam sending. Spamming postmaster addresses is not very productive, so if spamming resources are limited and cost spammers things I'd expect to see less of it.)
But that doesn't mean that sending email to postmaster@ addresses will do you any good these days, or even work. What I've observed from sending the occasional spam complaint is that more and more places seem to be no longer accepting email to postmaster@ and sometimes even abuse@ addresses. I doubt this is driven by spam volume; instead I rather think it's a deliberate policy choice by places, especially by large email providers. Even when email to postmaster@ 'works' for large places, it's probably not reaching the technical people who deal with the mail system, or even necessarily the abuse handling people. It's probably much more likely to drop into some sort of general support system, with the attendant low odds of getting prompt or effective attention.
Of course in one sense this is nothing new. It's been years since sending email to any well known or public address got you useful results with large places (for either spam complaints or technical problem reports). The only real difference these days is you might get an actual SMTP rejection that says your email to postmaster@ or abuse@ is not accepted, instead of your email just quietly disappearing.
But for many smaller and more modest places, I suspect that postmaster@ is still alive. It certainly is here, and even for the overall university email system.
If you send automated email, you should scan it with anti-spam software
Partly in light of Microsoft SharePoint's problem with spam, here is an obvious thing:
If you send automated email to the outside world, you should always run it through some anti-spam system and raise big alarms if they ever trigger.
(If I was doing this today I would use ClamAV and rspamd because they're free and what I'm used to, but anything will do.)
You should do this even if you aren't allowing external people to initiate the email and put content in it (which you shouldn't, because allowing that leads to spam).
You might say that your automated system can never be exploited to send spam. That may or may not be true, but even if your automated email genuinely has no spam, having ClamAV, rspamd, or whatever dislike it is a very bad sign that you should pay attention to, because it likely means that a lot of people will not be receiving the email. And beyond that, checking your automated email is an important and generally easily done insurance policy.
The gold standard for doing this check is to have an external email server in a separate domain with an address (or several addresses) that you send the automated email to on a regular basis, that runs these anti-spam tools. That gives you the closest you can get to how anti-spam tools on other people's systems will perceive your email, complete with any effects from your sending IP and so on. Running scanners on your outgoing email before it leaves your system doesn't quite capture everything, but it will generally cover scanning the content for bad things and it lets you react faster and earlier (for example, by completely stopping the automated email if it triggers anti-spam systems). It may also be easier to implement.
(There are multiple reasons to not put visible spam results on outgoing email in general, and there can be political ones to not block email written by your local people if it triggers your outgoing anti-spam checks. But I feel that automated email is different; it's much less risky to block it, presuming that you immediately alarm on this and get people's attention.)
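The 'scan before it leaves, stop and alarm if anything triggers' version can be sketched as a small gate in front of whatever actually sends the email. Everything here is hypothetical: the scanners are plain callables that you would write as wrappers around ClamAV, rspamd, or whatever you use (how to call those is deliberately not shown), and the toy scanner exists only so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ScanVerdict:
    scanner: str        # which scanner produced this verdict
    flagged: bool       # True if the scanner disliked the message
    detail: str = ""    # whatever explanation the scanner gave


Scanner = Callable[[bytes], ScanVerdict]


def check_outgoing(message: bytes, scanners: Iterable[Scanner]) -> List[ScanVerdict]:
    """Run every scanner and return only the verdicts that flagged the message."""
    return [v for v in (scan(message) for scan in scanners) if v.flagged]


def send_if_clean(message: bytes, scanners: Iterable[Scanner],
                  send: Callable[[bytes], None],
                  alarm: Callable[[List[ScanVerdict]], None]) -> bool:
    """Send the message only if no scanner flags it; otherwise raise the alarm."""
    problems = check_outgoing(message, scanners)
    if problems:
        alarm(problems)
        return False
    send(message)
    return True


def toy_keyword_scanner(message: bytes) -> ScanVerdict:
    """Stand-in scanner for demonstration only; it flags one made-up phrase."""
    hit = b"buy now" in message.lower()
    return ScanVerdict("toy-keyword", hit, "matched 'buy now'" if hit else "")
```

The important design choice is that a flagged message is both held and alarmed on; silently dropping it would hide exactly the problem you are trying to find out about.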
Microsoft SharePoint is being used to send spam
I'm paying more attention to what our mail system detects as spam and where it's coming from than usual, so I'm getting to notice things (or, in the alternate phrasing, being forced to notice things). Today's thing that I noticed is that to no one's surprise, Microsoft SharePoint is currently being used as a spam sending vector. I say 'to no one's surprise' because it's a long standing rule that anything that can be used to send email to random people with any user supplied content will be exploited for spam.
The email we see is genuine SharePoint email sent from Microsoft, DKIM signed by both sharepointonline.com and 'spoapaceop.onmicrosoft.com', with the envelope address of firstname.lastname@example.org and sent to us by outlook.com machines. Typical headers look like:
From: Katholina Keth <email@example.com>
Subject: Katholina Keth shared "❤Unsatisfied women Need a guy ❤" with you.
The header samples I've seen have a long list of To: addresses at all sorts of places (not just our university subdomain or even the university as a whole). Some messages have a Reply-To: pointing to various addresses at a legitimate domain, which may be a signal that the spammer has compromised a part of that organization so that they can either hijack accounts there or register their own, then use them to register in SharePoint.
(I only have access to some message headers, so I can't tell what is in the body of the email. Hopefully Microsoft doesn't allow SharePoint emails to include substantial amounts of user-supplied content, so all people get is a link to where the spam is.)
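If all you have is logged headers, one of the few things you can pull out mechanically is which domains claim to have DKIM signed a message. Here is a minimal sketch that extracts the d= tag from each DKIM-Signature header. It does not verify any signature, and the function name and sample header values in the usage note are made up for illustration.

```python
from email import message_from_string


def dkim_signing_domains(raw_headers: str) -> list:
    """Return the d= domain of each DKIM-Signature header, in header order."""
    msg = message_from_string(raw_headers)
    domains = []
    for value in msg.get_all("DKIM-Signature", []):
        for tag in value.split(";"):
            name, eq, val = tag.partition("=")
            if eq and name.strip().lower() == "d":
                # Drop any folding whitespace inside the value.
                domains.append("".join(val.split()).lower())
    return domains
```

Given headers with one signature using d=sharepointonline.com and another using d=tenant.example, this returns both domains in order. Knowing who claims to have signed a message says nothing about whether the content is wanted, which is rather the problem here.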
At one level all of this is unsurprising. As a product feature, it's attractive to let SharePoint users share their files and other SharePoint materials with people who haven't already signed up with SharePoint, and when you do that of course SharePoint has to tell the target something about what is being shared. The title is an obvious thing to include, and you have to let users change the title of their documents. But now Microsoft has given spammers the ability to send some amount of relatively arbitrary text to relatively arbitrary email addresses.
(I would like to say 'and now Microsoft has a problem', but of course they don't. Very few people are in a position where they can block SharePoint email over this.)