The types of TLS seen on our external MX (as of April 2023)
On the Fediverse, I said:
Today's sysadmin tip: if you don't want to be depressed, don't look at how many other mail servers are still connecting to your external mail gateway with TLS 1.0, and especially not exactly who they are.
Today I feel like providing some statistics on that, partly for my own interest. All of these are over the past full nine days, which means that they mostly cover the end of April 2023 (plus May 1st).
Over this time we accepted 94,037 messages, of which 62,885 were encrypted with some version of TLS. The TLS versions used break down like this:
36426 X=TLS1.2
26209 X=TLS1.3
  229 X=TLS1.0
   21 X=TLS1.1
After my Fediverse post, I'm actually surprised to see such a low usage of TLS 1.0 and 1.1. I'm pleased to see that TLS 1.3 is so close to TLS 1.2.
(I think what I was seeing in my Fediverse post was that outside mailers were making a handful of connections a day with TLS 1.0 and TLS 1.1. At the time the TLS 1.0 connections stood out more.)
I don't particularly know why TLS 1.1 is so uncommon compared to TLS 1.0. It may be that TLS 1.1 was only the latest version of TLS for a few years (based on Wikipedia's dates). There was probably a relatively narrow window of time for people to have developed and shipped TLS 1.1 products (and then never updated them to TLS 1.2).
Ubuntu 22.04's version of Exim conveniently formats the full cipher name in a way that makes it easy to get a top level view of the broad key exchange methods in use:
25774 X=TLS1.3:ECDHE_X25519
19678 X=TLS1.2:ECDHE_SECP256R1
11159 X=TLS1.2:ECDHE_SECP384R1
 2916 X=TLS1.2:ECDHE_SECP521R1
 2599 X=TLS1.2:ECDHE_X25519
  435 X=TLS1.3:ECDHE_SECP256R1
  203 X=TLS1.0:ECDHE_SECP256R1
   74 X=TLS1.2:RSA
   26 X=TLS1.0:RSA
   16 X=TLS1.1:ECDHE_SECP521R1
    5 X=TLS1.1:RSA
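For illustration, tallies like the two above can be produced by pulling the X= field out of Exim's message reception log lines. This is a minimal sketch and not the script used for these numbers; it assumes main log lines where reception is marked with '<=' and the cipher is logged in the 'X=TLSx.y:KEYEXCHANGE__...:bits' form seen here. The function name is made up.

```python
import re
from collections import Counter

# Matches the X= field Exim logs for a TLS session, for example
# X=TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_128_GCM:128
X_FIELD = re.compile(r"\sX=(\S+)")

def tally_tls(lines):
    """Count TLS versions and 'version:key exchange' prefixes on '<=' lines."""
    versions = Counter()
    kex = Counter()
    for line in lines:
        if " <= " not in line:
            continue  # only look at message reception lines
        m = X_FIELD.search(line)
        if not m:
            continue  # message was received without TLS
        cipher = m.group(1)
        # The TLS version is everything before the first ':'.
        versions[cipher.split(":", 1)[0]] += 1
        # Everything before the first '__' is 'TLSx.y:KEYEXCHANGE'.
        kex[cipher.split("__", 1)[0]] += 1
    return versions, kex
```

Feeding it the log lines for a time range and printing `versions.most_common()` and `kex.most_common()` gives output in the same shape as the tables above.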
Overall, there were 34 different full cipher suites used, and so I'll give a little breakdown by TLS protocols (partial for TLS 1.2):
13796 X=TLS1.3: ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_128_GCM: 128
11960 X=TLS1.3: ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM: 256
  424 X=TLS1.3: ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM: 256
   18 X=TLS1.3: ECDHE_X25519__RSA_PSS_RSAE_SHA512__AES_256_GCM: 256
   11 X=TLS1.3: ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_128_GCM: 128

13377 X=TLS1.2: ECDHE_SECP256R1__RSA_SHA512__AES_256_GCM: 256
11089 X=TLS1.2: ECDHE_SECP384R1__RSA_SHA256__AES_256_GCM: 256
 3719 X=TLS1.2: ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_128_CBC__SHA1: 128
 2880 X=TLS1.2: ECDHE_SECP521R1__RSA_SHA512__AES_256_GCM: 256
 2037 X=TLS1.2: ECDHE_SECP256R1__RSA_SHA256__AES_128_GCM: 128
 1820 X=TLS1.2: ECDHE_X25519__RSA_SHA512__AES_256_GCM: 256
  497 X=TLS1.2: ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_128_GCM: 128
  433 X=TLS1.2: ECDHE_SECP256R1__RSA_SHA512__AES_128_GCM: 128
[...]

   16 X=TLS1.1: ECDHE_SECP521R1__RSA_SHA1__AES_256_CBC__SHA1: 256
    5 X=TLS1.1: RSA__AES_256_CBC__SHA1: 256

  203 X=TLS1.0: ECDHE_SECP256R1__RSA_SHA1__AES_256_CBC__SHA1: 256
   26 X=TLS1.0: RSA__AES_256_CBC__SHA1: 256
(I've added spaces after the :s for better line wrapping.)
As we can see here, TLS 1.2 contributed the largest diversity; it has 25 different full cipher strings. I believe this reflects a wide diversity of opinions in the sending MTAs, because the Exim documentation says that the client (here, the sending MTA) picks the preferred cipher if you're using GnuTLS, as the Ubuntu Exim is.
Sidebar: the TLS 1.2 RSA ciphers
44 X=TLS1.2: RSA__AES_256_CBC__SHA1: 256
18 X=TLS1.2: RSA__AES_256_GCM: 256
12 X=TLS1.2: RSA__AES_128_CBC__SHA1: 128
I don't know how horrified I should be here.
The chain of landing web pages that I saw for a phish spam today
Over on the Fediverse, I shared a phish-related discovery:
Today's discovery: people hosting phish landing forms in IPFS and using Cloudflare's IPFS gateway to do the work of web access to them. Nicely played. Everyone is going to point fingers at everyone else.
(As usual the email has a different URL, with a 'this is our secure document link' that takes you to the IPFS hosted form.)
Let's be a little bit more specific, because it's a useful example of just how complicated these things can be.
The email was a 'X has shared a file with you' email with a link to a page on what claimed to be a travel company's Sharepoint aka OneNote site (under '<...>-my.sharepoint.com', which you probably aren't going to be able to block). Based on the URL, this may have been a page created by and for a particular user, instead of a corporate page, meaning that just this user had their Sharepoint access compromised. This page said:
<company> transmitted a secured RFQ itinerary Doc
To view the doc, click the link below
(The spelling here is authentic.)
That link took you to a cloudflare-ipfs.com URL, which displayed an official looking Adobe 'Verify Your Identity' thing asking you to sign in:
You've received a secure file
[PDF icon] 58.3 Kb
To receive and download this PDF file , please enter specific professional email credentials that this document was sent to.
(On the one hand, this is your web browser clearly asking you to authenticate to see a PDF. On the other hand, Adobe has clouded up all its PDF programs, so sure, why not assume that Adobe runs a secure PDF sharing thing for people as part of that.)
(This uses the same idea of 'you must authenticate to view this PDF' but with more steps than the other recent-ish time I've seen this trick. That this bounces you through a website may make it more plausible to people, since having to authenticate to access things for you on the web is not uncommon. It may raise large red flags to technical people who stop to think about the mechanics, but we're not really the target audience for this attack.)
The case of the very wrong email
Over on the Fediverse, I shared a discovery from our mail logs:
In the 'that's not how you do it' category, spotted in our email reject logs today:
(We rejected it for having an absurdly long line, over 200,000 bytes, which appears to have been almost all of the message.)
The MIME Content-Transfer-Encoding header is supposed to tell you the encoding of the MIME part in question, including the implicit top level part of the email. Typical values are things like '7bit', '8bit', 'quoted-printable', or 'base64'. Needless to say, this email's C-T-E is complete garbage, and a picky email client would say that it couldn't decode the message because it doesn't understand the 'amazonses.com' encoding.
(I suspect that real clients treat this as an unset C-T-E and either assume they have text or try to guess among the options.)
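As an illustration of what a picky client might check, here is a minimal sketch using Python's standard email package. The accepted values are the ones RFC 2045 defines ('7bit', '8bit', 'binary', 'quoted-printable', 'base64'); for simplicity it ignores 'x-' extension tokens. The function name is made up, and this isn't what any particular mail client actually does.

```python
from email import message_from_string

# Content-Transfer-Encoding values defined by RFC 2045 (case-insensitive).
KNOWN_CTE = {"7bit", "8bit", "binary", "quoted-printable", "base64"}

def cte_problem(raw_message):
    """Return the top-level C-T-E value if it isn't a known one, else None."""
    msg = message_from_string(raw_message)
    cte = msg.get("Content-Transfer-Encoding")
    if cte is None:
        return None  # header is unset; the default encoding is 7bit
    value = cte.strip().lower()
    return None if value in KNOWN_CTE else value
```

Given a message with the headers described here, `cte_problem()` hands back 'amazonses.com', which a strict reader could then refuse to decode.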
All of this email appears to be spam, of course. And the message has other anomalies besides the absurdly long lines in the body. They seem to have a consistent envelope sender domain (that I'm not going to mention for reasons), but the headers follow an unusual pattern. If the email is sent to 'USER@<our-domain>', all of the samples I've checked have the following header setup:
From: [...] <support@USER.net>
Sender:USER@<our-domain>
Message-ID: <...email@example.com..org>
Content-Type: text/html
Content-Transfer-Encoding: amazonses.com
That 'Sender:' header is malformed, and obviously you aren't supposed to have a constant Message-ID. Some of the addresses are clearly made up (and forged, for the sender address); many of the From: domains probably don't even exist. While the envelope sender domain stays constant, the local addresses do vary. The current sending IP has also been consistent throughout today.
(At the moment the MX for the envelope sender domain is outlook.com, and they reject a random claimed envelope sender address. Of course this spammer could be forging the domain of an innocent bystander, which is why I've decided not to mention it.)
The obvious speculation about where the gigantic line comes from is that the messages have extremely bloated HTML of some sort and it's all been crammed on to one line. I don't know what you do to get 200,000 to 340,000 characters of HTML in an email message; maybe they're including images as inlined 'data' URLs.
How to block people's automatic mail forwarding (to GMail, at least)
Suppose, hypothetically, that you're the kind of person who is certain that your email is so sensitive that it should never be automatically forwarded. If you send email to firstname.lastname@example.org and the person likes to forward their email to GMail, well, tough. Your email is too important; they can read it through example.org or not at all. Given the anarchy of Internet email, it sounds like this would be hard to achieve, but don't worry; modern email standards have your back here, at least for places (like GMail) that generally respect them.
Here's what you do. First, configure a strict DMARC policy for your domain, one that tells receivers that you want them to reject any email that doesn't pass DMARC. Then, set up a restrictive SPF policy, one that definitely only passes things sent from your server. Finally, the important step: don't sign your outgoing email with DKIM.
Since you have a strict DMARC policy, receivers like GMail will reject email with a 'From:' header with your domain that doesn't pass DMARC checks (this is DMARC alignment). Since you do have a (restrictive) SPF record, email sent directly from your email servers will pass SPF checks and so pass DMARC alignment. But since you don't DKIM sign messages, if GMail receives email from anywhere else with your domain in the From: header, the email will fail DMARC; it can't pass a DKIM check because there's no signature, and it can't pass a SPF check because it doesn't come from you.
Some automatic forwarding will change the envelope sender (the SMTP MAIL FROM) so that it will pass other people's SPF checks (this can be done with SRS or other mechanisms). But very little automatic mail forwarding changes the From: header address, partly because doing so makes it much harder for the person receiving it to do things with the email. And if the forwarding system adds its own DKIM signature, nothing really changes because the signature won't be for your domain and won't count for DMARC alignment.
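The logic above can be sketched in a few lines. This is a deliberately simplified model and not a DMARC implementation: it treats two domains as aligned if they are equal or one is a subdomain of the other (real relaxed alignment compares organizational domains), and it takes the SPF and DKIM results as given instead of checking them. All of the names and domains are illustrative.

```python
from dataclasses import dataclass, field

def aligned(domain, from_domain):
    """Simplified relaxed alignment: equal, or one is a subdomain of the other."""
    a, b = domain.lower(), from_domain.lower()
    return a == b or a.endswith("." + b) or b.endswith("." + a)

@dataclass
class Delivery:
    from_domain: str       # domain in the From: header
    envelope_domain: str   # domain of the SMTP MAIL FROM
    spf_pass: bool         # did SPF pass for envelope_domain?
    dkim_pass_domains: list = field(default_factory=list)  # d= of valid signatures

def dmarc_pass(d):
    """DMARC passes if there is an aligned SPF pass or an aligned DKIM pass."""
    spf_ok = d.spf_pass and aligned(d.envelope_domain, d.from_domain)
    dkim_ok = any(aligned(dom, d.from_domain) for dom in d.dkim_pass_domains)
    return spf_ok or dkim_ok

# Sent directly from the sender's own servers, with no DKIM signature.
direct = Delivery("sender.example", "sender.example", True)
# Forwarded with a rewritten envelope sender and the forwarder's own DKIM signature.
rewritten = Delivery("sender.example", "forwarder.example", True, ["forwarder.example"])
# Forwarded with the original envelope sender, so SPF fails at the receiver.
plain = Delivery("sender.example", "sender.example", False)

print(dmarc_pass(direct), dmarc_pass(rewritten), dmarc_pass(plain))
```

In this model only the direct delivery passes. If the sender had DKIM signed with its own domain, both forwarded cases would also pass as long as the forwarder left the signed parts of the message intact.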
I regret to inform you that there are mail systems out in the world who are actually doing this, although perhaps they aren't doing it deliberately. Maybe their DKIM signing has broken, or doesn't cover all of the email they send, or just never got implemented. These people even send mail to people at universities, I assume deliberately. Not all of that email gets through.
(People can of course still manually forward your messages, because manual forwarding generally creates a From: header with their email address, and now what matters is their DMARC policies, DKIM signatures, and SPF records, which their email probably passes if they want it delivered.)
PS: Possibly Google's SMTP rejection messages that I've seen for this have been incomplete, in that maybe Google wouldn't have been as insistent on DMARC alignment in other situations. I saw this with Message-ID headers.
(SMTP email long ago stopped being a fully predictable or understandable system, as systems took increasing measures to defend themselves against spam.)
Our current plague of revolving .top and .click spam email domains
Email spam is somewhat like the weather, and much like the weather I don't talk about it much any more. However, every so often something unusually unpleasant happens (in both of them). Our current irritation in spam weather is what I suspect is one particular spammer that operates using a rapidly changing flux of spam domains in .top, .click, and on some days .us, using a distinctive (but not really machine matchable) pattern of tagged envelope senders.
The typical envelope senders follow the pattern '<phrase>-<user>=<domain>@<random>.(top click us)'. This sort of envelope sender is quite human recognizable and is clearly tagged, but it's not all that easily matched without false positives. The tagged envelope sender is less useful than it looks, because none of these domains actually accept email.
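To make the matching problem concrete, here is one hypothetical way to test for the pattern described above, given both the envelope sender and the recipient it was sent to. It's a sketch under my own reading of the pattern and not something we run. It also shows the false positive problem: any legitimate sender under one of these TLDs that tags its envelope senders with the recipient's address in the same way would match too.

```python
SUSPECT_TLDS = {"top", "click", "us"}

def looks_tagged(envelope_sender, recipient):
    """True if the sender is '<phrase>-<user>=<domain>@<anything>.<suspect TLD>',
    where <user>@<domain> is the recipient address."""
    local, _, sender_domain = envelope_sender.rpartition("@")
    user, _, rcpt_domain = recipient.rpartition("@")
    if not local or not user:
        return False  # null sender or malformed address
    tld = sender_domain.rsplit(".", 1)[-1].lower()
    if tld not in SUSPECT_TLDS:
        return False
    tag = "-" + user + "=" + rcpt_domain
    # Require a non-empty <phrase> in front of the '-<user>=<domain>' tag.
    return local.lower().endswith(tag.lower()) and len(local) > len(tag)
```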
The spammer moves fast when changing both sending domains and sending IPs. Their domains and IPs tend to wind up listed by people like Spamhaus within an hour or three, but by then they've moved on. They don't seem to reuse domain names very much (or very fast, when they do reuse them), but in a spot check they did reuse IP addresses over the past couple of days, perhaps as they fall out of the SBLCSS and similar DNS blocklists. Possibly the spammer reuses domain names less often due to them expiring from DNS blocklists more slowly than IPs.
(In fact now that I'm looking at this seriously, this spammer appears to be only using three /24s for the past week or so.)
Unfortunately our current anti-spam software (rspamd) doesn't immediately recognize this mail as spam (although once things are DNSBL listed it can do better). GMail is wise to their tricks, of course, and so email from this spammer to people here who forward their email to GMail is rejected by GMail at SMTP time, giving us bounces that pile up in our queues as we try to deliver them to the spammer (who is, as mentioned, not taking email). The messages have valid DKIM signatures and even pass SPF checks (to my amusement the spammer thoughtfully lists the domain's sending IP in their SPF record).
(GMail typically rejects the email with messages about the reputation of the sending domain being too low, but as I've seen with Message-IDs, GMail's rejection messages aren't necessarily anywhere near the whole truth.)
In a spot check of these domains, DNS service is being provided by Cloudflare, perhaps via some free plan or perhaps as part of the registrar's offering. WHOIS for recent domains lists the registrar as NameSilo LLC (although the domains might have been obtained through a reseller), who appear to offer very low up front costs for registering domains in some of these TLDs. Still, the churn in domain names suggests to me that the spammer probably isn't paying for them in one way or another.
The side effects of this particular spammer are sufficiently annoying that I may take some specific steps to deal with them. While there are a bunch of clever, complicated options, it's possible that quite brute force ones would be sufficient.
(Mostly I'm irritated that people are letting them get away with going through so many domains. Domain registration is supposed to cost money, and domains aren't supposed to be expendable things for spammers. Yet here we are, with 'fast flux' domain names.)
An email's Message-ID header isn't a good spam signal (in late 2022)
I recently wrote about maybe copying email anti-spam measures from large places like GMail, using the example of how GMail was rejecting various messages at SMTP time with a reported reason of 'messages missing a valid messageId header are not accepted'. This spurred me into investigating what sort of Message-ID values we saw (which can get complicated to evaluate).
The good news is that Exim actually already logs the Message-ID value for every message in the 'id=' field logged as part of message reception logging. It was still more convenient to add my own logging that called out some specific aspects, but Exim's normal logging meant that I could already do some useful things with our historical data.
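As a rough illustration of what Exim's normal logging already allows, the following sketch counts received messages with and without a logged 'id=' field. It assumes Exim main log lines where reception is marked with '<=', and it assumes that nothing else in the line (such as a logged Subject) contains ' id=' text; the function name is made up.

```python
import re
from collections import Counter

# Exim logs the Message-ID of a received message as id=<value>.
ID_FIELD = re.compile(r"\sid=(\S+)")

def message_id_stats(lines):
    """Count received messages with and without a logged Message-ID."""
    stats = Counter()
    for line in lines:
        if " <= " not in line:
            continue  # only message reception lines carry id=
        key = "with id" if ID_FIELD.search(line) else "without id"
        stats[key] += 1
    return stats
```

Judging whether a Message-ID that is present is 'valid' is a separate and much messier question.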
The bad news is that it turns out that the Message-ID header isn't a strong signal about whether or not the email was spam, and as part of that GMail is not being entirely honest in their SMTP time rejection messages. In the time when we were doing detailed logging, I saw a reasonable amount of real, desirable email without a Message-ID header at all (including a message to me), and some amount of it with what looked like 'invalid' Message-ID values. There are clearly some real mail sending systems that just don't put in a Message-ID.
As for GMail, once I realized that Exim already had this information, I went back through our logs of email forwarded to GMail. It's true that all of the messages GMail rejected with this SMTP message had missing or questionable Message-ID values. But GMail has also accepted plenty of forwarded email from us that didn't have a Message-ID header. The lack of a Message-ID header by itself is clearly not enough to cause GMail to reject email, which isn't surprising given that some amount of email that people want to get will show up at GMail's door without a Message-ID.
(This GMail behavior does save us from any worries of needing to add our own Message-ID header to any non-spam email being forwarded to GMail.)
Due to Andy Balholm's comment on my previous entry, I also now know that rspamd defaults to giving missing Message-IDs moderate spam points and 'invalid' ones somewhat fewer. A missing Message-ID is MISSING_MID, +2.5 points, and an 'invalid' one is INVALID_MSGID, +1.7 points. You can find this in the rspamd source code in rules/regexp/headers.lua.
(I haven't dug deep enough to figure out what rspamd considers to be 'invalid' here. As I found out, it's complicated even if you try to simplify it.)
(Maybe) copying email anti-spam measures from Google and company
For a while now, Google has been rejecting some messages we try to forward to GMail with a SMTP error message like this:
Messages missing a valid messageId header are not accepted.
You can have a number of reactions to this. One of them is to be grumpy that Google is rejecting email that's otherwise (probably) perfectly valid and perhaps not even spam. Well, let's be honest here; all competent modern mail system operators reject email at SMTP time for all sorts of peculiar reasons, so I can hardly pick on GMail for not liking messages without message IDs when we will reject your messages if they have an attachment type we don't like or ClamAV matches a signature.
Another reaction, one that I'm more and more leaning toward, is to consider making our email system reject external email at SMTP time for the same reason. Why? Because if GMail is doing it, a missing (or invalid) message ID is probably a good sign of spam. The people running GMail don't just roll out of bed one day, pick an RFC header requirement at random, and start rejecting email that violates it. Instead it seems very likely that they have a bunch of data that shows that rejecting email this way is a good idea.
(Of course we don't actually know if GMail is rejecting the email for this reason alone. There could be other signals involved that GMail isn't putting in the SMTP rejection message for various reasons.)
More broadly, I'm increasingly coming to think that major email providers have a lot more data on spam signs than we do, so we might as well take advantage of their work when possible. If they give us a relatively clear signal that they consider something a spam signature, maybe we should use that signal ourselves. At the very least it's probably worth investigating, for example to see how many messages have invalid or outright missing message IDs, and what happens to them.
(It's possible that rspamd can already recognize and log bad or missing message-ids, but if so I can't find it in the documentation on a casual search.)
An email phish attempt using attachment file type confusion
I don't get much spam email in general and I get even less that has malware payloads, so in one sense it's always interesting when one makes it through our various anti-spam measures and I get to actually look at a sample for myself. Today I received what looked like a malware attack using a PDF:
Subject: [...] has sent you a document(s)
File Name: Invoice-38937.pdf
File Size: 44 KB
Please find attached Invoice-38937 for your reference.
I was all ready to start cracking the PDF open with various tools to see what they could tell me, when I actually extracted the attachment and looked at the full filename and file type:
Content-Type: application/octet-stream; name="Invoice-38937.shtml"
The actual attachment was a HTML file that contained a single form that POST'ed off to a website, with a fixed 'Email address' field and a password field for you to fill in. The HTML design was set up to try to look plausible as a PDF that you had to enter a password to see, with a blurred, dark background image that looked sort of like a blurry invoice and an 'Adobe PDF / Sign in to view invoice payment' popup, a page title of 'Adobe ID', and so on.
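A mail filter could look for this sort of mismatch mechanically. The following is a minimal sketch using Python's standard email package that lists attachments whose file name has an HTML-style extension, whatever the message text claims about them. The extension list and function name are my own assumptions, and the sample message is made up to resemble the one described here.

```python
import os
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default

# File name extensions treated as 'really HTML' (an assumed list).
HTML_EXTENSIONS = {".htm", ".html", ".shtml", ".xhtml"}

def html_attachments(raw_bytes):
    """Return (file name, content type) for attachments named like HTML files."""
    msg = message_from_bytes(raw_bytes, policy=default)
    found = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        ext = os.path.splitext(name)[1].lower()
        if ext in HTML_EXTENSIONS:
            found.append((name, part.get_content_type()))
    return found

# Build a made-up message shaped like the one described above.
sample = EmailMessage()
sample["Subject"] = "[...] has sent you a document(s)"
sample.set_content("File Name: Invoice-38937.pdf\n")
sample.add_attachment(b"<html><form></form></html>",
                      maintype="application", subtype="octet-stream",
                      filename="Invoice-38937.shtml")
print(html_attachments(sample.as_bytes()))
```

Whether to reject, quarantine, or just log such attachments is a policy decision; as noted, some people may rely on HTML attachments with forms.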
(The form's POST target was a HTTP URL instead of a HTTPS one, but I think only Firefox warns you about that.)
At one level this is unexceptional and probably unsurprising. At another level, I find it interesting that this sort of attachment file type confusion actually works (or at least I assume it works enough for spammers to keep using it). It wouldn't work in the mail environment I use, where a completely visually different program is run to display a PDF than is run to display a HTML file, but in an 'all in one' environment where the mail client tries to display as much as it can itself (and where browsers display PDFs too), I can see how there might not be clear visible signs that you're not really looking at a PDF.
To me, this also points out a weakness in common mail environments. This file type confusion shouldn't really work; you shouldn't be able to pass off a HTML file as a PDF (although PDFs can contain plenty of dangerous things in their own right). You could also argue that a HTML file opened directly in a mail client shouldn't be allowed to submit any forms, but there are probably people who actually rely on this working for some internal process they do.
(Email attachment file type confusion is routinely exploited by malware to try to, for example, persuade you that an executable is a PDF so you'll click on it.)
DKIM signature types (algorithms) that we see (as of July 2022)
A lot of email these days is signed with DKIM, partly because signing email with DKIM is increasingly mandatory in practice. But 'signed with DKIM' is a broad category because DKIM has more than one signing algorithm and on top of that is used with (public) keys of different lengths.
What signing algorithms DKIM supports in practice is a matter for some discussion. The initial DKIM RFCs, such as RFC 6376, support rsa-sha1 and rsa-sha256. RFC 8301 deprecates rsa-sha1 and says that it shouldn't be used (and that a message with only a rsa-sha1 DKIM signature should be considered to fail validation). RFC 8301 also says RSA keys must be at least 1024 bits long and should be at least 2048 bits; again, messages with too-small keys should be considered to fail validation. RFC 8463 defines Ed25519 based DKIM keys, but apparently very few big providers actually support them, which makes them relatively pointless and useless in practice. Probably the most broadly useful algorithm and key length is rsa-sha256 with 2048 bit keys.
Over the past ten full days, our central mail server has seen almost 89,500 DKIM signatures on 75,100 messages (a single message can have multiple DKIM signatures). Over the same time the machine received about 96,000 messages (7,000 of them internally generated by users and machines here). Signature algorithms break down as follows:
44074 a=rsa-sha256 b=1024
37865 a=rsa-sha256 b=2048
 7141 a=rsa-sha1 b=1024
  311 a=rsa-sha1 b=2048
   18 a=rsa-sha256 b=1016
    8 a=rsa-sha256 b=768
    5 a=rsa-sha256 b=1032
    4 a=rsa-sha256 b=4096
    3 a=rsa-sha1 b=4096
    1 a=rsa-sha256 b=3072
    1 a=rsa-sha1 b=768
    1 a=rsa-sha1 b=2056
If I look only at verified signatures, the numbers are a bit different:
40270 a=rsa-sha256 b=1024
32221 a=rsa-sha256 b=2048
 1880 a=rsa-sha1 b=1024
  205 a=rsa-sha1 b=2048
    5 a=rsa-sha256 b=768
    4 a=rsa-sha256 b=1032
    3 a=rsa-sha1 b=4096
    1 a=rsa-sha256 b=4096
    1 a=rsa-sha1 b=768
    1 a=rsa-sha1 b=2056
(Despite RFC 8301, Exim remains willing to verify DKIM signatures using either or both of rsa-sha1 and keys under 1024 bits.)
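To put a number on how much that willingness matters, here is a small sketch that applies the RFC 8301 verification rules (no rsa-sha1, and no RSA keys under 1024 bits) to the verified signature counts listed above. The counts are copied from that list; the function name is made up and the check only makes sense for RSA signatures.

```python
# Verified DKIM signature counts from above: (algorithm, key bits) -> count.
VERIFIED = {
    ("rsa-sha256", 1024): 40270,
    ("rsa-sha256", 2048): 32221,
    ("rsa-sha1", 1024): 1880,
    ("rsa-sha1", 2048): 205,
    ("rsa-sha256", 768): 5,
    ("rsa-sha256", 1032): 4,
    ("rsa-sha1", 4096): 3,
    ("rsa-sha256", 4096): 1,
    ("rsa-sha1", 768): 1,
    ("rsa-sha1", 2056): 1,
}

def rfc8301_acceptable(algorithm, key_bits):
    """RFC 8301 for RSA signatures: no rsa-sha1, keys of at least 1024 bits."""
    return algorithm != "rsa-sha1" and key_bits >= 1024

total = sum(VERIFIED.values())
rejected = sum(count for (alg, bits), count in VERIFIED.items()
               if not rfc8301_acceptable(alg, bits))
print(f"{rejected} of {total} verified signatures would fail under RFC 8301")
```

By these rules 2,095 of the 74,591 verified signatures (a bit under 3%) would be treated as failing, almost all of them because of rsa-sha1 and not because of short keys.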
The largest shrinkage is in 1024-bit rsa-sha1. Since our central mail server sees messages after their subject line may have been marked as spam, some of this drop may be spammers using 1024-bit rsa-sha1. In general our external SMTP gateway sees significantly fewer 'headers probably modified' verification mismatches than our central mail server does. But even our external SMTP gateway sees about 4,400 'headers probably modified' mismatches over the same ten day period.
(And even on our central mail server about 74,600 DKIM signatures across about 62,200 email messages did verify. So a lot of our email does have good DKIM signatures.)
PS: It's a more or less deliberate design decision that if we think a message is spam, we break the DKIM signature by tagging the Subject with a marker. Us tagging the Subject predates any widespread use of DKIM and people here expect it, but when DKIM started to be a thing we (I) thought about it and decided that this was a feature.
Signing email with DKIM is becoming increasingly mandatory in practice
For our sins, we forward a certain amount of email to GMail (which is to say that it's sent to addresses here and then we send it onward to GMail). These days, GMail rejects a certain amount of that email at SMTP time with a message that some people will find very familiar:
550-5.7.26 This message does not have authentication information or fails to pass authentication checks (SPF or DKIM). [...]
(They helpfully include a link to their help section on "Make sure your messages are authenticated".)
As far as we can see from outside, there are two ways to pass this authentication requirement. First, the sending IP can be covered by actively positive SPF authorization, such as a '+a' clause. GMail actively ignores '~all', so I suspect that they also ignore '+all'. Second, you can DKIM sign your messages.
There are people who don't like email forwarding, but I can assure them that it definitely happens, possibly still a lot. Unless you want your email not to be accepted by GMail when forwarded, this means you need to DKIM sign it, because forwarded email won't pass SPF (and no, the world won't implement SRS).
GMail is not the only large email provider, but they are one of the influential ones. Where GMail goes today, others are likely to follow soon enough, if they haven't already. And even if other providers (or GMail) accept the message at SMTP time, they might use something similar to these requirements as part of deciding whether or not to file the new message away as spam.
I'm not really fond of the modern mail environment and how complex it's become. But it is what it is, so we get to live with it. If your mail system is capable of DKIM signing messages but you're not doing so yet, you should probably start. If your mailer can't DKIM sign messages, you probably need to look into fixing that in one way or another.
(We're lucky in that we're DKIM signing locally generated messages, and unlucky in that we do forward messages and so we're trying to figure out what we can do to help when the message isn't DKIM signed.)
Appendix: The uncertainty of SRS and GMail
SPF's usual answer to how it breaks forwarding messages is SRS. However, it's not clear that SRS or any other scheme of rewriting just the envelope sender will pass GMail's SMTP authentication checks, because GMail's help specifically says (with emphasis mine):
For SPF and DKIM to authenticate a message, the message From: header must match the sending domain. Messages must pass either the SPF or the DKIM check to be authenticated.
SRS and similar schemes normally rewrite the envelope sender but not the message From:, and so would not pass what GMail says is their check (whether it actually is, who knows). Effectively GMail is insisting on DMARC alignment even without DMARC in the picture.