Wandering Thoughts archives

2023-03-29

The case of the very wrong email Content-Transfer-Encoding

Over on the Fediverse, I shared a discovery from our mail logs:

In the 'that's not how you do it' category, spotted in our email reject logs today:

Content-Transfer-Encoding: amazonses.com

(We rejected it for having an absurdly long line, over 200,000 bytes, which appears to have been almost all of the message.)

The MIME Content-Transfer-Encoding header is supposed to tell you the encoding of the MIME part in question, including the implicit top level part of the email. Typical values are things like '7bit', '8bit', 'quoted-printable', or 'base64'. Needless to say, this email's C-T-E is complete garbage, and a picky email client would say that it couldn't decode the message because it doesn't understand the 'amazonses.com' encoding.

(I suspect that real clients treat this as an unset C-T-E and either assume they have text or try to guess among the options.)
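
As a rough illustration of what a picky reader would do, here is a minimal sketch in Python using the standard library's email package. The `check_cte` function name and the `KNOWN_CTES` set are made up for this example and don't come from any particular mail client. The set holds only the five encodings RFC 2045 itself defines, so a real checker might also want to allow 'x-' extension tokens.

```python
from email import message_from_string

# The five encodings RFC 2045 defines. This set is an assumption of the
# sketch; it deliberately ignores 'x-' extension tokens.
KNOWN_CTES = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def check_cte(raw_message):
    """Return (value, is_known) for the top-level Content-Transfer-Encoding.

    check_cte is a hypothetical helper, not part of any mail client.
    """
    msg = message_from_string(raw_message)
    value = msg.get("Content-Transfer-Encoding")
    if value is None:
        # RFC 2045 says a missing header means the default of 7bit.
        return (None, True)
    token = value.strip().lower()
    return (token, token in KNOWN_CTES)


# A made-up message with the same header as the spam (example.net is a
# placeholder domain).
SAMPLE = (
    "From: Someone <support@example.net>\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: amazonses.com\n"
    "\n"
    "<html></html>\n"
)

if __name__ == "__main__":
    print(check_cte(SAMPLE))
```

A forgiving client would presumably take the 'not known' result and fall back to treating the body as plain unencoded text instead of refusing to show the message.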

All of this email appears to be spam, of course, and the messages have other anomalies besides the absurdly long line in the body. They all have the same envelope sender domain (which I'm not going to mention, for reasons), but the headers follow an unusual pattern. If the email is sent to 'USER@<our-domain>', all of the samples I've checked have the following header setup:

From: [...] <support@USER.net>
Sender:USER@<our-domain>
Message-ID: <...javamail.tomcat@pdr8-services-05v.prod..org>
Content-Type: text/html
Content-Transfer-Encoding: amazonses.com

That 'Sender:' header is malformed, and you obviously aren't supposed to have a constant Message-ID. Some of the addresses are clearly made up (and forged, for the sender address); many of the From: domains probably don't even exist. While the envelope sender domain stays constant, the local parts of the addresses do vary. The sending IP has also stayed the same throughout today.

(At the moment the MX for the envelope sender domain is outlook.com, and they reject a random claimed envelope sender address. Of course this spammer could be forging the domain of an innocent bystander, which is why I've decided not to mention it.)

The obvious speculation about where the gigantic line comes from is that the messages have extremely bloated HTML of some sort and it's all been crammed onto one line. I don't know what you do to get 200,000 to 340,000 characters of HTML in an email message; maybe they're including images as inlined 'data' URLs.
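
For illustration, the sort of long-line check that would catch these messages can be sketched in a few lines of Python. This is not the check our mail system actually runs. The function names are invented for the example, and the 998 byte default is simply the per-line limit from RFC 5322 (not counting the CRLF), not necessarily the threshold any given mailer uses.

```python
# RFC 5322's hard limit on line length, excluding the trailing CRLF.
# Using it as the rejection threshold is an assumption of this sketch.
MAX_LINE_BYTES = 998


def longest_line(raw):
    """Return the length in bytes of the longest line in a raw message."""
    # bytes.splitlines() splits on CR, LF, and CRLF and drops the terminators.
    return max((len(line) for line in raw.splitlines()), default=0)


def has_overlong_line(raw, limit=MAX_LINE_BYTES):
    """True if any line of the raw message is longer than the limit."""
    return longest_line(raw) > limit


if __name__ == "__main__":
    # A made-up message whose body is a single 200,000 byte line.
    bloated = b"Subject: x\r\n\r\n" + b"a" * 200000 + b"\r\n"
    print(longest_line(bloated), has_overlong_line(bloated))
```

A body that is a single 200,000 byte line is far past any plausible limit, so the exact threshold barely matters for messages like these.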

spam/ContentTransferEncodingVeryBad written at 22:23:40

