The quietly impressive thing mail clients do when you write HTML mail

November 29, 2023

It's not news that a great deal of email in the world today is written in HTML, and has been for some time. If you insist on plain text email, you're increasingly an anachronism. Many people writing email probably don't even think about people who prefer plain text, and I think many mail clients will default to HTML even if you're replying to a plain text message. So even if you write to me in HTML and I write back in plain text, your reply is back to HTML again.

But while that description is true at the level of what people experience, it's not true at the technical level (at least, not usually). Even today, most 'HTML' email is actually a MIME multipart/alternative, with text/plain and text/html alternate parts. An awful lot of the time, the content of the text/plain part isn't a little message saying 'read the HTML'; it's in fact a faithful plain text version of the HTML that people wrote. Pretty much universally, mail clients quietly create that plain text version from the HTML version that people write, following an assortment of conventions for how to render HTML-isms in plain text. Looked at from this angle, it is quietly impressive. Here is a feature and a chunk of code that could be considered partially vestigial, yet almost everyone implements it.
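To make the structure concrete, here is a minimal sketch of building such a message with Python's standard library `email` package. The addresses, subject, and body text are made up for illustration, and this is only one way to produce a multipart/alternative. It isn't a claim about how any particular mail client does it.

```python
from email.message import EmailMessage

# Hypothetical message contents, purely for illustration.
msg = EmailMessage()
msg["From"] = "me@example.org"
msg["To"] = "you@example.org"
msg["Subject"] = "Lunch"

# The first call creates a text/plain body.
msg.set_content("Lunch at *noon*?\n")

# Adding an alternative converts the message to multipart/alternative,
# with the text/plain part first and the text/html part after it.
msg.add_alternative("<p>Lunch at <b>noon</b>?</p>\n", subtype="html")

# Collect the content types of the top-level message and of its parts.
top_type = msg.get_content_type()
part_types = [part.get_content_type() for part in msg.iter_parts()]

# A plain text reader can ask for the text/plain alternative specifically.
plain_body = msg.get_body(preferencelist=("plain",)).get_content()
```

The order of the parts matters: by MIME convention the alternatives go from plainest to richest, and a client displays the last one it can handle. That is why people who read HTML mail never see the text/plain part at all.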

One of the reasons this may be somewhat easier than it looks is that people rarely literally write HTML in their mail client. Instead they tend to work in a WYSIWYG environment, where the mail client can mark up the text with intentions, like 'bold' or 'links to <X>', and then render the intentions in both HTML and plain text. But I'm only guessing about how mail clients implement this. I don't think it's as simple as pushing the HTML through some sort of plain text rendering, because the plain text and the HTML sometimes change styles for things. For instance, in the HTML, bits quoted from the message being replied to may be indented, while in the plain text they get rendered with the customary '> ' in front of them.
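As a sketch of what that guess might look like in code, here is a tiny 'intentions' model that is rendered twice, once to HTML and once to plain text. Everything here (the `Span` and `Paragraph` classes, the `render_html` and `render_text` functions, and the choice of `*bold*` and `text <url>` as plain text conventions) is hypothetical and made up for illustration; real mail clients may work quite differently.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional


@dataclass
class Span:
    """A run of text plus the intentions attached to it."""
    text: str
    bold: bool = False
    link: Optional[str] = None


@dataclass
class Paragraph:
    spans: List[Span]
    quoted: bool = False  # text quoted from the message being replied to


def span_html(s: Span) -> str:
    out = escape(s.text)
    if s.bold:
        out = f"<b>{out}</b>"
    if s.link:
        out = f'<a href="{escape(s.link)}">{out}</a>'
    return out


def span_text(s: Span) -> str:
    out = s.text
    if s.bold:
        out = f"*{out}*"
    if s.link:
        out = f"{out} <{s.link}>"
    return out


def render_html(paras: List[Paragraph]) -> str:
    parts = []
    for p in paras:
        body = "".join(span_html(s) for s in p.spans)
        # Quoted material becomes a blockquote (typically shown indented).
        tag = "blockquote" if p.quoted else "p"
        parts.append(f"<{tag}>{body}</{tag}>")
    return "\n".join(parts)


def render_text(paras: List[Paragraph]) -> str:
    parts = []
    for p in paras:
        body = "".join(span_text(s) for s in p.spans)
        if p.quoted:
            # Quoted material gets the customary '> ' prefix on each line.
            body = "\n".join("> " + line for line in body.splitlines())
        parts.append(body)
    return "\n\n".join(parts)


doc = [
    Paragraph([Span("Can we meet Tuesday?")], quoted=True),
    Paragraph([
        Span("Yes, "),
        Span("Tuesday", bold=True),
        Span(" works. See "),
        Span("the agenda", link="https://example.org/agenda"),
        Span("."),
    ]),
]

html_body = render_html(doc)
text_body = render_text(doc)
```

The point of the sketch is that the quoted paragraph is a single 'quoted' intention that each renderer styles in its own way, rather than HTML that has to be reverse-engineered into '> ' prefixes afterward.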

It's not only mail clients used by people that (still) do this. A fair number of major sources of (HTML) email more or less automatically generate a plain text version as well, often coming at least partially from people's input. For one example I experience regularly, GitHub issues are natively in Markdown and are commonly seen in HTML format, but GitHub faithfully makes a quite usable text/plain version. It might not be much effort with Markdown, but it's at least some.

(Not all plain text plus HTML email has the same content in both forms, and sometimes the plain text content is also broken in various ways. But this is still the exception; the vast majority of these emails that I get have functionally the same content in both the plain text and the HTML version.)

(This entry was sparked by me idly wondering if it would be possible to easily write HTML-format emails in MH-E, and then realizing that it wasn't enough to just write HTML emails; I'd need to generate the plain text version too.)

Last modified: Wed Nov 29 22:18:34 2023