Pragmatic issues with hash verifiers for email messages

April 29, 2009

Suppose that you have a 'sender stores message' email system, and that you protect yourself against the sender rewriting the message after it's initially sent by sending and verifying a secure hash of the message before you show the message to the user. To follow up a recent entry, here are the social and technical issues that I can see with this hash protection scheme.

First and largest, you have a social issue; users are going to react badly to 'this message is (minorly) corrupt so I will show you nothing at all'. Users want you to recover as much of their messages as you can and show them as much as you can, but doing so weakens or destroys your protection against on-the-fly targeted spam and phishing, client exploits, and so on.

(Trying to sell this to users as message integrity checking is going to be even less successful, because I think users care even less about that.)

Next, this scheme is more complicated than it looks because you don't want to have a 'unitary' message hash, a simple single hash that covers the entire message. The problem with a single hash is that it requires you to download the entire message before you can verify the hash and show any of it to the user, which causes very bad delays with large messages (or slow network connections, or both). You can deal with this by hashing the message in sections, but this probably leaves you with more client attack surface exposed (especially if you make the mistake of making the sections be the various MIME parts).

Finally, this doesn't help your security issues if clients can be fingerprinted at all; it just forces the attacker to send two messages. The attacker uses the first message to find out the client fingerprint (and to verify that there is a live user), and can then craft a customized exploit message and send it to the user.

Sidebar: A sectional hash scheme

I think that the best sectional hash scheme would be one that simply breaks the message up into blocks of so many Kbytes each and hashes each block separately. You still need a single final hash for the message, but you can create this by hashing the list of section hashes. Then you embed the list of section hashes early in the message header, so that the client can verify pieces independently.
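As a rough illustration of the scheme described above, here is a minimal Python sketch. The choice of SHA-256, the 16 KB block size, the newline-joined encoding of the hash list, and all of the function names are assumptions made for the example; the entry itself does not specify any of them. `expected_top` stands in for the message hash that the recipient was given when the message was first sent.

```python
import hashlib

# Assumed section size; the scheme only says "so many Kbytes each".
BLOCK_SIZE = 16 * 1024


def section_hashes(message: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of the message separately."""
    return [
        hashlib.sha256(message[i:i + block_size]).hexdigest()
        for i in range(0, len(message), block_size)
    ]


def top_hash(hashes: list) -> str:
    """The single final message hash: a hash of the list of section hashes."""
    # Joining with newlines is an arbitrary (assumed) serialization.
    return hashlib.sha256("\n".join(hashes).encode("ascii")).hexdigest()


def verify_stream(blocks, hashes: list, expected_top: str):
    """Yield blocks front to back, each one only after it has verified.

    Raises ValueError if the hash list does not match the expected
    message hash, if a block fails its hash, or if the message ends early.
    """
    if top_hash(hashes) != expected_top:
        raise ValueError("section hash list does not match the message hash")
    seen = 0
    for index, block in enumerate(blocks):
        if index >= len(hashes):
            raise ValueError("message has more sections than hashes")
        if hashlib.sha256(block).hexdigest() != hashes[index]:
            raise ValueError("section %d failed verification" % index)
        seen += 1
        yield block
    if seen != len(hashes):
        raise ValueError("message is truncated")
```

A client using something like this could display each block as soon as `verify_stream` yields it, instead of waiting for the whole message. What it should do with the blocks it has already shown when a later block fails is exactly the social problem discussed earlier.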


Comments on this page:

From 130.217.250.13 at 2009-04-29 01:59:08:

What you're describing is a hash list. Perhaps a more interesting version is a hash tree, where you can send/store far fewer hashes and recalculate parts as needed.

While you've made a good case against sender stored messages, what do you see as a better solution?

-- Perry Lorier

By cks at 2009-04-29 11:07:26:

I think hash lists are better for this particular situation, because you want to pack the full set of hashes into as little space as possible. My impression is that hash trees are good for bottom up verification, but for email messages we just want front to back verification.

I don't think there's a good general solution to email spam; the fundamental problem of spam is not one that is amenable to technical solutions. There are solutions for specific narrow areas of the problem, many of which people are quietly switching to now (for example, the popularity of having 'private message' features somewhere in your web-based system).

By rdump at 2009-04-29 12:39:50:

I'll hazard a guess that SMTP as currently implemented (receiver stores) is superior to any kind of sender stores scheme.

With SMTP, we have a whole lot of infrastructure in place to help us differentiate between good and bad strangers. We have ways of refusing mail from places bad strangers are known to frequent. We have ways of detecting the kinds of messages many types of bad strangers send. It's quite effective.

Since a sender stores scheme will have to re-implement all of that infrastructure, and for no practical benefit [1], a sender stores scheme is a non-starter.


[1] The bandwidth and storage saved is negligible when compared to all the other traffic sent through the intertubes and cached locally. The savings from reducing it pale in comparison to the costs of getting a sender stores system installed and working.

Written on 29 April 2009.


Last modified: Wed Apr 29 01:22:28 2009