2007-10-07
An improvement in my comment spam precautions
A while back, I thought that a comment spammer had broken my simple scheme to prevent people from fetching the 'add comments' page once and then using their zombie farm to submit spam, so I decided to switch to a more secure system for this. Since the weakness in my scheme was spammers being able to replace my hidden information about the IP address that originally fetched the 'add comments' page with their own, the obvious fix was to sign the IP address using HMAC.
(It turned out that the comment spammer was doing something else, and I had somewhat misread my logs.)
In theory, all I needed to put in the page was the HMAC signature for the originating IP; to validate a request I'd just compute the signature that the request should have and verify that it matched the signature submitted with the request. The drawback to this is that if verification failed, I wouldn't know much about why.
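A minimal sketch of this signature-only approach in Python might look like the following. The key value, the function names, and the choice of SHA-256 as the hash are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real one would come from configuration.
SECRET_KEY = b"example-secret-key"

def sign_ip(ip: str) -> str:
    """Return the hex HMAC signature for the IP that fetched the page."""
    return hmac.new(SECRET_KEY, ip.encode("utf-8"), hashlib.sha256).hexdigest()

def is_valid(request_ip: str, submitted_sig: str) -> bool:
    """Recompute the signature this request should have and compare it."""
    expected = sign_ip(request_ip)
    # compare_digest does a constant-time comparison of the two byte strings.
    return hmac.compare_digest(expected.encode("utf-8"),
                               submitted_sig.encode("utf-8"))
```

Note that `is_valid()` can only answer yes or no; a failure looks the same whether the field was garbage, forged, or copied from another machine.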
Broadly, there are three different ways things can fail to validate:
- the format of the hidden form field is invalid (in practice, the most common failure mode)
- invalid signature; the data has been tampered with
- valid signature, but not the correct data for this request; someone is trying to feed you signatures they got from somewhere else.
The first case is only interesting to see what sort of random things comment spammers will shove in a field they don't understand. There isn't much you can tell about the second case except that someone is being vaguely subtle. The third case is the interesting one, because you can find out where the comment spammers are getting the valid signatures from.
With HMAC, you split your data between the key and the message; you share the message and the signature, and keep the key private. If you want to tell the second case from the third case, the key needs to be constant across requests, and variable things like the IP address have to go in the message and be embedded in the web page along with the HMAC signature.
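To make the three failure cases concrete, here is a rough sketch with a constant key and the IP address carried in the message. The `ip|signature` field layout, the function names, the result labels, and the use of SHA-256 are hypothetical choices for this example.

```python
import hashlib
import hmac
import ipaddress

SECRET_KEY = b"example-secret-key"  # hypothetical; constant across requests

def sign(message: str) -> str:
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()

def make_field(fetch_ip: str) -> str:
    """Build the hidden form field: the message (the IP) plus its signature."""
    return f"{fetch_ip}|{sign(fetch_ip)}"

def classify(field: str, request_ip: str) -> str:
    """Return 'ok' or which of the three failure cases applies."""
    parts = field.split("|")
    if len(parts) != 2:
        return "bad-format"
    claimed_ip, sig = parts
    try:
        ipaddress.ip_address(claimed_ip)
    except ValueError:
        return "bad-format"
    if not hmac.compare_digest(sign(claimed_ip).encode("utf-8"),
                               sig.encode("utf-8")):
        return "bad-signature"   # the data has been tampered with
    if claimed_ip != request_ip:
        return "wrong-ip"        # valid signature from somewhere else
    return "ok"
```

In the `wrong-ip` case, `claimed_ip` is trustworthy because the signature verified, so logging it shows which machine originally fetched the page.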
Since I can never leave good enough alone, I also decided to make sure that the 'add comment' page had been requested recently (as opposed to, say, a week ago). This more or less requires putting the original fetch timestamp in the message as well, since it is not something I could recover from a subsequent request.
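Adding the fetch time to the signed message could look something like this sketch. The `ip|timestamp|signature` layout, the `MAX_AGE` value, the `now` parameter (there so the check can be exercised with fixed times), and the result labels are assumptions for illustration.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"example-secret-key"  # hypothetical; constant across requests
MAX_AGE = 6 * 60 * 60               # hypothetical freshness window, in seconds

def sign(message: str) -> str:
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()

def make_field(fetch_ip: str, now=None) -> str:
    """Sign the fetching IP together with the time of the fetch."""
    ts = int(time.time() if now is None else now)
    message = f"{fetch_ip}|{ts}"
    return f"{message}|{sign(message)}"

def check_field(field: str, request_ip: str, now=None) -> str:
    now = time.time() if now is None else now
    parts = field.split("|")
    if len(parts) != 3:
        return "bad-format"
    ip, ts_text, sig = parts
    try:
        ts = int(ts_text)
    except ValueError:
        return "bad-format"
    expected = sign(f"{ip}|{ts_text}")
    if not hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8")):
        return "bad-signature"
    if ip != request_ip:
        return "wrong-ip"
    if now - ts > MAX_AGE:
        return "too-old"   # genuine signature, but the page was fetched long ago
    return "ok"
```

Because the timestamp is covered by the signature, an old but genuine field is reported as `too-old`, while one with an edited timestamp is reported as `bad-signature`.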
(The best I could do would be to round off the request time into, say, six hour blocks, and put the block number into the key. But that would invalidate all signatures more than six hours old, and it would mean that I couldn't tell a tampered message from an old message.)
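For comparison, the block-number-in-the-key alternative might be sketched as below; the names, the way the block number is mixed into the key, and the use of SHA-256 are assumptions. The point of the sketch is that verification collapses back to a bare yes or no.

```python
import hashlib
import hmac

BASE_KEY = b"example-secret-key"   # hypothetical base secret
BLOCK_SECONDS = 6 * 60 * 60        # six hour blocks, as in the text

def block_key(now: float) -> bytes:
    """Derive the key for the six hour block that contains 'now'."""
    block = int(now) // BLOCK_SECONDS
    return BASE_KEY + b"|" + str(block).encode("ascii")

def sign_ip(ip: str, now: float) -> str:
    return hmac.new(block_key(now), ip.encode("utf-8"), hashlib.sha256).hexdigest()

def is_valid(ip: str, sig: str, now: float) -> bool:
    expected = sign_ip(ip, now)
    return hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8"))
```

Here a signature from an earlier block and a forged signature both simply fail `is_valid()`, with nothing to tell them apart. In this particular sketch a signature also stops verifying as soon as the block rolls over, even if it was made only minutes earlier.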