2013-07-17
Do specific HTTP error codes actually matter?
There are a few HTTP 4xx and 5xx responses that provoke specific browser reactions; the classic case is a 401 response, which triggers an authentication challenge for HTTP-level authentication. But apart from those specific responses, does it actually matter what error code you return for a failed or rejected HTTP request?
If it's a human making the request, the HTML of the error page they see does matter; it's what they'll read to understand what went wrong. But that error text is unrelated to the HTTP status code, and in fact I doubt that many people even notice or know what HTTP status is attached to a 'page not found' page. As far as they're concerned you could probably return a 2xx response along with the error text and it would all be the same to them.
(I hope that browsers behave a little differently for 2xx versus 4xx pages; for example, I sort of hope that browsers don't enter the latter into your browsing history as a visited link. But maybe they do.)
If it's code making the request, it's clear that returning some 4xx or 5xx error for failed requests is important so that the code can easily tell them from successful ones. But I'm not at all convinced that there's very much (general) code that cares about which specific error code it gets back from some random website that it is poking, as opposed to the fact that there was an error.
I have two reasons to be dubious. The first is that treating different error codes differently in your code takes additional work and also requires having something different that you can do. If a failure is a failure, well, it doesn't matter just why it failed. The second is that I doubt web sites are any more careful about which HTTP error code they generate than code is about handling them. In an environment where any old HTTP error is just as good as another one, client code can't trust a random server's error code to be meaningful in the first place.
(Error codes can be meaningful in more confined situations, where you have a specific client talking to a specific server and you know both ends.)
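As a rough illustration of the difference between caring that there was an error and caring which error it was, here is a minimal Python sketch. The function names and the particular grouping of status codes are my own assumptions for illustration, not what any real client library does.

```python
def is_error(status: int) -> bool:
    """Coarse check: any 4xx or 5xx is a failure, whatever the specific code."""
    return 400 <= status <= 599


def fine_outcome(status: int) -> str:
    """Hypothetical finer-grained handling; the grouping below is an assumption."""
    if status in (401, 403):
        return "not allowed"
    if status in (404, 410):
        return "not there"
    if status in (503, 504):
        return "retry later"
    if is_error(status):
        return "some other error"
    return "not an error"
```

The second function only earns its keep if the caller really does something different for each outcome, and if it can trust the server to have picked its status code with care.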
I certainly think it would be nice if, say, Google's web spider behaved somewhat differently in response to '410 Gone' than, e.g., '504 Gateway Timeout'. But I'm not certain it makes sense for even Google to bother. People can misconfigure servers to generate 410s when they don't actually mean it, and in the general case what matters is whether the error really is permanent rather than whether it just claims to be.
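One way a crawler could hedge against servers that claim permanence without meaning it is to require the same claim several times in a row before acting on it. This is a hypothetical sketch, not a description of what Google or any real spider does; the names and the threshold of three are assumptions.

```python
from dataclasses import dataclass

PERMANENT_CLAIMS = {410}    # status codes that claim the page is gone for good
CONFIRMATIONS_NEEDED = 3    # assumed threshold; purely illustrative


@dataclass
class UrlState:
    gone_streak: int = 0    # consecutive fetches that claimed permanence
    dropped: bool = False   # whether the crawler has given up on this URL


def record_fetch(state: UrlState, status: int) -> UrlState:
    """Update per-URL state after one fetch attempt."""
    if state.dropped:
        return state
    if status in PERMANENT_CLAIMS:
        state.gone_streak += 1
        if state.gone_streak >= CONFIRMATIONS_NEEDED:
            state.dropped = True
    else:
        # Anything else (a success, a 504, ...) breaks the streak.
        state.gone_streak = 0
    return state
```

Even this modest amount of caution is extra work, which is part of why I'm not sure anyone bothers.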
(This was partly brought to mind by the comment on my entry about banning a feed reader. By the way, that feed reader is still banned, still getting 403s, and still trying every ten minutes. That's a clear case of not paying attention to errors at all.)