Who or what your website is for and more on HTTP errors

August 5, 2013

Aristotle Pagaltzis commented on my entry about the pragmatics of HTTP errors, and I want to reply to a few things.

First off, I want to say that I fully agree with Aristotle's characterization that the real question to ask is what the practical effects of using any particular status code will be. This is an excellent way of putting it, and if I'd been clever enough to think of it I would have framed my entire (accidental) series around this question.

Aristotle also wrote:

A "website used by people" is really a discoverable, self-documenting web API. The distinction isn't between whether they are used by people or programs, it is really only whether the primary generated content format is more human- or more machine-friendly (HTML vs JSON, say).

I disagree with this on a philosophical basis. A website used by people can be treated as a discoverable web API, but it is not one; it was not designed as one and it probably won't evolve as one. To put it one way, people will read what the website produces and machines won't. A real web API needs machine-parseable results (including HTTP error codes), stability, versioning, and a bunch of other things. A website designed for people is unlikely to have those (for good reasons).

(Yes, search engines parse HTML and that's a good thing. But I think that this is worlds away from an actual API.)

I think that this distinction is important to draw because it drastically shapes how your web application responds to errors (at least for general errors). If you really are creating an API, then you need to somehow make the responses machine-parseable and unambiguous, which may even require making up your own new HTTP error codes (or the less extreme version of adding a status header with more details to the HTTP response). If you're creating a web application for people, what matters is the text that people will read; the actual HTTP error codes are important only for their effects (if any) on caches, web crawlers, and so on, and then only if you care about any of those.

(You may not. A web application that is used over HTTPS and only interacts with authenticated users makes caches and web crawlers irrelevant.)
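To make the difference concrete, here is a minimal sketch in Python of how the same error might be rendered for an API client and for a person. Everything named here is an assumption for illustration: the function error_response, the X-App-Error header, and the "no-such-page" detail code are all made up, not part of any standard or of any real application.

```python
import json
from html import escape


def error_response(status, reason, detail_code, message, for_api):
    """Build (status_line, headers, body) for an error.

    detail_code is a hypothetical application-specific identifier and
    X-App-Error is a made-up header name, used purely for illustration.
    """
    status_line = f"{status} {reason}"
    if for_api:
        # Programs get an unambiguous detail code, both in the body and
        # in an extra header that can be checked without parsing the body.
        body = json.dumps({"error": detail_code, "message": message}).encode("utf-8")
        headers = [
            ("Content-Type", "application/json"),
            ("X-App-Error", detail_code),
        ]
    else:
        # People get text to read; the status code is mostly for the
        # benefit of caches and crawlers.
        page = (
            "<!DOCTYPE html><html><head><title>" + escape(reason) + "</title></head>"
            "<body><h1>" + escape(reason) + "</h1>"
            "<p>" + escape(message) + "</p></body></html>"
        )
        body = page.encode("utf-8")
        headers = [("Content-Type", "text/html; charset=utf-8")]
    headers.append(("Content-Length", str(len(body))))
    return status_line, headers, body
```

In the API case a client can branch on the detail code without ever looking at the message; in the people case the message is the only part that anyone is likely to notice.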

In response to my hypothetical of getting a DELETE for a non-existent URL when your application doesn't even support DELETE, Aristotle gave the web-standards-correct answer:

There is no resource at the URL in the request, so 405 would be sort of perverse to respond with. I'd argue against 403 on similar grounds but couldn't really object to it. Between 404 and 501 it's a toss-up though.

Here is where web standards run into security engineering. If your application doesn't support DELETE at all, the wise thing to do is to reject all DELETE requests out of hand before you attempt to parse the URL, decode query arguments, and so on. Often this is also by far the easiest thing. There is also a strong argument that the specific error code chosen (and the text that accompanies the HTTP response) should be as uninformative as possible, since anyone who tries a DELETE against your web app is trying a destructive operation that you do not support at all.
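As a rough sketch of what rejecting such requests out of hand can look like, here is a small piece of WSGI middleware in Python that checks the request method before the wrapped application sees the URL or query string at all. The set of allowed methods, the name reject_unsupported_methods, and the default "400 Bad Request" status are all assumptions made for illustration; any of the codes discussed above could be substituted.

```python
# Assumption: this hypothetical application only ever uses these methods.
ALLOWED_METHODS = frozenset({"GET", "HEAD", "POST"})


def reject_unsupported_methods(app, allowed=ALLOWED_METHODS,
                               status="400 Bad Request"):
    """Wrap a WSGI app so unsupported methods never reach it."""
    def middleware(environ, start_response):
        if environ.get("REQUEST_METHOD") not in allowed:
            # Deliberately terse: no hint about what exists at the URL,
            # and no URL or query string processing has happened yet.
            body = b"Bad request.\n"
            start_response(status, [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return middleware
```

Because the check happens first, a DELETE aimed at a non-existent URL and a DELETE aimed at a real one get exactly the same answer.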

In general, when web clients are attempting something that you don't support, have never advertised, and that can't be an innocent mistake, I think that you have almost completely free license to do whatever is convenient from a programming or security standpoint. People who are trying to rattle the doorknobs or even kick the door in do not get any courtesies.


Last modified: Mon Aug 5 22:28:50 2013