Websites should not accept random parameters in requests

September 20, 2007

One of the things that always appalls me is how permissive and accepting most web sites are about what query parameters show up in HTTP requests. Most web servers will happily serve a static file even for a URL with query parameters, despite the query parameters being meaningless, and many web applications will accept requests with extra parameters.

At a minimum, such requests are a sign that something funny is going on: someone has made a mistake with their URL, someone is trying to create many URLs that point to the same thing, or a client program has goofed up. At worst, they are a sign that someone is actively attacking your application by seeing whether they can set options that your code doesn't expect.

The usual argument against this permissiveness is the security one: since you can never trust network input, the last thing you should be is accepting and forgiving about it. If something is wrong, you should not proceed as if all were normal. The usual counter-argument is Postel's Law, that you should be liberal in what you accept. But being liberal here does real clients no favours; in fact, you are violating robustness (and the other side of Postel's Law, to be conservative in what you do).

From the robustness view, you must assume that clients include the extra parameters because they expect them to do something. However, you don't know what the parameters are supposed to do; if you proceed anyway, you're deciding that the difference between your behavior and whatever behavior the client was asking for is unimportant.

This is at best a somewhat questionable assumption and certainly not the conservative option. The plain fact is that you don't actually know how to handle the client's request, so you are making a guess and going with it instead of signaling an actual problem.
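To make "signaling an actual problem" concrete, here is a minimal sketch of the strict approach in Python, using only the standard library. The `ALLOWED_PARAMS` table, the paths in it, and the `check_request` function are all made up for illustration; a real application would hook the same check into whatever framework it uses.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical allow-list: path -> the query parameter names that the
# handler for that path actually understands.
ALLOWED_PARAMS = {
    "/search": {"q", "page"},
    "/about.html": set(),  # static page: no parameter means anything
}


def check_request(url):
    """Return an (HTTP status, message) pair for a request URL.

    404 for paths we don't know, 400 if the query string carries any
    parameter the path does not define, 200 otherwise.
    """
    parts = urlsplit(url)
    allowed = ALLOWED_PARAMS.get(parts.path)
    if allowed is None:
        return 404, "no such page"
    # keep_blank_values=True so that '?debug=' and '?debug' still count
    # as the client having sent a 'debug' parameter.
    seen = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    unknown = sorted(seen - allowed)
    if unknown:
        return 400, "unexpected query parameters: " + ", ".join(unknown)
    return 200, "ok"
```

The point of the sketch is only that the server refuses to guess: a request it does not fully understand gets an error instead of a page.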

(In my opinion, even if you have some reason to expect clients to tack on meaningless query parameters you should react not by serving the URL but by giving clients a redirection to the real canonical URL, in the same way that leaving the trailing '/' off a directory's URL gets you a redirection instead of the directory's contents.)
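The redirection approach can be sketched in the same way. This is an illustration under assumptions, not how any particular server does it: `KNOWN_PARAMS` and `canonical_redirect` are hypothetical names, and the caller is assumed to turn a non-None result into an HTTP redirect with that value as the `Location`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of parameters that mean something to this URL.
KNOWN_PARAMS = {"q", "page"}


def canonical_redirect(url, known=KNOWN_PARAMS):
    """Return the canonical URL to redirect to, or None if the request
    URL carries no parameters outside of `known`."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name in known]
    if len(kept) == len(pairs):
        return None  # nothing extra was tacked on; serve the page
    # Rebuild the URL with only the parameters we understand.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, `canonical_redirect("/search?q=cats&tracking=abc")` gives `/search?q=cats`, while `canonical_redirect("/search?q=cats")` gives None and the page is served normally.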
