2007-09-20
Websites should not accept random parameters in requests
One of the things that always appalls me is how permissive and accepting most web sites are about what query parameters show up in HTTP requests. Most web servers will happily serve a static file even for a URL with query parameters, despite the query parameters being meaningless, and many web applications will accept requests with extra parameters.
At a minimum, such requests are a sign that something funny is going on: someone has made a mistake with their URL, someone is trying to create many URLs that point to the same thing, or a client program has goofed up. At worst, they are a sign that someone is actively trying to attack your application by seeing if they can set options that your code doesn't expect.
The usual argument against this permissiveness is the security one: given that you can never trust network input, the last thing that you should be is accepting and forgiving about it. If there are things wrong, you should not proceed as if all was normal. The usual counter-argument is Postel's Law, that you should be liberal in what you accept. But being liberal here is not doing real clients any favours; in fact, you are violating robustness (and the other side of Postel's Law, to be conservative in what you do).
From the robustness view, you must assume that the clients are including the extra parameters because the clients expect them to do something. However, you don't know what they're supposed to do; if you proceed anyway, you're deciding that the difference between your behavior and whatever behavior the client was asking for is unimportant.
This is at best a somewhat questionable assumption and certainly not the conservative option. The plain fact is that you don't actually know how to handle the client's request, so you are making a guess and going with it instead of signaling an actual problem.
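As a rough illustration of the strict approach, here is a minimal Python sketch that checks a request URL's query parameters against the set a handler actually understands. The function names, the example URLs, and the choice of a 400 status are all assumptions made for the example, not something any particular server or framework does.

```python
from urllib.parse import urlsplit, parse_qsl

def unexpected_params(url, allowed):
    """Return the sorted names of query parameters that are not in `allowed`."""
    query = urlsplit(url).query
    # keep_blank_values=True so that a bare '?debug' still counts as a parameter.
    names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
    return sorted(names - set(allowed))

def check_request(url, allowed):
    """Hypothetical policy: 400 if anything unexpected shows up, 200 otherwise."""
    return 400 if unexpected_params(url, allowed) else 200
```

A static file would be checked with an empty allowed set, so any query parameter at all gets the request refused instead of quietly served.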
(In my opinion, even if you have some reason to expect clients to tack on meaningless query parameters you should react not by serving the URL but by giving clients a redirection to the real canonical URL, in the same way that leaving the trailing '/' off a directory's URL gets you a redirection instead of the directory's contents.)
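The redirection idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name and example URLs are made up, and a real application would have its own notion of which parameters belong in the canonical URL.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_redirect(url, allowed):
    """Return the canonical URL to redirect to, or None if `url` is already canonical.

    Any query parameter whose name is not in `allowed` is dropped.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name in allowed]
    if len(kept) == len(pairs):
        return None  # nothing to strip, so serve the request normally
    # Fragments are never sent to the server, so the rebuilt URL has none.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

When this returns a URL, the server would answer with a redirection (a 301, say) pointing there instead of serving the content directly.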
2007-09-19
The benefit of chronological blog navigation
However criticized, calendar-based blog navigation has a great property that ensures it keeps getting used over and over: it requires no extra taxonomy work to present navigation. You do not have to organize categories or come up with tags or maintain a carefully organized site map.
But the criticism is correct too, because the default presentation of chronological blog navigation gives you useless information, unless you are dealing with a blog about someone's personal life, where you really do want to look back to see what the author was doing on a given date. There are people that this applies to, but probably not very many.
(A calendar for posts usually gives you useless information because for most blogs, the date something is posted on tells you nothing about what it is about. Plus, many calendar widgets don't even tell you how many posts a day has. The whole thing reaches the heights of absurdity when a blog is active enough to have posts almost every day and the calendar just turns into a solid mass of links.)
Better examples of chronological navigation are things like previous and next entry links for individual entries (ideally giving the title of the next and previous entries, so you have as much information as possible), or a chronological listing of entries with their titles or other useful details about them.
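Previous and next links with titles need nothing beyond the entries' dates and titles, which is the whole appeal. A small Python sketch, assuming entries are simple (date, title) pairs; the function name and data layout are invented for the example:

```python
from datetime import date

def neighbours(entries, index):
    """Return (previous title, next title) for the entry at `index`, oldest first.

    Either element is None when there is no such neighbour.
    """
    ordered = sorted(entries, key=lambda entry: entry[0])
    prev_title = ordered[index - 1][1] if index > 0 else None
    next_title = ordered[index + 1][1] if index + 1 < len(ordered) else None
    return prev_title, next_title

entries = [
    (date(2007, 9, 20), "Websites should not accept random parameters in requests"),
    (date(2007, 9, 5), "Where to find specifications on HTTP POST behavior"),
    (date(2007, 9, 19), "The benefit of chronological blog navigation"),
]
```

No categories or tags are involved; sorting by date is the only organization required.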
I suspect that calendars themselves keep showing up in blog templates for a number of reasons beyond simple habit. First, a calendar (in a small font) is very compact; you can wedge one into even a small sidebar. Second, calendars are attractive feedback for the blogger, who gets to see more and more days turning bold or whatever as they update regularly. (It's also feedback if you don't update, as your calendar widget pointedly illustrates your lack of posts.)
2007-09-05
Where to find specifications on HTTP POST behavior
Some IP addresses (probably not friendly ones) have recently taken to
making POST submissions to various 'write comments' URLs here with
a Content-Type of 'application/x-www-form-urlencoded; charset=UTF-8'.
These get rejected by DWiki, because I was quite paranoid when I wrote
the POST handling code and so DWiki is quite conservative about what it
will accept.
While I was pretty certain that I wasn't losing anything by rejecting
these requests, I did get curious to find out if adding a character
set to a form POST content-type this way is actually legal, which
meant that I wanted to run down where this is actually specified.
(In general including a charset in the content-type on POST is
unambiguously allowed by the HTTP specification, so the only question is
whether you are allowed to do it specifically in HTTP form POSTs.)
The primary specification of form POST behavior is in the HTML 4.01
specification, which should not have surprised me but did (I looked at
the HTTP spec first). Section 17.13.3 describes the process of
submitting a form, but you also need 17.13.4 and the definition of the
enctype attribute.
Unfortunately this doesn't clearly answer the question, since the
specification uses very general language.
However, I think that adding a charset parameter has to be allowed by
implication. Forms may specify that the server can accept more than
one character encoding and leave it up to the client to decide which
one to use (the accept-charset <form> attribute). This implies that
the client must tell the server which character set it picked, and the
form encoding rules provide no place to put this except as a charset
parameter on the POST's Content-Type.
(Browsers are encouraged to interpret a missing accept-charset as
implying the character set of the HTML page with the form, which is
UTF-8 in the case of WanderingThoughts. However, including a charset
at all in this case is vanishingly rare.)
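To make the question concrete, here is a Python sketch of pulling the charset parameter out of a POST's Content-Type and comparing it to the page's character set. This is not DWiki's code; the function names and the three-way classification are assumptions for illustration, and what to actually do on a mismatch is left open.

```python
from email.message import Message

FORM_TYPE = "application/x-www-form-urlencoded"

def parse_content_type(value):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors lower-case their results; a missing charset yields None.
    return msg.get_content_type(), msg.get_content_charset()

def classify_form_post(value, page_charset="utf-8"):
    """Hypothetical policy: label a form POST by how its charset compares."""
    media_type, charset = parse_content_type(value)
    if media_type != FORM_TYPE:
        return "not-a-form"
    if charset is None or charset == page_charset.lower():
        return "ok"
    return "charset-mismatch"
```

With this, 'application/x-www-form-urlencoded; charset=UTF-8' classifies as "ok" for a UTF-8 page, the same as the far more common header with no charset at all.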
I'm still not going to fix DWiki's code right away, since I want to think through what I can and should do if the character set doesn't match. (Bearing in mind that my tolerance for people playing weird HTTP and HTML games is fairly low, since most of them are up to no good.)