Where to find specifications on HTTP POST behavior
Some IP addresses (probably not friendly ones) have recently taken to
making POST submissions to various 'write comments' URLs here with
a Content-Type of 'application/x-www-form-urlencoded; charset=UTF-8'.
These get rejected by DWiki, because I was quite paranoid when I wrote
the POST handling code and so DWiki is quite conservative on what it
will accept.
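
To make the difference concrete, here is a minimal Python sketch of a strict check next to a parameter-aware one. This is not DWiki's actual code; the function names and the FORM_TYPE constant are hypothetical, and using the standard library's email.message.Message to split the header is simply one convenient approach.

```python
from email.message import Message

# Assumption: the only form encoding we care about here.
FORM_TYPE = "application/x-www-form-urlencoded"

def split_content_type(header):
    """Return (media_type, charset); charset is None if no parameter is given."""
    msg = Message()
    msg["Content-Type"] = header
    # Both values come back lowercased.
    return msg.get_content_type(), msg.get_content_charset()

def strict_accept(header):
    """Conservative check: the header must be the bare media type."""
    return header.strip().lower() == FORM_TYPE

def lenient_accept(header):
    """Parameter-aware check: ignore anything after the media type."""
    media_type, _charset = split_content_type(header)
    return media_type == FORM_TYPE
```

With the header from these requests, strict_accept() says no and lenient_accept() says yes, and split_content_type() hands back the charset separately so that the caller can decide what to do with it.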
While I was pretty certain that I wasn't losing anything by rejecting
these requests, I did get curious to find out if adding a character
set to a form POST content-type this way is actually legal, which
meant that I wanted to run down where this is actually specified.
(In general, including a charset in the content-type on a POST is
unambiguously allowed by the HTTP specification, so the only question is
whether you are allowed to do it specifically in HTTP form POSTs.)
The primary specification of form POST behavior is in the HTML 4.01
specification, which should not have surprised me but did (I looked at
the HTTP spec first). Section 17.13.3 describes the process of
submitting a form, but you also need 17.13.4 and the definition of the
enctype attribute.
Unfortunately this doesn't clearly answer the question, since the
specification uses very general language.
However, I think that adding a charset parameter has to be allowed by
implication. Forms may specify that the server can accept more than
one character encoding and leave it up to the client to decide which
one to use (the accept-charset <form> attribute). This implies that
the client must tell the server which character set it picked, and the
form encoding rules provide no place to put this except as a charset
parameter on the POST's Content-Type.
(Browsers are encouraged to interpret a missing accept-charset as
implying the character set of the HTML page with the form, which is
UTF-8 in the case of WanderingThoughts. However, including a charset
at all in this case is vanishingly rare.)
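
If a server did want to honor the charset parameter, the decoding side might look roughly like the following Python sketch. It assumes that a missing charset means the page's own character set (UTF-8 here) and that a urlencoded body is pure ASCII with everything else percent-escaped. The decode_form() name and PAGE_CHARSET constant are made up for illustration and are not anything DWiki has.

```python
from email.message import Message
from urllib.parse import parse_qs

# Assumption: the character set of the HTML page that served the form.
PAGE_CHARSET = "utf-8"

def decode_form(content_type, body):
    """Decode a urlencoded POST body using the declared charset, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Fall back to the page's charset when the client sent no parameter.
    charset = msg.get_content_charset() or PAGE_CHARSET
    # Raw non-ASCII bytes in the body raise UnicodeDecodeError here.
    text = body.decode("ascii")
    # errors="strict" makes escapes that are invalid in the charset raise
    # an exception instead of being silently replaced.
    return parse_qs(text, keep_blank_values=True,
                    encoding=charset, errors="strict")
```

For example, decode_form('application/x-www-form-urlencoded; charset=ISO-8859-1', b'name=caf%E9') decodes the escaped byte as Latin-1, while the same body without a charset parameter fails as invalid UTF-8. An unrecognized charset name raises LookupError, which is one of the cases a real implementation would have to decide how to handle.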
I'm still not going to fix DWiki's code right away, since I want to think through what I can and should do if the character set doesn't match. (Bearing in mind that my tolerance for people playing weird HTTP and HTML games is fairly low, since most of them are up to no good.)