My mistake with the Host: HTTP header
One of the nice things about writing a blog is getting to say 'oh, oops, I was a dumbass, let me fix that'. Today I have to own up to a big example of this.
In the original entry I wrote: "In theory the absolute URL should include the port (unless it's the default). In practice, every program I've tried gleefully adds the port itself if it is a non-standard port and you're referring to the same hostname."
I was a moron. The Host: header in HTTP requests includes the port when the port is a non-standard one (and some programs throw it in even when you're on port 80, as I found out later). My code looked more or less like:
newuri = "http://%s:%d" % (HostHeader, MyPort) + relUrl
When programs gave me real Host: headers, with both hostname and port, I effectively doubled the port and things naturally exploded. Had I printed the actual Host: header that programs were handing DWiki I would have seen my mistake immediately, but instead I was too confident that I knew what was going on and didn't bother; I trusted my testing with hand-crafted HTTP requests, where I'd gotten the Host: header wrong and so the result looked right.
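To make the failure concrete, here is a minimal sketch of the doubled port and one way to avoid it. The function names and the sample host and path are made up for illustration and are not DWiki's actual code; the fix shown simply assumes the Host: header can be used as sent, since it already carries any non-standard port.

```python
def buggy_absolute_url(host_header, my_port, rel_url):
    # Mirrors the mistaken approach: always append the server's port,
    # even though the Host: header may already contain one.
    return "http://%s:%d" % (host_header, my_port) + rel_url


def fixed_absolute_url(host_header, rel_url):
    # The Host: header already includes ':port' when the client used a
    # non-standard port, so this sketch uses it unchanged.
    return "http://%s" % host_header + rel_url


if __name__ == "__main__":
    # A real client on port 8080 sends the port in the header.
    print(buggy_absolute_url("example.org:8080", 8080, "/page"))
    print(fixed_absolute_url("example.org:8080", "/page"))
    # A hand-crafted request that wrongly omits the port hides the bug.
    print(buggy_absolute_url("example.org", 8080, "/page"))
```

The last call shows why testing with a hand-written Host: header that lacked the port made the buggy version look correct.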
I only found all this out months later when I was doing something else with the Host: header that blew up because I didn't know to expect the ':port' on the end; that time I dumped debugging information, partly because the failure was more mysterious.
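Code that wants just the hostname has to expect that optional ':port'. The following is a rough sketch of one way to split it off; split_host_header and its default_port parameter are hypothetical names, and the bracket handling assumes IPv6 literals arrive in the usual "[addr]:port" form.

```python
def split_host_header(host_header, default_port=80):
    """Split a Host: header value into (hostname, port)."""
    value = host_header.strip()
    if value.startswith("["):
        # Bracketed IPv6 literal, e.g. "[::1]:8080"; the colons inside
        # the brackets are part of the address, not a port separator.
        end = value.find("]")
        if end == -1:
            raise ValueError("unterminated IPv6 literal: %r" % host_header)
        name, rest = value[:end + 1], value[end + 1:]
    else:
        name, sep, port_text = value.partition(":")
        rest = sep + port_text
    if rest in ("", ":"):
        # No port (or an empty one) was given, so use the default.
        return name, default_port
    if not rest.startswith(":"):
        raise ValueError("unexpected text after host: %r" % host_header)
    # int() raises ValueError if the port is not numeric.
    return name, int(rest[1:])
```

Note that this treats "example.org" and "example.org:80" the same way, which covers the programs that send the port even on port 80.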
My mistake is all the more embarrassing because, contrary to what I wrote in the original entry, the proper behavior is described in black and white in the HTTP 1.1 RFC's section on the Host header. I am not sure what RFCs I read at the time of the original entry, but evidently I didn't read the important one.