Cool URL fragments don't change either

March 2, 2014

I was all set to write an entry about a limitation of handling site changes (from one domain to another, from HTTP to HTTPS, or just URL restructuring) with HTTP redirects, namely that URL fragments fall off during the redirection. Then I decided to actually check the end URLs I was getting and discovered that I was wrong. Browsers do preserve URL fragments across redirects (although you may not see this if the fragment is cut off because the full URL is long). What was really going on in my case is that the site I was dealing with had violated a sub-rule of 'Cool URLs don't change'.
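As a rough illustration of the browser behavior described above, here is a minimal Python sketch of how a browser-like client could work out where it ends up after a redirect. The function name resolve_redirect and the example.com/example.org URLs are made up for illustration. The rule it assumes is that the original URL's fragment carries over unless the redirect target supplies a fragment of its own; this is a sketch of that rule, not any particular browser's code.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

def resolve_redirect(original_url, location):
    """Return the URL a browser-like client would end up at (hypothetical helper)."""
    # The Location value may be relative, so resolve it against the original URL.
    target = urlsplit(urljoin(original_url, location))
    if not target.fragment:
        # The redirect target has no fragment of its own, so the fragment
        # from the original URL is carried over.
        target = target._replace(fragment=urlsplit(original_url).fragment)
    return urlunsplit(target)

# The fragment survives a redirect to a new scheme and domain.
print(resolve_redirect("http://example.com/docs#install",
                       "https://example.org/manual/"))
# A fragment supplied by the redirect itself takes priority.
print(resolve_redirect("http://example.com/docs#install", "/manual#setup"))
```

Note that carrying '#install' over only helps if the new page still has an 'install' anchor that means the same thing.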

Simply stated, the sub-rule is 'URL fragments are part of the URL'. Let me rephrase that:

If you really care about cool URLs, you can't change any HTML anchors once you create them.

The name of the anchor must remain the same and its meaning (the place it links to) must also stay the same. This is actually a really high bar, probably an implausibly high one, since HTML anchors are often associated with very specific information that can easily become invalid or otherwise go away (or simply be broken out to a separate page when it becomes too big).

Note that this implies that simply numbering your HTML anchors in sequential order is a terrible thing to do unless you can guarantee that you're not going to introduce or remove any sections, subsections, etc. It's much better to give them some sort of name. Effectively, the anchor name should be seen as a unique and stable permanent identifier. Again, this is a pretty high bar and is probably going to cause you heartburn if you try to really carry it out.
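To make the numbering problem concrete, here is a small Python sketch. The helpers numbered_anchors and anchor_name and the section titles are all hypothetical; the point is only to show what happens to existing links when a section is inserted in the middle of a page.

```python
import re

def numbered_anchors(titles):
    """Assign sequential anchors (sec1, sec2, ...) in page order."""
    return {title: f"sec{i}" for i, title in enumerate(titles, 1)}

def anchor_name(title):
    """Derive an anchor name from the section title itself."""
    # Lowercase the title and collapse runs of other characters into '-'.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

before = ["Installing", "Configuration"]
after = ["Installing", "Upgrading", "Configuration"]

# With sequential numbering, an old link to #sec2 now lands on 'Upgrading'.
print(numbered_anchors(before)["Configuration"])
print(numbered_anchors(after)["Configuration"])

# A name-based anchor is unaffected by the inserted section.
print(anchor_name("Configuration"))
```

A name derived from the title still breaks if the title is later reworded, so a truly stable anchor probably has to be assigned once and then left alone no matter what happens to the heading.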

This somewhat tilts me towards a view that HTML anchors should be avoided. On the other hand, it's often easier to read one large page with lots of information (exactly the situation that calls for HTML anchors and an index at the top) than to keep clicking through a lot of small pages. Today's side moral is that web page design can be hard.

(I'd say that the right answer is small pages with some JavaScript so that one page seamlessly transitions to the next one as you read without you having to do anything, but even that's not a complete solution since you don't get things like 'search in (full) page'.)

I suppose what I really wish is that web servers got URL fragments in the URL of the HTTP request but normally ignored them. Then they could pay attention to the fragment identifier when doing redirects and do the right thing if they had any opinions about it. But this is a wish that's a couple of decades too late; I might as well wish for pervasive encryption while I'm at it.
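The reason this stays a wish is that the client drops the fragment before it builds the HTTP request, so the server has nothing to look at. A minimal Python sketch of that step, with a hypothetical request_target helper and a made-up URL:

```python
from urllib.parse import urlsplit

def request_target(url):
    """Build the path-and-query part that goes on the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately left out: it stays on the client side.
    return target

print("GET " + request_target("https://example.org/manual/?v=2#install") + " HTTP/1.1")
```

Since the server only ever sees the path and query, it can't pick a different redirect target based on which anchor the visitor asked for.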

Written on 02 March 2014.

Last modified: Sun Mar 2 01:59:45 2014