

Cool URLs keep their contents (an obvious point that still matters)

It has been an article of faith for some time that "cool URLs don't change", as seen in (for example) this 1998 W3C article written by Tim Berners-Lee (you may have heard of him). Sadly, time has shown that cool URLs do often change, and sometimes they become inaccessible even when they don't change. But there is another part of this that I feel the need to say out loud: cool URLs keep their contents.

It's not enough for a 'cool URL' to still resolve and not give you a bad response. What we actually care about is not the URL, it's what we find at the URL, and it's that content that we want to still be there, to 'not change'. An old URL that now returns an HTTP 2xx response with 'there is nothing here' is 'unchanged' in some technical sense, but it has changed in a practical one. Conversely, we generally think of a cool URL as relatively unchanged even if it gives us a redirection to the new home of its contents (especially if the redirection is merely from HTTP to HTTPS). What we care about is that we can get the same content by going to the same URL.

(You can argue about this for URLs that are only fetched by programs, but my view is that 'cool URLs don't change' is mostly about how it's seen by people. If you have an API or a fixed resource, you may have all sorts of requirements around redirections (to the same site or others), TLS versions for HTTPS, and so on.)
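To make the distinction concrete, here is a minimal Python sketch of how one might classify the result of fetching an old URL. It is only an illustration: the function names, the result labels, and the list of 'nothing here' phrases are all assumptions of mine, and it expects you to have already fetched the URL with some HTTP client that follows redirections and reports the final URL, status, and body.

```python
from urllib.parse import urlsplit

# Hypothetical phrases that suggest a page's content is gone even though
# the server answered with a 2xx status. A real checker would need
# site-specific heuristics or a saved copy of the old content.
EMPTY_MARKERS = ("there is nothing here", "page not found", "no longer available")


def is_https_upgrade(old_url: str, new_url: str) -> bool:
    """True if new_url differs from old_url only by going from http to https."""
    old, new = urlsplit(old_url), urlsplit(new_url)
    return (
        old.scheme == "http"
        and new.scheme == "https"
        and old.hostname == new.hostname
        and (old.path or "/") == (new.path or "/")
        and old.query == new.query
    )


def classify(requested_url: str, final_url: str, status: int, body: str) -> str:
    """Classify a fetch result; status and body are from the final response."""
    if not 200 <= status < 300:
        return "broken"
    text = body.lower()
    if any(marker in text for marker in EMPTY_MARKERS):
        # The URL still 'works', but what we wanted is no longer there.
        return "content gone"
    if final_url == requested_url:
        return "unchanged"
    if is_https_upgrade(requested_url, final_url):
        return "unchanged (HTTPS upgrade)"
    return "redirected"
```

The point of the sketch is that the status code alone can't tell you whether a URL has stayed cool. The 'content gone' case and the 'redirected' case both end in a 2xx response, and only looking at what came back separates them.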

What is considered 'the content of a URL' is a situational thing, and not something that can be evaluated with hard and fast rules. The URL of a blog home page or an Atom syndication feed generally changes the textual contents on a regular basis, but pretty much everyone would say that the URL hasn't changed in the sense of the W3C article; you still get the same sort of thing by going to the same URL. A wiki page may be substantially overhauled and rewritten, but generally people will still consider it to have not changed in the cool URL sense.

(Although syndication feed URLs are often fetched by programs, my view is that 'change' in the URLs has to be viewed through the lens of people and browsers. A syndication feed reader needs to be able to cope with things like HTTP redirections, especially from HTTP to HTTPS.)

Unfortunately this means that some websites we'd like to have cool URLs don't always keep them that way. One unfortunate example is Wikipedia. For fairly rational reasons, Wikipedia keeps tinkering with the names (and thus the URLs) of entries, as well as their internal HTML anchors (which have URLs associated with them). Over the long term, links to Wikipedia entries aren't necessarily stable.

(A lot of Wikipedia entries do have stable URLs in practice, but not all of them. Links to anchors in Wikipedia entries seem to be less stable, at least for the sort of anchor links that I use here on Wandering Thoughts.)
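If you wanted to check whether a link to an anchor still points at something, one rough approach is to look for the URL's fragment among the `id` (and old-style `<a name>`) attributes of the fetched page. The following is a sketch using only the Python standard library; the class and function names are my own invention, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class AnchorCollector(HTMLParser):
    """Collect every value that a URL fragment could point at."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Any element's id is a valid target, as is <a name="...">.
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def fragment_exists(url: str, html: str) -> bool:
    """True if the URL has no fragment or the fragment is present in html."""
    fragment = unquote(urlsplit(url).fragment)
    if not fragment:
        return True
    collector = AnchorCollector()
    collector.feed(html)
    return fragment in collector.anchors
```

This only tells you that an anchor with that name exists, not that the section under it still says what it used to, so it's a necessary check rather than a sufficient one.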

(This elaborates a bit on a Fediverse post.)

web/CoolUrlsKeepTheirContents written at 21:58:26
