In practice, cool URLs change (eventually)

September 4, 2020

The idea that "cool URLs don't change" has been an article of faith for a very long time. However, at this point we have more than 20 years of experience with the web, and anyone who's been around for a significant length of time can tell you that in practice, cool URLs change all of the time (and I don't mean just minor changes like preferring HTTPS over HTTP). Over a sufficient length of time, internal site page layouts change (sometimes because URL design is hard), people move domains or hosts within a domain, and sometimes cool URLs even go away and must be resurrected, sometimes by hand (through people re-publishing and re-hosting things) and sometimes through the Wayback Machine. This decay in cool URLs is so pervasive and well recognized that we have a term for it, link rot.

(Of course, you're a good person, and your cool URLs don't change. But this is the web and we all link to each other, so it's inevitable that some other people's cool URLs that you link to will suffer from link rot.)

Despite link rot being widely recognized as very real, I think that in many ways we're in denial about it. We keep pretending (both culturally and technically) that if we wish hard enough and try hard enough (and yell at people hard enough), all important URLs will be cool URLs that are unchanging forever. But this is not the case and is never going to be the case, and it's long past time that we admitted it and started dealing with it. Whether we like it or not, it is better to deal with the world of the web as it is.

Culturally, we recite "cool URLs don't change" a lot, which makes it hard to talk about how best to evolve URLs over time, how to preserve content that you no longer want to host, and other issues like that. I don't think anyone's written a best practices document for 'so you want to stop having a web site (but people have linked to it)', never mind what a company can do to be friendly for archiving when it goes out of business or shuts down a service. And that's just scratching the surface; there's a huge conversation to be had about the web over the long term once we admit out loud that nothing is forever around here.

(The Archive Team has opinions. But there are some hard issues here; there are people who have published words on the Internet, not under CC licenses, and then decided for their own reasons that they no longer want those words on the Internet despite the fact that other people like them, linked to them a lot, and so on.)

Technically, how we design our web systems and web environments often mostly ignores the possibility of future changes in either our own cool URLs or other people's. What this means in more tangible terms is really a matter for other entries, but if you look around you can probably come up with some ideas of your own. Just look for the pain points that would show up in your own web publishing environment if either your URLs or other people's URLs changed.

(One pain point and sign of problems is that it's a thing to spider your own site to find all of the external URLs so you can check if they're still alive. Another pain point is that it can be so hard to automatically tell if a link is still there, since not all dead links either fail entirely or result in HTTP error codes. Just ask people who have links pointing to what are now parked domains.)
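To make the second pain point concrete, here is a minimal Python sketch of a link check that looks past the HTTP status code. Everything in it is my own illustration rather than an established tool: the names classify_link and check_link, the 'suspect' category, and especially the PARKED_HINTS phrases are assumptions, and the heuristics will both miss real dead links and flag harmless changes (a redirect from example.com to www.example.com counts as a host change here).

```python
import http.client
import urllib.error
import urllib.request
from urllib.parse import urlsplit

# Hypothetical phrases that parked-domain pages sometimes contain; an
# assumption for illustration, not a reliable or complete list.
PARKED_HINTS = ("this domain is for sale", "domain is parked", "buy this domain")


def classify_link(original_url, final_url, status, body=""):
    """Return 'dead', 'suspect', or 'ok' for one fetched link.

    'suspect' covers links that answered successfully but probably no
    longer hold the content that was originally linked to.
    """
    if status is None or status >= 400:
        return "dead"
    orig, final = urlsplit(original_url), urlsplit(final_url)
    # A redirect to another host may be a legitimate move or a takeover.
    if orig.hostname != final.hostname:
        return "suspect"
    # A deep link that now lands on the front page has likely lost its content.
    if orig.path not in ("", "/") and final.path in ("", "/"):
        return "suspect"
    lowered = body.lower()
    if any(hint in lowered for hint in PARKED_HINTS):
        return "suspect"
    return "ok"


def check_link(url, timeout=10):
    """Fetch url and classify it; network failures count as dead."""
    request = urllib.request.Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Only the first 64 KB is read; enough for a crude phrase check.
            body = response.read(65536).decode("utf-8", errors="replace")
            return classify_link(url, response.geturl(), response.status, body)
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        return classify_link(url, url, err.code)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return classify_link(url, url, None)
```

You would run check_link() over the external URLs that spidering your own site turns up, then look at the 'suspect' ones by hand. That a human still has to be in the loop is more or less the point.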


Last modified: Fri Sep 4 00:41:56 2020