A subtle advantage of generating absolute path URLs during HTML rendering

February 22, 2014

If you're writing a multi-page web application of some sort, sooner or later you'll want to turn some abstract name for another page into the URL for that page, or more exactly into a URL that you can put into a link on the current page. For a non-hypothetical example, you might be writing a wiki or a blog engine and linking one entry to another. When you're doing this, a certain sort of person will hear a little voice of temptation urging them to be clever and generate relative paths in those URLs. After all, if you're rendering /a/path/page1 and linking to /a/path/page2, you can simply generate a '<a href="page2">' for your link instead of putting the whole absolute path in.

(And this sort of cleverness appeals to any number of programmers.)

The obvious reason not to do this is that it's more work. Your code almost certainly already has to be able to generate the absolute URLs for pages, while converting those absolute URLs to relative ones will take additional code. So let's assume that you have a library that will do this for free. Generating relative URLs is still a bad idea because of what it does to your (potential) caching.
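To make the extra work concrete, here is a minimal sketch of what relativizing a link involves, assuming pages are identified by their absolute URL paths. The function names `absolute_href` and `relative_href` are hypothetical illustrations, not DWiki's actual code; the path arithmetic is done by `posixpath.relpath` from Python's standard library.

```python
import posixpath

def absolute_href(target_path):
    """The absolute path URL is just the target's path; no page context is needed."""
    return target_path

def relative_href(target_path, current_page_path):
    """Relativize target_path against the directory of the page being rendered.

    Hypothetical helper: note that it needs to know the current page,
    which the absolute version does not.
    """
    base_dir = posixpath.dirname(current_page_path)
    return posixpath.relpath(target_path, start=base_dir)

# Linking to /a/path/page2 from two different pages.
print(absolute_href("/a/path/page2"))                    # /a/path/page2
print(relative_href("/a/path/page2", "/a/path/page1"))   # page2
print(relative_href("/a/path/page2", "/a/other/page3"))  # ../path/page2
```

The relative version produces a different href for the same target depending on which page is being rendered, which is what matters for caching.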

An HTML fragment with absolute path URLs is page-independent; it can be included as-is anywhere on your site and it will still work. But an HTML fragment with relative path URLs is page-dependent. It works only on a specific page and can't be reused elsewhere, or at least it can only be reused on certain select other pages (those in the same directory), not on any arbitrary page. Relative path URLs therefore require more cache entries; instead of caching 'HTML fragment X', you have to cache 'HTML fragment X in the context of directory Y' (and repeat for all the different Ys you have). Some web apps have a lot of such directories and thus would need a huge number of such cache entries, which is rather wasteful, to put it one way.
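The difference in cache entries can be sketched as follows. This is only an illustration under assumptions: `render_fragment`, `FragmentCache`, the page paths, and the `example.org` host are all made up for the example and are not how any particular blog engine or wiki does it.

```python
import posixpath
from urllib.parse import urljoin

def render_fragment(target_path, page_path=None):
    """Render a link fragment: absolute if page_path is None, else relative to it."""
    if page_path is None:
        href = target_path
    else:
        href = posixpath.relpath(target_path, start=posixpath.dirname(page_path))
    return '<a href="%s">link</a>' % href

class FragmentCache:
    """A trivial in-memory cache: render on a miss, reuse on a hit."""
    def __init__(self):
        self.entries = {}

    def get(self, key, render):
        if key not in self.entries:
            self.entries[key] = render()
        return self.entries[key]

# Hypothetical pages that all include the same fragment linking to `target`.
pages = ["/a/path/page1", "/a/other/page3", "/b/index", "/a/path/page4"]
target = "/a/path/page2"

abs_cache = FragmentCache()
rel_cache = FragmentCache()
for page in pages:
    # Absolute: the key is just the fragment's identity.
    abs_cache.get(target, lambda: render_fragment(target))
    # Relative: the key must also include the directory of the including page.
    context_dir = posixpath.dirname(page)
    rel_cache.get((target, context_dir), lambda: render_fragment(target, page))

print(len(abs_cache.entries))  # 1
print(len(rel_cache.entries))  # 3, one per distinct directory

# Reusing the /a/path version of the relative fragment on another page
# sends the browser to the wrong place.
print(urljoin("http://example.org/a/other/page3", "page2"))
# http://example.org/a/other/page2
```

With absolute path URLs the cache holds one entry no matter how many pages include the fragment; with relative ones it grows with the number of distinct directories.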

This is one of those fortuitous design decisions that I stumbled into back at the start of writing DWiki. I made it due to laziness (I didn't want to write something to relativize links, however nifty it would have been) but it turned out to be an excellent idea due to the needs of caching.

(Note that in most blog engines, one sort of 'HTML fragment' that you will be reusing is blog entries, or at least their lead-in text. Blogs typically have lots of places where entries appear.)

Comments on this page:

Another reason absolute URLs are fine is that they don't cost as much bandwidth as you might think once gzip is applied: http://www.jefftk.com/p/repeated-html-text-is-cheap

Written on 22 February 2014.

Last modified: Sat Feb 22 00:12:00 2014