Thinking about the merits of 'universal' URL structures
I am reasonably fond of my URLs here on Wandering Thoughts (although I've made a mistake or two in their design), but I have potentially made life more difficult for a future me in how I've designed them. The two difficulties I've given to a future self are that my URLs are bare pages, without any extension on the end of their name, and that displaying some important pages requires a query parameter.
The former is actually quite common out there on the Internet, as many people consider the .html (or .htm) to be ugly and unaesthetic. You can find lots and lots of things that leave off the .html, at this point perhaps more than leave it on. But it does have one drawback, which is that it makes it potentially harder to move your content around. If you use URLs that look like '/a/b/page', you need a web server environment that can serve those as text/html, either by running a server-side app (as I do with DWiki) or by suitable server configuration so that such extension-less files are text/html. Meanwhile, pretty much anything is going to serve a hierarchy of .html files correctly. In that sense, a .html on the end is what I'll call a universal URL structure.
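To make the "suitable server configuration" requirement concrete, here is a minimal sketch using Python's standard http.server module. It assumes the pages are plain files without an extension under the directory being served; the class name, the serve() helper, and the port number are my own choices for illustration, and this is not how DWiki does it.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class ExtensionlessHTMLHandler(SimpleHTTPRequestHandler):
    """Static file handler that treats files with no extension as HTML."""

    # guess_type() consults this map by file extension first. The empty
    # string is the "extension" of a name such as 'page', so mapping it to
    # text/html is the one piece of configuration that '/a/b/page' needs.
    extensions_map = {**SimpleHTTPRequestHandler.extensions_map, "": "text/html"}


def serve(port: int = 8000) -> None:
    """Serve the current directory until interrupted (port is arbitrary)."""
    ThreadingHTTPServer(("", port), ExtensionlessHTMLHandler).serve_forever()
```

Calling serve() from the top of a static dump would serve '/a/b/page' as text/html. A tree of .html files needs none of this, which is the point of calling that structure universal.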
What makes a URL structure universal is that in a pinch, pretty much any web server will do to serve a static version of your files. You don't need the ability to run things on the server and you don't need any power over the server configuration (and thus even if you have the power, you don't have to use it). Did your main web server explode? Well, you can quickly dump a static version of important pages on a secondary server somewhere, bring it up with minimal configuration work, and serve the same URLs. Whatever happens, the odds are good that you can find somewhere to host your content with the same URLs.
I think that right now there are only two such universal URL structures: plain pages with .html on the end, and directories (i.e., structuring everything as '/a/b/page/'). The specific mechanisms of giving a directory an index page of some kind will vary, but probably almost everything can actually do it.
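As a rough illustration of the two structures, the sketch below maps a logical page name to the URL you would publish and the file a static dump would have to write. The function name and the 'index.html' default are assumptions; the index file name in particular depends on the web server.

```python
from typing import Tuple


def universal_layout(page: str, style: str,
                     index_name: str = "index.html") -> Tuple[str, str]:
    """Return (url, file_on_disk) for a page name such as 'a/b/page'.

    style is 'html' for plain pages with .html on the end, or 'directory'
    for the '/a/b/page/' form backed by an index file.
    """
    page = page.strip("/")
    if style == "html":
        return f"/{page}.html", f"{page}.html"
    if style == "directory":
        # index_name is server dependent; 'index.html' is only a common default.
        return f"/{page}/", f"{page}/{index_name}"
    raise ValueError(f"unknown style: {style!r}")
```

Either way, the file on disk can be derived from the URL alone, with no knowledge of the server that will eventually host it.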
On the other hand, at this point in the evolution of the web and the Internet in general it doesn't make sense to worry about this. Clever URLs without .html and so on are extremely common, so it seems very likely that you'll always be able to do this without too much work. Maybe one convenient place to publish your pages won't support it, but you'll be able to find another, or easily search for configuration recipes for how to do it on the web server of your choice.
(For example, in doing some casual research for this entry I discovered that GitHub Pages lets you omit the .html on URLs for things that actually have them in the underlying repository. GitHub's server-side handling of this automatically makes it all work. See this Stack Overflow Q&A, and you can test it for yourself on your favorite GitHub Pages site. I looked at GitHub Pages because I was thinking of it as an example of almost no effort hosting one might reach for in a pinch, and here it is already supporting what you'd need.)
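The observable behaviour described here (a request for '/a/b/page' being answered from 'a/b/page.html') amounts to a fallback lookup. The sketch below shows one way such a lookup could work over a set of stored file names. It is a guess at the general technique, not GitHub's actual implementation, and resolve() and its arguments are hypothetical names.

```python
from typing import Optional, Set


def resolve(url_path: str, files: Set[str]) -> Optional[str]:
    """Map a request path to a stored file name, or None if nothing matches."""
    name = url_path.lstrip("/")
    if name == "" or name.endswith("/"):
        # Directory-style URL: look for an index page inside it.
        candidates = [name + "index.html"]
    else:
        # Exact file first, then the same name with .html added.
        candidates = [name, name + ".html"]
    for candidate in candidates:
        if candidate in files:
            return candidate
    return None
```

With files = {"a/b/page.html"}, both resolve("/a/b/page", files) and resolve("/a/b/page.html", files) return the same stored file, so the repository keeps universal names while the public URLs stay extension-less.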
PS: Having query parameters on your URLs will make your life harder here; you probably need either server side access to something on the order of Apache's RewriteCond or to add some JavaScript into all the relevant pages that will look for any query parameters and do magic things with them that will either provide the right page content or at least redirect to a better URL.
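Whichever mechanism does the work, the decision it has to make is roughly the one sketched below: look at the query parameters and pick a plain URL to serve or redirect to. The parameter name 'comments' and the '/comments/' suffix are hypothetical placeholders, not DWiki's real parameters, and the sketch is in Python only to show the logic that a rewrite rule or a page script would need to express.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping from a query parameter name to the path suffix of a
# pre-rendered static page; real parameter names will differ.
STATIC_SUFFIX = {"comments": "/comments/"}


def static_url_for(url: str) -> str:
    """Return a query-free URL that a static server could answer."""
    parts = urlsplit(url)
    # keep_blank_values=True so a bare '?comments' with no '=' still counts.
    params = parse_qs(parts.query, keep_blank_values=True)
    for name, suffix in STATIC_SUFFIX.items():
        if name in params:
            return parts.path.rstrip("/") + suffix
    # Unknown or absent parameters: fall back to the plain page.
    return parts.path
```

Under these assumptions, static_url_for("/a/b/page?comments") gives '/a/b/page/comments/', a directory-style URL that any server can host.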
(DWiki has decent reasons for using query parameters, but I feel like perhaps I should have tried harder or been cleverer.)