The web is, in a sense, designed for serving static files

May 14, 2022

One thought I've been turning over lately is that one reason it's historically been so easy to serve static files on the web is that, in a sense, the web was designed for it. This was not so much through calculation as through necessity: most or all of the early web servers served static files, especially the very first web server. This likely created strong pressure to make HTTP friendly to static files, since anything else would have required a more complicated web server for no practical gain.

(Or they almost entirely served static files. There is a provision in the very first version of HTTP for searching an index, with a presumably dynamic result generated by a program.)

The obvious area where this shows is that URL paths map directly onto Unix file paths. When I write it this way in web terms it sounds natural, but in fact the 'URL path' is really a per-server identifier for the particular page. There are a lot of ways to index and address content other than hierarchical paths, but the web picked paths instead of some other identifier, despite paths not always being a good way to identify content in practice (just ask the people who've had to decide what the proper URL structure is for a REST application).

(There are even other plausible textual representations, although paths are in some sense the most concise one. I prefer not to think about a version of the web that used something like X.500 Distinguished Names to identify pages.)
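
To make the directness of that mapping concrete, here is a minimal sketch in Python of the path handling at the heart of a static file server. The /var/www document root is purely illustrative, and real servers do more (index files, content types, and so on); the point is just how little translation the URL path needs.

    import os.path

    DOCUMENT_ROOT = "/var/www"   # illustrative; any directory would do

    def url_path_to_file(url_path):
        # Normalize so '..' components can't escape the document root;
        # the very earliest servers didn't always bother with even this.
        relative = os.path.normpath(url_path.lstrip("/"))
        if relative == ".." or relative.startswith("../"):
            raise ValueError("URL path escapes the document root")
        return os.path.join(DOCUMENT_ROOT, relative)

    # The URL path '/docs/intro.html' becomes the Unix file path
    # '/var/www/docs/intro.html', almost character for character.
    print(url_path_to_file("/docs/intro.html"))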

The very first HTTP protocol is startlingly limited, although Basic HTTP followed very soon afterward with niceties like being able to return content types other than HTML. But even the first version has the idea of dynamic content, in the form of searching a named index, a feature that feels explicitly designed for the index to be a program that performs the search and returns (HTML) results.
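
As an illustration, here is a sketch of what that first protocol looks like on the wire, again in Python. The request is a single line, the response is raw HTML with no status line or headers, and the server closing the connection marks the end of the document. The host name here is illustrative, and few modern servers will still answer an HTTP/0.9 request like this.

    import socket

    def http09_get(host, path):
        # The entire request is one line; the response is the raw
        # document, ended by the server closing the connection.
        with socket.create_connection((host, 80)) as sock:
            sock.sendall("GET {}\r\n".format(path).encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break   # connection closed: document is complete
                chunks.append(data)
        return b"".join(chunks)

    # An ordinary static page fetch:
    page = http09_get("example.org", "/index.html")

    # The one dynamic feature: searching a named index, with keywords
    # appended after a '?' and separated by '+'. The index was expected
    # to be a program that ran the search and returned HTML results.
    results = http09_get("example.org", "/index?static+files")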

Early HTTP and HTML are so minimal that I'm not sure I could point to anything else that's biased in favor of static files. Arguably the biases are in what's left out; for example, there's no indicator to client programs that what they've fetched should be refreshed every so often. Early HTTP left that up to the users (and then Netscape added it around 1995, so people did eventually want this).

It's entirely possible that my thought here is well off the mark. Certainly if you're on a Unix workstation designing a protocol that serves multiple pieces of content that must be distinguished somehow, it's natural to use paths as the identifiers. As support for this naturalness, consider Gopher, which also used "a file-like hierarchical arrangement that would be familiar to users" despite having significantly different navigation.
