My view of setting up sane web server application delegations
One of the things that drives the appeal of CGI scripts is their easy deployment story. Compared to that, a bunch of programs that each implement a web application has a deployment configuration problem. When you have a bunch of applications on the same machine (and using the same IP), you need a central web server to take incoming requests and dispatch them out to the appropriate individual web app (based on the incoming host and URL), and this central web server needs to be configured somehow.
So how do you do this in a way that leads to easy, sane deployment, especially in a multiple user environment where not everyone can sit there editing central configuration files?
My views on this have come around to the idea that you want some
equivalent of Apache per-directory
.htaccess files in a directory
structure that directly reflects the hosts and URLs being delegated.
There are a couple of reasons for this.
First, a directory based structure creates natural visibility and
enforces single ownership of URL and host delegations. If you own
www.fred.com/some/url, then you are in charge of that URL and
everyone underneath it. No one can screw you up by editing a big configuration file (or a set of them) and missing that some URL has already been delegated to you somewhere in hundreds of lines of configuration setup; your ownership is sitting there, visible in the filesystem, and taking it over means taking over your directory, which Unix permissions will forbid without root-level intervention.
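As a concrete sketch of what this looks like on disk (the root directory, hostnames, and file names here are entirely invented for illustration), the delegation tree might be:

```
/web/routes/                    # master server's delegation root
  www.fred.com/                 # one directory per hostname
    some/
      url/                      # owned by user fred
        htaccess-equiv          # fred's delegation configuration
  www.barney.org/
    htaccess-equiv
```

Here taking over www.fred.com/some/url means taking over the url/ directory itself, which only fred (or root) can do.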
Second, using some equivalent of
.htaccess files creates delegation
of configuration and control. Within the scope of the configuration
allowed in the
.htaccess equivalent, I don't need to involve a
sysadmin in what I do to hook up my application, control access to
it, have the native master web server handle some file serving for
me, or whatever. Of course the minimal approach is to support none of this in the master server (where the only thing the .htaccess equivalent can do is tell the master server how to talk to my web app process), but I think it's useful to do more than that. If nothing else, directly serving static files is a commonly desired feature. (Apache's .htaccess is massively powerful here, which makes it quite useful and basically the gold standard of this. Many master web servers will probably be more minimal.)
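For illustration, a more-than-minimal .htaccess equivalent might contain something like the following; the directive names and syntax are invented here, since any real master server would define its own:

```
# hypothetical delegation config for www.fred.com/some/url
backend  http://localhost:8001/      # how to talk to my web app process
static   /media/  ./htdocs/media/    # files the master server serves directly
access   deny 10.0.0.0/8             # simple access control
```

The important property is that everything in this file can only affect the URL space that the directory itself delegates.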
To the extent that I can get away with it, I will probably configure all of my future Apache setups this way (at least for personal sites). Unfortunately there are some things you can't configure this way in Apache, often for good reason (for example, mod_wsgi).
(This entry is inspired by a Twitter conversation with @eevee.)
Sidebar: doing this efficiently
Some people will quail at the idea of the master web server doing a whole series of directory and file lookups in the process of handling each request. I have two reactions to this. First, this whole idea is probably not appropriate for high load web servers because on high load web servers you really want and need more central control over the whole process. If your web server machine is already heavily loaded, the last thing you want to do is enable someone to automatically set up a new high-load service on it without involving the (Dev)Ops team.
Second, it's possible to optimize the whole thing via a process of registering and (re)loading configuration setups into the running web server. This creates the possibility of the on-disk configuration not reflecting the running configuration, but that's a tradeoff you pretty much need to make unless you're going to be very restrictive. In this approach you edit your directory structure and then poke the web server with some magic command so that it takes note of your change and redoes its internal routing tables.
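The register-and-reload step can be sketched in a few lines. Assume, purely for illustration, a delegation tree of <root>/<host>/<url path...>/ directories where each delegated point holds an app.conf file naming its backend (both the layout and the file name are invented for this sketch):

```python
import os

def build_routes(root):
    """Walk a delegation tree laid out as <root>/<host>/<url path...>/
    where each delegated point contains an 'app.conf' file naming its
    backend, and build a {(host, url): backend} routing table.
    (The layout and file name are invented for this sketch.)"""
    routes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if "app.conf" not in filenames:
            continue
        rel = os.path.relpath(dirpath, root)
        parts = rel.split(os.sep)
        host = parts[0]
        url = "/" + "/".join(parts[1:])
        with open(os.path.join(dirpath, "app.conf")) as f:
            backend = f.read().strip()
        routes[(host, url)] = backend
    return routes
```

The 'poke the web server' step then amounts to rebuilding this table and atomically swapping it into the running server, so edits to the on-disk tree have no effect until you ask for them.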