My view of setting up sane web server application delegations

May 31, 2015

One of the things that drives the appeal of CGI scripts is their easy deployment story. A bunch of separate programs that implement web applications, by contrast, runs into the deployment configuration problem. When you have a bunch of applications on the same machine (and using the same IP), you need a central web server to take incoming requests and dispatch them to the appropriate individual web app (based on the incoming host and URL), and this central web server needs to be configured somehow. So how do you do this in a way that leads to easy, sane deployment, especially in a multi-user environment where not everyone can sit there editing central configuration files as root?

My views on this have come around to the idea that you want some equivalent of Apache per-directory .htaccess files in a directory structure that directly reflects the hosts and URLs being delegated. There are a couple of reasons for this.
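To make the idea concrete, here is a minimal Python sketch of the per-request lookup. The details are my assumptions for illustration, not taken from any real server: delegations live under a root directory laid out as root/host/url/segments, the .htaccess equivalent is a file named .delegate, and the deepest directory containing one wins.

```python
import os

DELEGATION_FILE = ".delegate"  # hypothetical name standing in for .htaccess


def find_delegation(root, host, url_path):
    """Return (directory, leftover URL segments) for the deepest delegation.

    Walks root/host/seg1/seg2/... and remembers the deepest directory that
    contains a DELEGATION_FILE. Returns None if no directory on the path
    has one. Symlinks are not treated specially in this sketch.
    """
    segments = [s for s in url_path.split("/") if s]
    # Refuse anything that could step outside the delegation tree.
    if host in ("", ".", "..") or "/" in host or os.sep in host:
        return None
    if any(s in (".", "..") for s in segments):
        return None

    current = os.path.join(root, host)
    best = None
    for depth in range(len(segments) + 1):
        if not os.path.isdir(current):
            break
        if os.path.isfile(os.path.join(current, DELEGATION_FILE)):
            # Everything past this directory is left for the web app.
            best = (current, segments[depth:])
        if depth < len(segments):
            current = os.path.join(current, segments[depth])
    return best
```

With a .delegate file in root/www.example.com/some/url, a request for /some/url/page/1 on www.example.com would resolve to that directory with ["page", "1"] left over for the application. Choosing the deepest match is one possible policy; a server could equally stop at the first delegation it finds.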

First, a directory-based structure creates natural visibility and enforces single ownership of URL and host delegations. If you own the directory for a URL, then you are in charge of that URL and everything underneath it. No one can screw you up by editing a big configuration file (or a set of them) and missing that /some/url has already been delegated to you somewhere in hundreds of lines of configuration setup. Your ownership is sitting there visible in the filesystem, and taking it over means taking over your directory, which Unix permissions will forbid without root's involvement.
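As a rough illustration of that visibility, the following Python sketch lists every delegated directory along with the Unix user who owns it. It assumes a hypothetical marker file called .delegate as the .htaccess equivalent; the function names are mine.

```python
import os
import pwd


def owner_name(path):
    """Return the login name of the owner of path, or the numeric uid."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry (common in containers).
        return str(uid)


def list_owners(root, marker=".delegate"):
    """Return sorted (relative directory, owner) pairs for delegated dirs."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if marker in filenames:
            rel = os.path.relpath(dirpath, root)
            found.append((rel, owner_name(dirpath)))
    return sorted(found)
```

Anyone can answer "who has /some/url?" this way (or with plain ls -l), without reading a line of server configuration.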

Second, using some equivalent of .htaccess files creates delegation of configuration and control. Within the scope of the configuration allowed in the .htaccess equivalent, I don't need to involve a sysadmin in what I do to hook up my application, control access to it, have the master web server handle some file serving for me, or whatever. Of course the minimal approach is to support none of this in the master server (so that the only thing the .htaccess equivalent can do is tell the master server how to talk to my web app process), but I think it's useful to do more than that. If nothing else, directly serving static files is a commonly desired feature.
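Here is a small Python sketch of what "the scope of the configuration allowed" could look like in practice. The file format is entirely made up for illustration: one directive per line, with only a proxy directive (how to reach the web app) and a static directive (a URL prefix and a subdirectory to serve files from) on the whitelist. Anything else is rejected instead of silently ignored.

```python
# Hypothetical directive whitelist: name -> required number of arguments.
ALLOWED = {"proxy": 1, "static": 2}


def parse_delegation(text):
    """Parse the text of a delegation file into a list of (name, args).

    Blank lines and '#' comments are skipped. Unknown directives or wrong
    argument counts raise ValueError, so users cannot configure anything
    the master server has not explicitly allowed.
    """
    rules = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, *args = line.split()
        if name not in ALLOWED:
            raise ValueError(f"line {lineno}: directive {name!r} is not allowed")
        if len(args) != ALLOWED[name]:
            raise ValueError(
                f"line {lineno}: {name} takes {ALLOWED[name]} argument(s)"
            )
        rules.append((name, tuple(args)))
    return rules
```

A file containing "proxy http://127.0.0.1:8001" and "static /static files" would parse into two rules. The minimal master server described above would shrink the whitelist down to just proxy.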

(Apache .htaccess is massively powerful here, which makes it quite useful and basically the gold standard for this. Many master web servers will probably be more minimal.)

To the extent that I can get away with it, I will probably configure all of my future Apache setups this way (at least for personal sites). Unfortunately there are some things you can't configure this way in Apache, often for good reason (for example, mod_wsgi).

(This entry is inspired by a Twitter conversation with @eevee.)

Sidebar: doing this efficiently

Some people will quail at the idea of the master web server doing a whole series of directory and file lookups in the process of handling each request. I have two reactions to this. First, this whole idea is probably not appropriate for high load web servers because on high load web servers you really want and need more central control over the whole process. If your web server machine is already heavily loaded, the last thing you want to do is enable someone to automatically set up a new high-load service on it without involving the (Dev)Ops team.

Second, it's possible to optimize the whole thing via a process of registering and (re)loading configuration setups into the running web server. This creates the possibility of the on-disk configuration not reflecting the running configuration, but that's a tradeoff you pretty much need to make unless you're going to be very restrictive. In this approach you edit your directory structure and then poke the web server with some magic command so that it takes note of your change and redoes its internal routing tables.
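A minimal Python sketch of this register-and-reload approach follows. It assumes the same hypothetical layout as the rest of this entry (root/host/url/segments, with a marker file named .delegate); the Router class and its methods are my invention, not any real server's API.

```python
import os

DELEGATION_FILE = ".delegate"  # hypothetical marker file name


def build_routes(root):
    """Scan the tree once; map (host, url segment tuple) -> directory."""
    routes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if DELEGATION_FILE not in filenames:
            continue
        rel = os.path.relpath(dirpath, root)
        if rel == ".":
            continue  # the root itself is not a host
        host, *segments = rel.split(os.sep)
        routes[(host, tuple(segments))] = dirpath
    return routes


class Router:
    """In-memory routing table that only changes when reload() is called."""

    def __init__(self, root):
        self.root = root
        self.routes = build_routes(root)

    def reload(self):
        # Build the new table completely, then swap it in with a single
        # assignment so lookups never see a half-built table.
        self.routes = build_routes(self.root)

    def lookup(self, host, url_path):
        """Longest-prefix match with no filesystem access per request."""
        segments = tuple(s for s in url_path.split("/") if s)
        for end in range(len(segments), -1, -1):
            hit = self.routes.get((host, segments[:end]))
            if hit is not None:
                return hit, list(segments[end:])
        return None
```

The "magic command" could be as simple as a signal: on Unix, a server might install a SIGHUP handler that calls reload(). Until that happens, a newly created .delegate file is invisible to lookup(), which is exactly the on-disk versus running configuration gap mentioned above.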

Comments on this page:

By john at 2015-05-31 19:15:39:

.htaccess files mess with performance, though: if you allow them to be turned on, the .htaccess files have to be parsed every single time a file is served.

It also allows someone to do all kinds of funky things without you knowing about it.

Last modified: Sun May 31 01:11:07 2015