The modern Python web application stack (as I understand it)

January 25, 2011

Someone recently asked me what the modern Python equivalent of CGI was. The answer started out simple but the more I wrote the more complicated it got, until it wound up here.

At their base, modern Python web applications are written to an interface called WSGI, the 'Web Server Gateway Interface', which specifies the Python environment that web applications run in. WSGI is fairly low level (roughly on the level of CGI) and mostly good, although it has a number of annoying corner cases that everyone ignores in practice. Frameworks such as Django are generally WSGI applications, although if you're using a framework you usually get to ignore the WSGI level entirely because the framework hides it (and you want this hiding).

(Mechanically, a 'WSGI application' is a callable Python object that your server will call with a WSGI environment on each request. The WSGI specification describes the environment your callable object is handed, what you can do with it, and how you return the headers and text of your reply.)
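To make the mechanics concrete, here is a minimal sketch of such a callable. It assumes Python 3 and the updated version of the specification (PEP 3333), where the body is made of byte strings; the name `application` and the greeting text are my own choices for illustration, not anything the specification requires.

```python
def application(environ, start_response):
    """A minimal WSGI application: one callable, called once per request."""
    # environ is a dict holding CGI-style variables plus some wsgi.* keys.
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from " + path + "\n").encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Hand the status line and the headers back to the server.
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings making up the body.
    return [body]
```

Everything a framework does for you ultimately sits on top of a function shaped like this.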

To actually deploy and run a WSGI app, you need something that provides a WSGI environment; call this a 'WSGI server'. In theory your webserver could do this, translating directly from HTTP to WSGI and back, but in practice this is relatively rare (although Apache's mod_wsgi is apparently well regarded). So you usually need a level of indirection; your webserver will go from HTTP to some protocol and speak that protocol to your WSGI server, which then maps it to WSGI and runs your application. Popular protocols for this are SCGI and FastCGI; SCGI is much simpler but also less popular.
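As a rough illustration of how little there is to SCGI, here is a sketch of encoding and decoding one request, based on my reading of the SCGI specification: the headers travel as a single netstring of NUL-terminated name/value pairs, followed by the raw body. The function names are hypothetical, and the parser assumes the whole request is already in memory, which a real server cannot assume.

```python
NUL = b"\x00"

def build_scgi_request(headers, body):
    """Encode (name, value) header pairs plus a body as one SCGI request."""
    # CONTENT_LENGTH must come first, and SCGI=1 must be present.
    pairs = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")] + list(headers)
    payload = b"".join(
        name.encode("latin-1") + NUL + value.encode("latin-1") + NUL
        for name, value in pairs
    )
    # The header block is sent as a netstring: b"<length>:<payload>,"
    return str(len(payload)).encode("ascii") + b":" + payload + b"," + body

def parse_scgi_request(data):
    """Split a complete SCGI request (bytes) into (headers dict, body)."""
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing netstring length")
    length = int(length_text)
    payload, comma = rest[:length], rest[length:length + 1]
    if len(payload) != length or comma != b",":
        raise ValueError("truncated or malformed netstring")
    fields = payload.split(NUL)
    # A well-formed payload ends with NUL, so the last piece is empty.
    if fields.pop() != b"" or len(fields) % 2 != 0:
        raise ValueError("headers are not NUL-terminated name/value pairs")
    headers = {}
    for i in range(0, len(fields), 2):
        headers[fields[i].decode("latin-1")] = fields[i + 1].decode("latin-1")
    if "CONTENT_LENGTH" not in headers:
        raise ValueError("CONTENT_LENGTH header is required")
    body_length = int(headers["CONTENT_LENGTH"])
    body = rest[length + 1:length + 1 + body_length]
    return headers, body
```

The parsed headers are already close to a CGI-style environment, which is most of what a WSGI server needs to build its WSGI environment.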

(You are not restricted to these two choices; any way of talking to your web server will do. If you want to, you can use CGI (i.e., running your WSGI server as a CGI program, which means starting it up from scratch on every request). If you are sufficiently crazy you can add an extra level of indirection, having your web server run CGIs which talk SCGI to your SCGI-based WSGI server.)
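The CGI option can be sketched with the standard library's `wsgiref.handlers.CGIHandler`, which builds the WSGI environment from the CGI environment variables and writes the response to standard output. The tiny `application` here is a placeholder of my own; how your web server is told to run the script as a CGI is outside this sketch.

```python
#!/usr/bin/env python3
from wsgiref.handlers import CGIHandler

def application(environ, start_response):
    """Placeholder WSGI app; a real deployment would import its own."""
    body = b"Hello from a CGI-hosted WSGI app\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

if __name__ == "__main__":
    # The web server starts this script afresh for every request;
    # CGIHandler acts as the WSGI server for that one request.
    CGIHandler().run(application)
```

This pays the full Python startup cost on every request, which is exactly why the persistent-process protocols exist.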

Today I would expect that a common Python web application stack thus looks something like this:

  • the world talks HTTP to Apache or lighttpd or nginx
  • your web server (whatever it is) talks FastCGI to your WSGI server (which is running as a separate, independent process or set of processes)
  • your WSGI server constructs a WSGI environment from the HTTP request and calls your framework of choice as a WSGI application
  • your framework calls your code

Results trickle back in reverse.
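The last two hops of that stack can be imitated in a few lines. This is only a sketch of what a WSGI server does around each call, with no network protocol involved; `run_once` and `demo_app` are hypothetical names, and `wsgiref.util.setup_testing_defaults` stands in for the environment a real server would build from the incoming request.

```python
from wsgiref.util import setup_testing_defaults

def run_once(app, environ):
    """Call a WSGI app for one request and collect (status, headers, body)."""
    response = {}
    chunks = []

    def start_response(status, headers, exc_info=None):
        response["status"] = status
        response["headers"] = headers
        # WSGI requires returning a write() callable; collect what it is given.
        return chunks.append

    result = app(environ, start_response)
    try:
        for chunk in result:
            chunks.append(chunk)
    finally:
        # The server must call close() on the result if it has one.
        if hasattr(result, "close"):
            result.close()
    return response["status"], response["headers"], b"".join(chunks)

def demo_app(environ, start_response):
    body = ("You asked for " + environ["PATH_INFO"] + "\n").encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

environ = {}
setup_testing_defaults(environ)  # fills in REQUEST_METHOD, wsgi.input, etc.
environ["PATH_INFO"] = "/hello"
status, headers, body = run_once(demo_app, environ)
```

A real WSGI server does the same dance, except that it builds `environ` from what the web server sent it and writes the collected result back over the same connection.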

WSGI is not a completely comprehensive standard; the most important thing it omits (as outside of its scope) is how WSGI applications get created and configured. The WSGI specification says that WSGI servers are handed a Python callable and they call it with a WSGI environment on each request, but it doesn't say how the callable object is created; that's up to your server. Each WSGI server may have a different answer to this, and if you're writing one from scratch (for any environment) you can pick whichever method is most convenient for you, down to hard-coding the creation of your application callable in your main() function.

(By now there may be a de facto Python standard on how WSGI servers are supposed to do this; I haven't been paying close attention for a while.)
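One way to do this yourself is sketched below: a factory builds the application callable from configuration that is loaded once, and a hard-coded `main()` hands the result to a WSGI server. The factory name, the configuration dict, and the port are all made up for the example, and `wsgiref.simple_server` is used only because it ships with Python, not because it is what you would deploy.

```python
from wsgiref.simple_server import make_server

def make_application(config):
    """Build a WSGI callable that closes over already-loaded configuration."""
    greeting = config.get("greeting", "hello")  # read once, at startup

    def application(environ, start_response):
        # Per-request code only uses what the closure captured.
        body = (greeting + "\n").encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return application

def main():
    # Hypothetical configuration; a real program would load a file here.
    config = {"greeting": "hello from main()"}
    app = make_application(config)
    # Serve until interrupted; host and port are arbitrary choices.
    with make_server("localhost", 8000, app) as httpd:
        httpd.serve_forever()
```

Calling `main()` starts the server. The point is that everything before `make_server()` is your own business, which is exactly the part WSGI leaves unspecified.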

The WSGI specification itself is short and includes quite readable example code for simple WSGI apps and for most of a CGI-based WSGI server. WSGI is really not that complicated (although there are subtle corners).


Comments on this page:

From 96.235.41.115 at 2011-01-25 12:06:09:

I used mod_wsgi when I wrote a Python web app and was very favorably impressed. It runs the Python code in a separate process, as a UID of your choosing, so no messing around with FastCGI+suexec+something. Very easy to isolate vhosts. Controls on parallelism.

I hear you on the undefined entry point. mod_wsgi defaults to "application", with much the same API as the callback for serving; you can change it with a directive though, to match something else.

By cks at 2011-01-25 17:49:49:

It's more than just the undefined entry point, since you generally want to run code before the WSGI server starts (so that you can load your configuration once, instead of on every request). Generally you want to create some sort of closure or encapsulated callable object and pass that to the WSGI server as the 'application', and of course setting that up is completely outside of the WSGI specification.

(I suppose you could get around this with our old friend the singleton, so that it only looks like you're loading the configuration from scratch on every request.)

From 173.72.88.77 at 2011-01-27 07:38:01:

As before (yeah, DanielMartin - I lost my password for here again), I'll take issue with the "much" in "much simpler".

Granted, the FastCGI spec's clarity is hindered by specifying its structures in a kind of pseudo-C, but the protocol itself is really very simple once you accept its design decision to multiplex queries/responses over one channel.

By cks at 2011-03-04 02:11:38:

I think that FastCGI is inherently more complex than SCGI in large part because it uses a much more complex wire encoding, one that requires you to write significant amounts of code. I wrote this up in WhyFastCGIIsComplex.

Written on 25 January 2011.

Last modified: Tue Jan 25 01:10:50 2011