Wandering Thoughts archives

2025-01-16

Some stuff about how Apache's mod_wsgi runs your Python apps (as of 5.0)

We use mod_wsgi to host our Django application, but if I understood the various mod_wsgi settings for how to run your Python WSGI application when I originally set it up, I've forgotten it all since then. Due to recent events, exactly how mod-wsgi runs our application and what we can control about that is now quite relevant, so I spent some time looking into things and trying to understand settings. Now it's time to write all of this down before I forget it (again).

Mod_wsgi can run your WSGI application in two modes, as covered in the quick configuration guide part of its documentation: embedded mode, which runs a Python interpreter inside a regular Apache process, and daemon mode, where one or more Apache processes are taken over by mod_wsgi and used exclusively to run WSGI applications. Normally you want to use daemon mode, and you have to use daemon mode if you want to do things like run your WSGI application as a Unix user other than the web server's normal user or use packages installed into a Python virtual environment.

(Running as a separate Unix user puts some barriers between your application's data and a general vulnerability that gives the attacker read and/or write access to anything the web server has access to.)

To use daemon mode, you need to configure one or more daemon processes with WSGIDaemonProcess. If you're putting packages (such as Django) into a virtual environment, you give an appropriate 'python-home=' setting here. Your application itself doesn't have to be in this venv. If your application lives outside your venv, you will probably want to set either or both of 'home=' and 'python-path=' to, for example, its root directory (especially if it's a Django application). The corollary to this is that any WSGI application that uses a different virtual environment, or 'home' (starting current directory), or Python path needs to be in a different daemon process group. Everything that uses the same process group shares all of those.

To associate a WSGI application or a group of them with a particular daemon process, you use WSGIProcessGroup. In simple configurations you'll have WSGIDaemonProcess and WSGIProcessGroup right next to each other, because you're defining a daemon process group and then immediately specifying that it's used for your application.

Within a daemon process, WSGI applications can run in either the main Python interpreter or a sub-interpreter (assuming that you don't have sub-interpreter specific problems). If you don't set any special configuration directive, each WSGI application will run in its own sub-interpreter and the main interpreter will be unused. To change this, you need to set something for WSGIApplicationGroup, for instance 'WSGIApplicationGroup %{GLOBAL}' to run your WSGI application in the main interpreter.

Some WSGI applications can cohabit with each other in the same interpreter (where they will potentially share various bits of global state). Other WSGI applications are one to an interpreter, and apparently Django is one of them. If you need your WSGI application to have its own interpreter, there are two ways to achieve this; you can either give it a sub-interpreter within a shared daemon process, or you can give it its own daemon process and have it use the main interpreter in that process. If you need different virtual environments for each of your WSGI applications (or different Unix users), then you'll have to use different daemon processes and you might as well have everything run in their respective main interpreters.

(After recent experiences, my feeling is that processes are probably cheap and sub-interpreters are a somewhat dark corner of Python that you're probably better off avoiding unless you have a strong reason to use them.)

You normally specify your WSGI application to run (and what URL it's on) with WSGIScriptAlias. WSGIScriptAlias normally infers both the daemon process group and the (sub-interpreter) 'application group' from its context, but you can explicitly set either or both. As the documentation notes (now that I'm reading it):

If both process-group and application-group options are set, the WSGI script file will be pre-loaded when the process it is to run in is started, rather than being lazily loaded on the first request.

I'm tempted to deliberately set these to their inferred values simply so that we don't get any sort of initial load delay the first time someone hits one of the exposed URLs of our little application.

For our Django application, we wind up with a collection of directives like this (in its virtual host):

WSGIDaemonProcess accounts ....
WSGIProcessGroup accounts
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias ...

(This also needs a <Directory> block to allow access to the Unix directory that the WSGIScriptAlias 'wsgi.py' file is in.)

If we added another Django application in the same virtual host, I believe that the simple update to this would be to add:

WSGIDaemonProcess secondapp ...
WSGIScriptAlias ... process-group=secondapp application-group=%{GLOBAL}

(Plus the <Directory> permissions stuff.)

Otherwise we'd have to mess around with setting the WSGIProcessGroup and WSGIApplicationGroup on a per-directory basis for at least the new application. If we specify them directly in WSGIScriptAlias we can skip that hassle.

(We didn't used to put Django in a venv, but as of Ubuntu 24.04, using a venv seems the easiest way to get a particular Django version into some spot where you can use it. Our Django application doesn't live inside the venv, but we need to point mod_wsgi at the venv so that our application can do 'import django.<...>' and have it work. Multiple Django applications could all share the venv, although they'd have to use different WSGIDaemonProcess settings, or at least different names with the same other settings.)

python/ModWsgiHowAppsRun written at 23:13:25;


Page tools: See As Normal.
Search:
Login: Password:

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.