A mod_wsgi problem with serving both HTTP and HTTPS from the same WSGI app

May 24, 2015

This is kind of a cautionary tale. It may not be true any more (I believe that I ran into this back in 2013, probably with a 3.x version of mod_wsgi), but it's probably representative of the kinds of things that you can run into with Python web apps in an environment that mixes HTTP and HTTPS.

Once upon a time I tried converting my personal site from lighttpd plus a CGI based lashup for DWiki to Apache plus mod_wsgi serving DWiki as a WSGI application. At the time I had not yet made the decision to push all (or almost all) of my traffic from HTTP to HTTPS; instead I decided to serve both HTTP and HTTPS alongside each other. The WSGI configuration I set up for this was what I felt was pretty straightforward. Outside of any particular virtual host stanza, I defined a single WSGI daemon process for my application and said to put everything in it:

WSGIDaemonProcess cspace2 user=... processes=15 threads=1 maximum-requests=500 ...
WSGIProcessGroup cspace2

Then in each of the HTTP and HTTPS versions of the site I defined appropriate Apache stuff to invoke my application in the already defined WSGI daemon process. This was exactly the same in both sites, because the URLs and everything were the same:

WSGIScriptAlias /space ..../cspace2.wsgi
<Directory ...>
   WSGIApplicationGroup cspace2
   ...

(Yes, this is what is by now old syntax and may have been old even back at the time; today you'd specify the process group and/or the application group in the WSGIScriptAlias directive.)

This all worked and I was happy. Well, I was happy for a while. Then I noticed that sometimes my HTTPS site was serving pages that had HTTP URLs in their links and vice versa. In fact, what was happening is that some of the time the application was being accessed over HTTPS but thought it was using HTTP, and sometimes it was the other way around. I didn't go deep into diagnosis because other factors intervened, but my operating hypothesis was that when a new process was forked off and handled its first request, it latched onto whichever of HTTP or HTTPS that request had been made through and then used that for all of the remaining requests it handled.

(This may have been related to my mistake about how a WSGI app is supposed to find out about HTTP versus HTTPS.)
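To make the hypothesis concrete, here is a minimal sketch of how a 'latched' scheme could produce these symptoms, next to a version that asks the WSGI environment on every request. This is a hypothetical illustration, not DWiki's actual code and not a confirmed diagnosis; the function names, the example.org host, and the /space/ link are all made up for the example. The only real interface used is the standard wsgi.url_scheme key in the WSGI environ.

```python
# Hypothetical illustration of the suspected failure mode; not DWiki's code.

_latched_scheme = None  # module-level state lives as long as the process does


def latching_scheme(environ):
    """Buggy: remember the scheme of the first request this process sees."""
    global _latched_scheme
    if _latched_scheme is None:
        _latched_scheme = environ["wsgi.url_scheme"]
    return _latched_scheme


def per_request_scheme(environ):
    """Safer: consult the WSGI environment on every single request."""
    return environ["wsgi.url_scheme"]


def make_app(get_scheme):
    """Build a tiny WSGI app that emits one absolute link to itself."""
    def app(environ, start_response):
        host = environ.get("HTTP_HOST", "localhost")
        link = "%s://%s/space/" % (get_scheme(environ), host)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [link.encode("utf-8")]
    return app


def call(app, scheme):
    """Invoke a WSGI app with a minimal fake environ; return the body text."""
    environ = {"wsgi.url_scheme": scheme, "HTTP_HOST": "example.org"}
    body = app(environ, lambda status, headers: None)
    return b"".join(body).decode("utf-8")
```

If one daemon process serves both sites, the latching version keeps generating links with whatever scheme its first request happened to use, which would look a lot like what I saw. The per-request version has no state to go stale.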

This taught me a valuable lesson about mixing WSGI daemon processes and so on across different contexts, which is that I probably don't want to do that. It's tempting, because it reduces the number of total WSGI related processes that are rattling around my systems, but even apart from Unix UID issues it's clear that mod_wsgi has a certain amount of mixture across theoretically separate contexts. Even if this is a now-fixed mod_wsgi issue, well, where there's one issue there can be more. As I've found out myself, keeping things carefully separate is hard work and is prone to accidental slipups.

(It's also possible that this is a mod_wsgi configuration mistake on my part, which I can believe; I'm not entirely sure I understand the distinction between 'process group' and 'application group', for example. The possibility of such configuration mistakes is another reason to keep things as separate as possible in the future.)
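As far as I understand it, the process group picks which set of daemon processes handles a request, while the application group picks which Python sub-interpreter inside that process runs the application. One way to see what a given request actually got is a small diagnostic WSGI app along the following lines. This is only a sketch: it assumes mod_wsgi exposes the mod_wsgi.process_group and mod_wsgi.application_group keys in the WSGI environ (check this against your mod_wsgi version), and it falls back to 'unknown' if they are missing. The describe_context helper is a name invented for this example.

```python
import os


def describe_context(environ):
    """Summarise which process and mod_wsgi groups handled this request."""
    return {
        "pid": os.getpid(),
        "scheme": environ.get("wsgi.url_scheme", "unknown"),
        # Assumed mod_wsgi-specific keys; absent under other WSGI servers.
        "process_group": environ.get("mod_wsgi.process_group", "unknown"),
        "application_group": environ.get("mod_wsgi.application_group", "unknown"),
    }


def application(environ, start_response):
    """Report the request's context as plain text, one 'key: value' per line."""
    info = describe_context(environ)
    lines = ["%s: %s" % (key, info[key]) for key in sorted(info)]
    body = "\n".join(lines).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Fetching this over both HTTP and HTTPS a number of times and comparing the pid, scheme, and group values would show whether the two sites really are sharing processes and interpreters.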

Written on 24 May 2015.
