The problem with preforking Python network servers
I've been thinking about ways around the practical cost of forking in Python. There are two common alternatives: preforking servers and threads. However, both of them have issues that make me unhappy with them.
The best setup for a preforking SCGI server is a central dispatcher that
parcels new connections out to a pool of worker processes; this requires
the ability to pass file descriptors to other processes. While Unix can
do this (with
SCM_RIGHTS messages over Unix domain sockets), Python
doesn't support this part of the Unix sockets API.
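(For illustration, later versions of Python did grow this support: socket.sendmsg() and recvmsg() with SCM_RIGHTS ancillary data arrived in Python 3.3, and the socket.send_fds()/recv_fds() convenience wrappers in 3.9. A minimal sketch of what a dispatcher/worker handoff would look like, done within a single process over a socketpair:)

```python
import os
import socket

# Demonstrate SCM_RIGHTS descriptor passing within one process.
# socket.send_fds()/recv_fds() (Python 3.9+) wrap the sendmsg()/recvmsg()
# ancillary-data machinery a central dispatcher would use to hand a
# freshly accepted connection off to a worker process.
parent, child = socket.socketpair()
rfd, wfd = os.pipe()

# "Dispatcher" side: send the pipe's read end over the Unix socket.
socket.send_fds(parent, [b"take this fd"], [rfd])

# "Worker" side: receive a (duplicated) descriptor it can read from.
msg, fds, flags, addr = socket.recv_fds(child, 1024, 1)
os.write(wfd, b"hello")
data = os.read(fds[0], 5)  # reading through the passed descriptor works
print(data)
```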
This leaves you with the preforked workers all sitting around waiting
in select() for a new SCGI connection or instructions from the master
process (such as 'please exit now'). When a new SCGI connection comes
in, all of them wake up in a thundering herd; one of them wins the race
to accept() the new connection and everyone else goes back to
select() to wait. The more worker processes, the bigger the herd.
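(The loop each worker runs can be sketched like this, in modern Python; handle() is a stand-in for real SCGI request processing, and the control descriptor stands in for whatever channel the master uses to send instructions:)

```python
import select
import socket

def handle(conn):
    # Stand-in for real SCGI request processing: just echo what we read.
    conn.sendall(conn.recv(4096))

def worker_loop(listen_sock, control_fd):
    """One preforked worker: block until a connection arrives or the
    master process tells us to exit."""
    listen_sock.setblocking(False)
    while True:
        # Every worker select()s on the same listening socket, so all
        # of them wake when one connection arrives: the thundering herd.
        readable, _, _ = select.select([listen_sock, control_fd], [], [])
        if control_fd in readable:
            return  # instruction from the master, eg 'please exit now'
        try:
            conn, _addr = listen_sock.accept()
        except BlockingIOError:
            continue  # another worker won the accept() race; back to select()
        conn.setblocking(True)
        handle(conn)
        conn.close()
```

The accept() has to tolerate failure precisely because of the herd: by the time a losing worker calls it, the connection is already gone.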
Pragmatically the thundering herd issue is unlikely to be noticed on a modern computer, partly because you don't want to run that many worker processes anyways. But its mere existence annoys me, and the lack of a central dispatcher means that you have to pre-start all the workers and can't start and stop them based on connection flux. (This has a silver lining: just starting a fixed number of workers and keeping them running is less code.)
I may still code a preforking version of the SCGI server just to see how it goes and for the experience, but I suspect I'm not going to run it in production. Systems speed up, but unappetizing code is forever.
The problems with threads
There are several annoyances with threads:
- I'd lose process isolation, so a code bug could rapidly contaminate the entire SCGI server.
- This isn't a good match for Python threads: because of the global interpreter lock, CPU-bound Python threads can't actually run in parallel, and my SCGI server is mostly CPU bound.
- Due to the Linux NPTL thread issue, the process would use up a lot of virtual memory, and it just makes me twitchy to see my SCGI server sitting around using many megabytes of virtual memory.
I could do a threaded or thread-pool based SCGI server, but I'd be left with the feeling that it was a big hack. It'd barely be a step up from a single-threaded server that only handled one connection at a time. (There's some disk IO and network IO that multiple threads might be able to take advantage of, but probably not too much. Unfortunately, measuring true parallelism opportunities is a bit tricky.)