Implementing a preforking network server in Python
I recently read a great explanation of the general Apache 2 preforking server model, and found it so lucid and simple that I was immediately inspired to implement a generic preforking server myself. Their approach has a bit of overhead in parent/child communication, but it does mean that you do not have to pass file descriptors between processes; all you need is some way for the parent to talk with its children.
(Since Python's standard library doesn't have support for file descriptor passing, the pragmatic advantages of this approach are immense.)
To summarize, the core of this model is that the parent is the only
thing that notices when there is a new connection, but it does not do
accept() itself; instead it picks an idle child and commands the
child to handle the connection. In implementing my version I ran into
some subtleties, so I figure I might as well write them down here for
- the server socket must be non-blocking and the entire scheme must
be able to cope with accept()s that fail, because this can happen.
If the kids ever block in accept(), things start going horribly wrong;
at a minimum you descend to a thundering herd situation.
(Remember to set the newly accept()'d socket back to blocking; there
are some platforms where it will inherit the server socket's
non-blocking status.)
- the 'accept a new connection' step has to be synchronous between
the parent and the child, and the child must reply with its status
only after it has done the accept(). Otherwise you can have a race
between the parent returning to its select() and the child pulling
the new connection off; if the parent wins the race it will order
more than one child to handle the same connection.
This means that there are only two times kids send asynchronous
status messages to the parent: their initial 'I am idle' message on
startup, and the messages they send after they've completed
processing a connection.
- the parent should deal with status reports from kids before trying
to dispatch a new pending connection; this maximizes the chance of
knowing that you have idle kids.
- I found that I needed a status code from the child to the parent
to signal 'the processing function has asked the server to shut
down'. This is because the simplest place to put checks for
signals to shut down is in the application's 'new connection
handler' function, which is only called in kids.
(An alternate approach is to have a second function that's called every time through the parent's main loop.)
- because I was concerned about resilience, I added timeouts for
synchronous communication (for example, commanding a child to
accept a new connection and getting a status reply from it). If
the timeout expires without a good answer, the parent assumes
that something horrible has gone wrong and kills the child.
- asynchronously starting kids (for example to maintain a minimum
pool of idle workers) raises the issue of dealing with kids that
don't start properly for some reason. My approach is to put such
kids on a list until they report their initial 'idle' status. If the
parent needs to dispatch a new connection and there are no idle kids,
it picks a pending kid and synchronously waits for its status report;
if this times out it kills the kid and tries again.
- there are two ways of handling too many connections. The graceful
way is that if the parent cannot dispatch a new connection to a kid,
it stops checking the server socket until it has at least one idle kid;
otherwise the parent will just continually spin between select() and
failing to dispatch the new connection.
The abrupt way to handle overload is to have the parent accept() and
then immediately close the new connection if it cannot dispatch the
new connection to a kid.
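To make a few of these points concrete, here is a rough sketch in modern Python 3. It is not the code from prefork.py. The status bytes and the names make_listener, child_accept and fds_to_watch are all made up for illustration. It shows the non-blocking server socket, a child-side accept() that copes with failure and only knows its status after the accept() attempt, and the 'graceful' overload rule of not watching the server socket while there are no idle kids.

```python
import socket

# Hypothetical one-byte status codes; the post does not name its real ones.
ST_ACCEPTED = b"A"   # the child pulled a connection off the server socket
ST_NOCONN = b"N"     # accept() failed; there was nothing to take


def make_listener(host="127.0.0.1", port=0):
    """Create the shared server socket in non-blocking mode."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(16)
    srv.setblocking(False)
    return srv


def child_accept(srv):
    """Run in a child when the parent commands it to take a connection.

    Returns (status, conn). The status is what the child would report
    back to the parent, and it is only known after accept() was tried.
    """
    try:
        conn, _addr = srv.accept()
    except (BlockingIOError, InterruptedError, ConnectionAbortedError):
        # The pending connection went away, or another kid got it.
        return ST_NOCONN, None
    # Do not rely on the platform handing back a blocking socket.
    conn.setblocking(True)
    return ST_ACCEPTED, conn


def fds_to_watch(srv, status_fds, idle_children):
    """Parent side: decide what to select() on for this pass of the loop.

    The kids' status pipes are always watched; the server socket is only
    watched while at least one kid is idle, so an overloaded parent does
    not spin on a connection it cannot dispatch.
    """
    watch = list(status_fds)
    if idle_children:
        watch.append(srv)
    return watch
```

In a real server the child would write the returned status byte to the parent before it starts processing the connection, and the parent would handle any readable status pipes before it looks at the server socket.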
While the original description uses socketpairs to communicate between
the parents and kids, my implementation uses a pair of pipes because
Python 2.3.3 doesn't have
socket.socketpair() (it only appears in
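As an illustration of the pipe-based communication and the timeouts on synchronous exchanges, here is a minimal sketch in modern Python 3. Again this is not the prefork.py code. ChildChannel, read_status, command_accept and the command byte are hypothetical names, and a real parent would kill the kid when it gets None back.

```python
import os
import select

CMD_ACCEPT = b"a"    # hypothetical command byte: "accept the pending connection"


class ChildChannel:
    """Both ends of the pair of one-way pipes between parent and one kid."""

    def __init__(self):
        # Parent writes commands into cmd_w; the kid reads them from cmd_r.
        self.cmd_r, self.cmd_w = os.pipe()
        # The kid writes status bytes into st_w; the parent reads st_r.
        self.st_r, self.st_w = os.pipe()


def read_status(fd, timeout):
    """Wait up to `timeout` seconds for one status byte on `fd`.

    Returns the byte, or None if the wait timed out or the pipe is at
    EOF (the writer closed it, for example because the kid died).
    """
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return None
    data = os.read(fd, 1)
    return data or None


def command_accept(chan, timeout=5.0):
    """Parent side: order the kid to accept(), then wait for its reply."""
    os.write(chan.cmd_w, CMD_ACCEPT)
    return read_status(chan.st_r, timeout)
```

After a fork() each side would close the pipe ends it does not use; the sketch leaves that out to stay short.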
I'm quite happy both with how relatively easy it was to write an implementation of this approach and with the performance of my code. It is now the core of DWiki's SCGI server, and has handily dealt with the performance issues of the previous approach without having any of the drawbacks of the other solutions I've considered.
(And it stood up fine to a recent pounding.)
If anyone is interested, the code is available as prefork.py, unfortunately without unit tests but hopefully with enough documentation to be usable (the comments probably need some revision). It is currently GPL2 because I know the verbiage to slap on a file for that license; I would be happy to redo it under the Python license once I know the right incantation.