Overcoming the drawbacks of preforking
I was going to say that while preforking accept()-based servers don't
in practice have a thundering herd problem, they do have two other
issues, namely there's no way to tell when you need to grow the number
of processes you're using and no way to tell if a process has frozen.
(The more complicated preforking scheme has neither problem.)
However, some thought showed me that it's possible to get around these
problems while retaining the advantages of the pure
scheme. The key change to the simple version is that each child
tells the master process when it goes idle and when it handles a new
connection (doing so through a pipe that the master process set up for
this purpose). Since it sees state transitions, the master process can
now easily keep track of when a child has been busy on a single request
for too long and kill it.
(Things will be more reliable under load if the child sends a timestamp in its messages, since the master may not process child messages immediately.)
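To make the bookkeeping concrete, here is a minimal Python sketch of the reporting pipe and the master's view of it. The message format, the `report_state` and `WorkerTracker` names, and the `busy_limit` parameter are all my own assumptions for illustration; a real server would pick whatever format suits it.

```python
import os
import time

# Assumed wire format: one line per transition, "<pid> <state> <timestamp>\n".
# Each line is far shorter than PIPE_BUF, so writes from many children to the
# shared pipe should not interleave.

def report_state(write_fd, state, now=None):
    """Called in a child: tell the master we just became 'idle' or 'busy'."""
    stamp = time.time() if now is None else now
    line = "%d %s %.6f\n" % (os.getpid(), state, stamp)
    os.write(write_fd, line.encode("ascii"))


class WorkerTracker:
    """Master-side record of each child's last reported state."""

    def __init__(self, busy_limit):
        self.busy_limit = busy_limit  # seconds one request is allowed to take
        self.states = {}              # pid -> (state, timestamp sent by child)
        self._buf = b""

    def feed(self, data):
        """Consume raw bytes read from the pipe; a partial line is kept for later."""
        self._buf += data
        while b"\n" in self._buf:
            line, self._buf = self._buf.split(b"\n", 1)
            pid, state, stamp = line.decode("ascii").split()
            self.states[int(pid)] = (state, float(stamp))

    def stuck_workers(self, now=None):
        """Return the pids that have been busy for longer than busy_limit."""
        now = time.time() if now is None else now
        return sorted(pid for pid, (state, stamp) in self.states.items()
                      if state == "busy" and now - stamp > self.busy_limit)
```

The master would call `feed()` with whatever it reads from the pipe and periodically kill each pid that `stuck_workers()` returns. Because the age is computed from the child's own timestamp, a master that is slow to read the pipe does not misjudge how long a request has been running.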
Deciding when to reduce the worker pool is relatively simple; the master process can keep a count of the minimum number of idle workers over the last N seconds. When this number gets high enough, it can either not restart workers when they exit after handling their quota of requests or outright ask them to die (via a signal, for example).
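One way the "minimum idle workers over the last N seconds" count could be kept is sketched below in Python. The `IdleWindow` class, the `surplus_workers` helper, and the `spare` threshold are hypothetical names and values of mine, not part of any existing server.

```python
from collections import deque


class IdleWindow:
    """Track the minimum number of idle workers over a sliding time window."""

    def __init__(self, window):
        self.window = window     # the "N seconds" to look back over
        self.samples = deque()   # (timestamp, idle_count), oldest first

    def record(self, now, idle_count):
        """Call whenever the number of idle workers changes."""
        self.samples.append((now, idle_count))
        self._expire(now)

    def _expire(self, now):
        # Drop samples that were superseded before the window began. The
        # newest sample at or before the window start is kept, because that
        # count was still in effect when the window opened.
        cutoff = now - self.window
        while len(self.samples) > 1 and self.samples[1][0] <= cutoff:
            self.samples.popleft()

    def min_idle(self, now):
        """Smallest idle count seen in the window; 0 if nothing is recorded."""
        self._expire(now)
        if not self.samples:
            return 0
        return min(count for _, count in self.samples)


def surplus_workers(idle_window, now, spare=2):
    """How many workers could go while still keeping `spare` idle (assumed policy)."""
    return max(0, idle_window.min_idle(now) - spare)
```

If `surplus_workers()` stays above zero, the master can skip that many restarts or signal that many idle children to exit.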
Deciding when to spin up more workers is more challenging. The only
approach I can think of is for the master process to monitor the server
socket; if the socket has stayed accept()-able for some length of time
and no worker process has changed its state, the master starts another
worker.
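A rough Python sketch of that check follows. It assumes the master polls the listening socket with `select()` and that a hypothetical `patience` setting says how long a connection may sit unaccepted; `has_pending` and `GrowthMonitor` are names I made up for this example.

```python
import select


def has_pending(listen_sock, timeout=0):
    """True if a connection is waiting to be accepted on listen_sock."""
    readable, _, _ = select.select([listen_sock], [], [], timeout)
    return bool(readable)


class GrowthMonitor:
    """Decide when the master should start another worker."""

    def __init__(self, patience):
        self.patience = patience    # seconds a connection may wait (assumed tunable)
        self.pending_since = None   # when the current unanswered wait began

    def worker_changed_state(self):
        """Call whenever any child reports idle or busy; this resets the clock."""
        self.pending_since = None

    def should_spawn(self, pending, now):
        """Feed in has_pending()'s result; True means start one more worker."""
        if not pending:
            self.pending_since = None
            return False
        if self.pending_since is None:
            self.pending_since = now
            return False
        if now - self.pending_since >= self.patience:
            # Restart the clock so the new worker gets a chance to catch up
            # before the master considers growing again.
            self.pending_since = None
            return True
        return False
```

In the master's main loop this would look something like `if monitor.should_spawn(has_pending(sock), time.time()): start_worker()`, where `start_worker` stands in for whatever forks a new child.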
(The simpler approach to this is just to say that processes are cheap and so you will always start your full final pool of workers, instead of trying to grow and shrink the worker pool dynamically. This didn't make sense for Apache but probably does these days for a backend server that is not worrying about talking to slow clients.)
What version of Python is included in various current OSes
For my own curiosity, here is a rundown of what version of Python is in various current OS distributions, along with whether or not a version of Python 3 is available as an optional package.
(The version of Python is my best guess at what you get if you run
'python' at a command line.)
|Optional Python 3?
|Solaris 10 update 8
|2.4.4 (2.6 available in Blastwave)
|Red Hat Enterprise 5 (and CentOS 5)
|No (it's not in EPEL)
|Ubuntu 8.04 LTS
|Debian 5.0 (Lenny)
|No (but there's a version in 'experimental')
|No (but it will be in Fedora 13)
|2.6.2 (I think)
|Mac OS X 10.4.11 (Tiger)
|Mac OS X 10.5.8 (Leopard)
|Mac OS X 10.6 (Snow Leopard)
(I apologize if I have slighted your favorite OS or Linux distribution; this is the subset of things that I either have machines running or know how to check. Feel free to add data in the comments.)
I care about long term supported OSes like RHEL and Ubuntu LTS because those are what we run. The front runner short-term OSes at least show where the wind is blowing for their longer-term compatriots, so it's a pretty safe bet that the next version of Ubuntu LTS will have Python 2.6.x (or better) and some version of Python 3, and it's likely that the next version of RHEL will have 2.6.x+ and Python 3 as well.
(Solaris 10 is unlikely to ever update its version of Python, because Solaris 10 pretty much never updates anything. And no one has any idea at this point if there will be a 'Solaris 11' and if so, what it will look like or have.)
My understanding is that you need Python 2.6 if you're even going to start developing for a future Python 3 migration. Obviously having an optional version of Python 3 is even better. The slow uptake of new Python 2.x versions (and the very slow addition of optional Python 3 packages) is one reason that I am not very sanguine about Python 3's chances of general adoption any time soon, or about plans to stop developing Python 2.x.