The practical cost of forking in Python

May 10, 2006

I spent part of the other day working to speed up an SCGI-based program, and wound up hitting a vivid illustration of the practical cost of forking in Python. I'll start with the numbers:

  • 5.3 milliseconds per request when the program forked a child to handle each request.
  • 1.1 milliseconds per request when the forking was stubbed out so each request ran in the main process.

Benchmarking was done with Apache's ab, running on the same machine (and with only one request at a time, since the non-forking version obviously can't handle concurrent requests).

These numbers are pure SCGI overhead; the program had its usual response handler stubbed out to a special null handler that just returned a short hard-coded response, and it was directly connected to lighttpd. (Some work suggests that most of the remaining 1.1 milliseconds is in decoding the request's initial headers; I'm not sure how to speed this up.)
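
For concreteness, the null handler amounts to something like the following sketch; the function name and the (environ, output) calling convention are illustrative stand-ins, not the SCGI package's actual handler interface or the program's real code:

    # Ignore the decoded request entirely and write a short hard-coded
    # response, so that only SCGI protocol overhead gets measured.
    NULL_RESPONSE = (b"Status: 200 OK\r\n"
                     b"Content-Type: text/plain\r\n"
                     b"Content-Length: 2\r\n"
                     b"\r\n"
                     b"ok")

    def null_handler(environ, output):
        # environ is the already-decoded SCGI header dictionary; we never
        # look at it here.
        output.write(NULL_RESPONSE)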

Since I have a thread pool package lying around, I hacked the SCGI server up to use it instead of forking; the performance stayed around 1.1 milliseconds per request, somewhat to my surprise.
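
The change itself is nothing fancy. Leaving aside my actual thread pool package's interface, the idea is roughly this sketch using only the standard library (handle_connection() is a stand-in for the real per-request code, and the worker count is arbitrary):

    import queue
    import threading

    NUM_WORKERS = 4          # illustrative; tune to taste
    work = queue.Queue()

    def handle_connection(conn):
        # Stand-in for the real SCGI request handling code.
        pass

    def worker():
        # Each worker pulls accepted connections off the queue and handles
        # them in-process, so no fork() happens per request.
        while True:
            conn = work.get()
            try:
                handle_connection(conn)
            finally:
                conn.close()
                work.task_done()

    for _ in range(NUM_WORKERS):
        threading.Thread(target=worker, daemon=True).start()

    # In the server's accept loop, instead of forking a child:
    #     conn, addr = sock.accept()
    #     work.put(conn)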

I don't have any explanation of why Python takes 4.2 milliseconds more when I fork for each request. The direct cost of fork() with all of the program's modules imported is about 1.3 milliseconds (the fork tax varies with how many dynamic libraries the Python process has loaded, so it's important to measure with your program's actual set of imports). Forking does require extra management code to do things like track and reap dead children, but 2.9 milliseconds seems a bit high for it.
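
For what it's worth, measuring the direct fork() cost is simple enough; something like this rough sketch, run after the program has done all of its imports:

    import os
    import time

    # Time a fork()/exit/reap cycle; the per-fork cost depends on what the
    # process has mapped, so do this with the program's real imports loaded.
    N = 1000
    start = time.time()
    for _ in range(N):
        pid = os.fork()
        if pid == 0:
            os._exit(0)      # child: exit immediately, no cleanup
        os.waitpid(pid, 0)   # parent: reap the child right away
    elapsed = time.time() - start
    print("%.2f ms per fork+reap" % (elapsed * 1000.0 / N))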
