A cheap caching trick with a preforking server in Python
When the load here climbs, DWiki (the software behind this blog) transmogrifies itself into an SCGI based preforking server. I'm always looking for cheap ways to speed DWiki up for Slashdot style load surges (however unlikely it is that I'll ever need such tuning), and it recently occurred to me that there was an obvious way to exploit a preforking server: cache rendered pages in memory in each preforked process. Well, not even rendered pages; the simplest way to implement this is to cache your response objects.
(DWiki already has various layers of caching, but its page cache is disk based. A disk based cache has various advantages, such as cache sharing between preforked instances, and it means that you don't have to worry about memory exhaustion, only disk space; but both aspects slow the cache down.)
A simple brute force in-memory cache like this has a number of attractions. Caching ready to use response objects (combined with simple time-based invalidation) means that this cache is about as fast as your application will ever go. It's quite simple to add to your application, especially if your application already has the concept of a flexible processing pipeline; you can just add a request-stealing step early on, and cache the response objects that you're already bubbling up through the pipeline. Assuming that you're having processes exit after handling some moderate number of requests, using a per-process cache creates a natural limit on any inadvertent cache leaks, memory usage, and cache expiry and invalidation issues; after not too long the entire process goes away, caches and all.
(You can also size the cache quite low; you might make it one tenth or one fifth the number of requests that a single process will serve before exiting. A large cache is obviously relatively pointless; as the cache size rises, the number of cache hits that the 'tail' of the cache can ever have drops.)
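As a concrete illustration, here is a minimal sketch of such a per-process cache with time-based invalidation and a small size cap. This is not DWiki's actual code; the class name, the key scheme, and the eviction policy (drop the oldest entry) are all my assumptions.

```python
import time
from collections import OrderedDict

class ResponseCache:
    # Per-process cache of finished response objects, with simple
    # time-based invalidation and a hard size cap. Because each
    # preforked process exits after some number of requests, no
    # entry can outlive the process itself.
    def __init__(self, maxsize=50, ttl=60):
        self.maxsize = maxsize
        self.ttl = ttl
        self._store = OrderedDict()  # key -> (timestamp, response)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        stamp, resp = entry
        if time.time() - stamp > self.ttl:
            # Expired; invalidate lazily on lookup.
            del self._store[key]
            return None
        return resp

    def put(self, key, resp):
        if key in self._store:
            del self._store[key]
        elif len(self._store) >= self.maxsize:
            # At capacity: evict the oldest entry.
            self._store.popitem(last=False)
        self._store[key] = (time.time(), resp)
```

The request-stealing step early in the pipeline would then just call `.get()` with the request's cache key and return the cached response object on a hit, falling through to normal processing (plus a `.put()` of the finished response) on a miss.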
Adding such an in-memory cache to the preforking version of DWiki did expose one assumption that I was making. For this cache to work, response objects have to be immutable after they are finished being generated. It turned out that DWiki's code for conditional GET handling worked by directly mutating response objects; when I added response object caching this resulted in a very odd series of HTTP responses that were half conditional GET replies and half regular replies. I had a certain amount of head-scratching confusion until I worked out what was going on and why, for example, I was seeing 304 responses with large response bodies.
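The failure mode is easy to reproduce with a toy response object; the class and function names here are made up for illustration, not DWiki's. Mutating a cached response in place corrupts it for every later cache hit, while copying before mutating keeps the cached original intact.

```python
import copy

class Response:
    # Toy stand-in for a rendered response object.
    def __init__(self, code, body):
        self.code = code
        self.body = body

cache = {}

def serve_buggy(key, if_none_match, etag):
    # BUG: mutates the shared cached object in place. After the
    # first conditional GET hit, every later plain GET is served
    # the mutated 304 response instead of the real page.
    resp = cache[key]
    if if_none_match == etag:
        resp.code = 304
        resp.body = ""
    return resp

def serve_fixed(key, if_none_match, etag):
    # Treat cached responses as immutable: copy before mutating,
    # so the 304 is built on a throwaway object.
    resp = cache[key]
    if if_none_match == etag:
        resp = copy.copy(resp)
        resp.code = 304
        resp.body = ""
    return resp
```

With the buggy version, a single conditional GET poisons the cache entry for all subsequent requests; the fixed version leaves the cached 200 response untouched.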