Some observations from playing with PyPy on DWiki

December 12, 2013

DWiki is the software behind Wandering Thoughts. It makes a convenient test platform for me to experiment with PyPy because it's probably the most CPU-intensive Python code I have anything to do with and also potentially the longest-running program I have, which turns out to be very important for PyPy performance. In the process of doing this today I've wound up with some observations.

(All of these are against PyPy 2.1.0 on a 64-bit Fedora 19 machine.)

My first discovery is that it can be relatively hard to make a relatively optimized program descend into true CPU crunching of the sort that PyPy theoretically drastically accelerates. DWiki has significant amounts of caching that try to avoid (theoretically) expensive operations like turning DWikiText into HTML, and in normal operation these caches are hit all of the time. PyPy doesn't seem to be able to do anything too impressive with what's left.

(In reading PyPy performance documentation I see that I'm probably also getting hit by bad PyPy performance on cPickle, as DWiki's caches are pickle-based.)

When I bypassed some of this caching so that my Python code was doing a lot more work, I got confirmation of what I already sort of knew: PyPy required a lot of warmup before it performed well. And by 'performed well' I mean 'ran at least as fast as CPython'. In my code on a very low level operation (simply converting DWikiText to HTML, without any caches), PyPy needed hundreds of repeats of warmup before it crossed over to being faster than CPython. This general issue is common for tracing JITs, but I didn't expect it to be so large for PyPy. CPython has flat performance, of course. The good news is that on this low level task PyPy does eventually wind up faster than CPython (although it's hard to say how much faster; my test framework may over-specialize the generated code at present).
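
A minimal sketch of how this kind of warmup curve could be measured: run the same operation in fixed-size batches and record the time per batch, then run the script under both CPython and PyPy and compare. This is not DWiki's actual test framework; `time_batches` and `render_stand_in` are hypothetical names, and the stand-in function only imitates CPU-bound text generation. `time.time()` is used so the sketch runs on both Python 2 and Python 3.

```python
import time


def time_batches(func, batches=10, repeats_per_batch=100):
    """Call func repeatedly and return the elapsed seconds for each batch."""
    timings = []
    for _ in range(batches):
        start = time.time()
        for _ in range(repeats_per_batch):
            func()
        timings.append(time.time() - start)
    return timings


def render_stand_in():
    # Hypothetical stand-in for the real work (e.g. converting wiki text
    # to HTML); replace with the operation you actually want to measure.
    return "".join("<p>%d</p>" % i for i in range(200))


if __name__ == "__main__":
    for n, elapsed in enumerate(time_batches(render_stand_in)):
        print("batch %d: %.4f s" % (n, elapsed))
```

Under a tracing JIT one would expect the per-batch times to fall over the first batches and then level off, while under CPython they should stay roughly flat. How many batches that takes depends entirely on the code being measured.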

(This warmup issue has significant implications for preforking network servers. You likely need to have any given worker process handle quite a lot of requests before PyPy is at all worth it, and that raises concerns with slow memory leaks and so on.)

So far I have only talked about CPU usage and haven't mentioned memory usage. There's a good reason for that: for DWiki, PyPy's memory usage is through the roof. My test setup consistently has CPython at around 13 MB of active RAM (specifically RSS). PyPy doing the same thing takes anywhere from 70 MB to 130 MB depending on exactly what I'm testing. In many situations today this is a relative killer (especially again if you're dealing with a preforking network server, since PyPy memory usage seems to grow over time and that implies every child worker process will have its own copy).
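
For reference, one way to read a process's RSS on Linux is to parse the `VmRSS` line of `/proc/<pid>/status`. The sketch below assumes a Linux `/proc` filesystem; `parse_vmrss_kb` and `current_rss_kb` are hypothetical helper names, not part of DWiki or PyPy, and this is not necessarily how the numbers above were obtained.

```python
def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     13312 kB".
            return int(line.split()[1])
    return None


def current_rss_kb(pid="self"):
    """Read the resident set size of a process (default: this one) on Linux."""
    with open("/proc/%s/status" % pid) as f:
        return parse_vmrss_kb(f.read())
```

Calling `current_rss_kb()` periodically from inside a long-running worker, or passing the PID of another process, gives a rough view of whether memory use keeps growing over time.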

My overall observation from this is unsurprising and unexciting, namely that PyPy is not a drop-in magic way of speeding up my code. It may work, it may not work, it may require code changes to work well (and right now the tools for finding what code to change are a bit lacking), and I won't know until I try. Unfortunately all of this uncertainty reduces my interest in PyPy.

(I have seen PyPy clearly beat CPython, so it can definitely happen.)

Written on 12 December 2013.

Last modified: Thu Dec 12 02:04:17 2013