The Python Gilectomy project's performance problem
After my recent entries on (C)Python's Global Interpreter Lock (GIL), Kevin Lyda asked me on Twitter if I'd read the latest update, LWN's 'Progress on the Gilectomy'. For those of us who either haven't heard of it before or have forgotten about it, the 'Gilectomy' is Larry Hastings' ongoing project to remove the GIL from CPython. As summarized in the LWN article, his goals are:
[Larry Hastings] wants to be able to run existing multi-threaded Python programs on multiple cores. He wants to break as little of the existing C API as possible. And he will have achieved his goal if those programs run faster than they do with CPython and the GIL—as measured by wall time.
With a certain amount of hand waving, these are worthwhile goals for multi-threaded Python code and I don't think anyone would object. The project is also technically neat and interesting, although it appears to be more interesting from an engineering perspective (taking existing research and applying it to Python) than from a research perspective (coming up with new GC approaches). There's nothing wrong with this, of course; computer science stands on the shoulders of plenty of giants, so it's to our benefit to look down on a regular basis. If Larry Hastings can deliver his goals in a modified version of CPython that is solidly implemented, he'll have done something pretty impressive.
However, I agree with Guido van Rossum's view (as reported in LWN's 2016 Gilectomy article) that this should not become part of regular CPython unless existing single-threaded Python programs still run at full speed, as fast as they do now. This may seem harsh, given that a successful Gilectomy would definitely speed up multi-threaded programs, but here is where theory runs up against the reality of backwards compatibility.
The reality is that most current Python programs are single-threaded programs or ones that are at most using threading for what the GIL makes it good for. This is a chicken and egg issue, of course; the GIL made Python only good for this, so this is mostly how people wrote Python programs. You have to be at least somewhat perverse to write multi-threaded code knowing that multi-threading isn't really helping you; unsurprisingly, not many people are perverse and so probably not much of this code exists. If a new version of CPython sped up this uncommon 'perverse' code at the expense of slowing down common single-threaded code, most people would probably have their Python code slow down and would consider the new version a bad change.
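As a rough illustration of what the GIL does make threading good for, here is a minimal sketch where the threads spend their time waiting instead of computing. The fake_io_task and run_concurrently names are hypothetical, and the sleep is only a stand-in for a real network or disk wait; time.sleep releases the GIL while it blocks, so the waits overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.2):
    # Hypothetical stand-in for a blocking I/O call (network, disk, ...).
    # time.sleep releases the GIL while waiting, as real blocking I/O does.
    time.sleep(delay)
    return task_id


def run_concurrently(n_tasks=4, delay=0.2):
    """Run n_tasks fake I/O waits in threads; return (results, wall seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        # pool.map returns results in the order the inputs were given.
        results = list(pool.map(lambda i: fake_io_task(i, delay), range(n_tasks)))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_concurrently()
    print(f"{len(results)} tasks finished in {elapsed:.2f}s of wall time")
```

With four 0.2 second waits, the total wall time should be much closer to 0.2 seconds than to 0.8. If the sleep were replaced with CPU-bound pure Python work, the GIL would keep the threads from running in parallel and the same structure would gain little or nothing.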
I further believe that this holds not just for wall clock time but also for CPU usage. A version of CPython that required extra cores to keep existing programs running just as fast in wall clock time is a version that has slowed down in practice, because not all systems actually have those extra cores sitting idle, waiting to be used.
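To make the distinction concrete, here is a minimal sketch of measuring both numbers for the same piece of work. The measure and busy_sum names are hypothetical; time.perf_counter reports elapsed wall clock time, while time.process_time reports CPU time summed over the whole process, including all of its threads.

```python
import time


def measure(func, *args):
    """Call func(*args) once; return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return result, wall_used, cpu_used


def busy_sum(n):
    # Hypothetical CPU-bound workload used only for illustration.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    _, wall, cpu = measure(busy_sum, 2_000_000)
    print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
```

For a single-threaded CPU-bound program the two numbers should be roughly equal. An interpreter that kept the wall clock number steady only by burning more total CPU time across several cores would show up here as CPU time noticeably larger than wall time.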
(There are also pragmatic issues with the CPython API for C extensions. For a start, you had better make it completely impossible for existing C extensions to create multi-threaded races if they are just recompiled for the new CPython. This may require deliberately dropping certain existing APIs because they cannot be made multi-thread safe and forcing extensions to be updated to new ones, which will inevitably result in a bunch of existing extensions never getting updated and users of them never updating either.)
On the whole I'm not optimistic about a Gilectomy arriving before a hypothetical 'Python 4' that can break many things, including user expectations (although I'd love to be surprised by a Gilectomy that keeps single-threaded performance fully intact). I further suspect that Python's developers don't have much of an appetite for going through a Python 3 experience all over again, at least not any time soon. It's possible that a Gilectomy'd CPython could be maintained in parallel with the current regular CPython, which would at least give users a choice.
(Years ago I wrote an entry in praise of the GIL and I still basically stand by those views even in this increasingly multi-core world. Shared memory multi-threaded programming is still a bad programming model, even if CPython guarantees that your program will never dump core. But at this point it's probably too late to add a different model to multi-threaded (C)Python.)