Safely using Python's Global Interpreter Lock is quite tricky and subtle

May 22, 2017

Recently I ran across "Grok the GIL: How to write fast and thread-safe Python", by A. Jesse Jiryu Davis. In the original version of the article, A. Jesse wrote:

Even though the line lst.sort() takes several steps, the sort call itself is a single bytecode, and thus there is no opportunity for the thread to have the GIL seized from it during the call. We could conclude that we don't need to lock around sort().

In comments on that article, Ben Darnell pointed out that this is not necessarily true. The fundamental issue is that Python objects are allowed to customize how they are compared for sorting, hashed for things like dictionary insertions, and so on. These operations necessarily require running Python code, and the moment you start running Python code the GIL may be seized from you.
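To make this concrete, here is a minimal sketch of the problem. The Item class, the two Event objects and the observer function are all made up for illustration. The comparison method deliberately waits on an Event, which releases the GIL, so that the handoff to the other thread happens every time you run it; in real code the interpreter could make the same switch on its own at any point while the comparison's Python code is running.

```python
import threading

in_sort = threading.Event()     # set once a comparison is running
other_done = threading.Event()  # set once the second thread has run
observed = []                   # what the second thread saw mid-sort


class Item:
    """Hypothetical class that customizes ordering with Python code."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # Ordinary Python code, executed from inside list.sort().
        in_sort.set()
        # Waiting on an Event releases the GIL. This forces a handoff that
        # the interpreter would otherwise be free to make on its own here.
        other_done.wait(timeout=5)
        return self.value < other.value


def observer(lst):
    in_sort.wait(timeout=5)
    # This runs while the sort in the main thread is still in progress.
    # CPython makes a list look empty while it is being sorted, so the
    # length recorded here is normally not the real length of the list.
    observed.append(len(lst))
    other_done.set()


items = [Item(3), Item(1), Item(2)]
t = threading.Thread(target=observer, args=(items,))
t.start()
items.sort()
t.join()

print("sorted values:", [i.value for i in items])
print("length seen by the other thread during the sort:", observed)
```

The other thread gets to run, and to look at the list, in the middle of what is a single bytecode in the sorting thread.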

The version of the article on A. Jesse's personal site has been updated to note some qualifications:

[...] so long as we haven’t provided a Python callback for the key parameter, or added any objects with custom __cmp__ methods. [...]

If you guessed that these qualifications are not quite complete, you would be correct; they're missing the rich comparison operations, which may be defined instead of __cmp__ (and in Python 3, they're the only way to customize comparisons and ordering).
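As a small illustration of the difference, here is a sketch for Python 3 with two made-up classes. One defines only the old __cmp__ hook and the other defines only the rich comparison method __lt__.

```python
class OnlyCmp:
    """Hypothetical class that defines only the Python 2 comparison hook."""

    def __init__(self, value):
        self.value = value

    def __cmp__(self, other):
        # Python 3 never calls this when sorting.
        return (self.value > other.value) - (self.value < other.value)


class WithLt:
    """Hypothetical class that defines a rich comparison method."""

    calls = 0  # how many times sorting ran our Python code

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        WithLt.calls += 1
        return self.value < other.value


try:
    sorted([OnlyCmp(2), OnlyCmp(1)])
    cmp_was_used = True
except TypeError:
    # Without rich comparisons, Python 3 cannot order these objects at all.
    cmp_was_used = False

ordered = sorted([WithLt(2), WithLt(1), WithLt(3)])

print("__cmp__ used for sorting:", cmp_was_used)
print("sorted values:", [w.value for w in ordered])
print("Python-level __lt__ calls during the sort:", WithLt.calls)
```

A check that only looks for __cmp__ would pass WithLt as safe, even though every comparison in the sort runs Python code.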

This may seem like nit-picking, and in a sense it is. But what I really want to illustrate here is just how tricky it is to safely use the GIL without locking. In CPython there are a huge number of special cases where Python code can start running inside what appear to be simple atomic C-level operations like lst.sort(). Knowing these special cases requires a great deal of up-to-date Python knowledge (so that you know all the various ways things can be customized), and finding out whether they affect any particular operation you want to do may require deep dives into the source code you're dealing with. This is especially the case if the code uses things like metaclasses or class decorators.
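Sorting is only one such case. As a rough sketch of two others, the made-up Key class below records when its __hash__ and __eq__ methods are called by a dictionary store and by a list membership test, both of which look like single C-level operations.

```python
calls = []  # names of the Python-level hooks that got invoked


class Key:
    """Hypothetical key class with Python-level hashing and equality."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        calls.append("__hash__")
        return hash(self.name)

    def __eq__(self, other):
        calls.append("__eq__")
        return isinstance(other, Key) and self.name == other.name


d = {}
d[Key("a")] = 1  # the dict has to hash the key
d[Key("a")] = 2  # hashes again, then compares against the existing key

# The membership test compares against the list elements with __eq__.
found = Key("a") in [Key("b"), Key("a")]

print("hooks invoked:", calls)
print("dict size:", len(d), "found:", found)
```

Every entry in calls is a place where Python code ran, and so a place where the GIL could have been taken away in the middle of the operation.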

In practice things get worse, because what you've looked at is your codebase at a single point in time; you only know that it's safe today. Nothing fundamentally prevents someone from coming along later and helpfully adding rich comparison operators to a class, which invalidates your assumption that lst.sort() is an atomic operation as far as the GIL is concerned and so needs no locking around it.

(If everyone involved is lucky, the class comments document that other code depends on the class not having rich comparison operators or whatever else you need it to not have.)

So what I've wound up feeling is that GIL safety is generally too complicated and tricky to use. Or perhaps it would be better to say that it's too fragile, since there are a vast number of ways to accidentally destroy it without realizing what you've done. If you actively want to use GIL safety to avoid explicit locking, it's probably going to be one of the trickier portions of your code and you should be very careful to document everything about it and to keep it as simple as possible (for example, using only primitive C-level types even if this requires contortions).
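If you do go down this road, one way to at least make the assumption visible is a check along these lines. This is only a sketch; the PRIMITIVE_TYPES tuple and the only_primitives() function are names I've made up, and the check is best used in tests or assertions, since another thread could still change the list between the check and the sort.

```python
# Built-in types whose comparisons are implemented in C. This particular
# selection is an assumption made for the sketch, not an official list.
PRIMITIVE_TYPES = (int, float, str, bytes)


def only_primitives(lst):
    """Return True if every element is exactly one of PRIMITIVE_TYPES.

    This deliberately checks type(x) instead of using isinstance(),
    because a subclass is free to override the comparison methods.
    """
    return all(type(x) in PRIMITIVE_TYPES for x in lst)


class NoisyInt(int):
    """Hypothetical subclass that sneaks Python code into comparisons."""

    def __lt__(self, other):
        return int(self) < int(other)


plain = [3, 1, 2]
mixed = [3, NoisyInt(1), 2]

print("plain list is primitive-only:", only_primitives(plain))
print("mixed list is primitive-only:", only_primitives(mixed))
```

An isinstance() check would accept NoisyInt, which is exactly the kind of helpful later change that quietly breaks things.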

It's unfortunate that the GIL is this way, but it is (at least for now in CPython and thus probably for the future).

(In theory CPython could be augmented so that operations like lst.sort() explicitly held the GIL for their entire duration, so that people wouldn't get surprised this way. But I suspect that the CPython developers want people to use explicit locking, mutexes, and so on, and not rely on hard-to-explain GIL guarantees that constrain their implementation choices.)
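For comparison, the explicit locking version is not much code. Here is a minimal sketch, where the SharedList class and its method names are invented for the example. It uses a plain threading.Lock, so it assumes that the comparison code of the stored items never calls back into the same SharedList (which would deadlock).

```python
import threading


class SharedList:
    """Hypothetical list wrapper that does not rely on GIL atomicity."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def sorted_snapshot(self):
        # The lock is held for the whole sort, so it does not matter
        # whether the comparisons run Python code or not.
        with self._lock:
            self._items.sort()
            return list(self._items)


shared = SharedList()


def worker(start):
    for n in range(start, start + 100):
        shared.append(n)


threads = [threading.Thread(target=worker, args=(s,)) for s in (300, 200, 100, 0)]
for t in threads:
    t.start()
for t in threads:
    t.join()

result = shared.sorted_snapshot()
print("items:", len(result), "first:", result[0], "last:", result[-1])
```

This stays correct no matter what comparison methods the stored objects grow in the future, as long as everything goes through the wrapper.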
