2017-05-22
Safely using Python's Global Interpreter Lock is quite tricky and subtle
Recently I ran across Grok the GIL: How to write fast and thread-safe Python, by A. Jesse Jiryu Davis. In the original version of the article, A. Jesse wrote:
Even though the line
lst.sort()
takes several steps, the sort call itself is a single bytecode, and thus there is no opportunity for the thread to have the GIL seized from it during the call. We could conclude that we don't need to lock around sort().
In comments on that article, Ben Darnell pointed out that this is not necessarily true. The fundamental issue is that Python objects are allowed to customize how they are compared for sorting, hashed for things like dictionary insertions, and so on. These operations necessarily require running Python code, and the moment you start running Python code the GIL may be seized from you.
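As a minimal sketch of the dictionary case, the following uses a hypothetical Key class (the class, its name attribute, and the calls list are all made up for illustration) to record when Python-level methods run during what looks like a single dictionary store:

```python
calls = []  # records every Python-level hook that runs


class Key:
    """Hypothetical key type; the name and its field are illustrative."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Ordinary Python bytecode, run in the middle of the C-level
        # dictionary insertion. A thread switch can happen in here.
        calls.append("__hash__")
        return hash(self.name)

    def __eq__(self, other):
        # Also Python code; called when a lookup finds an existing
        # entry with the same hash that is a different object.
        calls.append("__eq__")
        return isinstance(other, Key) and self.name == other.name


d = {}
d[Key("a")] = 1  # looks like one step, but calls Key.__hash__
d[Key("a")] = 2  # equal hash, different object: __hash__ and __eq__ run
```

After this runs, calls contains both "__hash__" and "__eq__" entries, even though the source never calls either method directly.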
The version of the article on A. Jesse's personal site has been updated to note some qualifications:
[...] so long as we haven’t provided a Python callback for the key parameter, or added any objects with custom __cmp__ methods. [...]
If you guessed that these qualifications are not quite complete,
you would be correct; they're missing the rich comparison operations
(__lt__, __eq__, and so on), which may be defined instead of __cmp__
(and in Python 3, they're the only way to customize comparisons
and ordering).
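Here is a small sketch of a case that slips past both stated qualifications: no key callback and no __cmp__ method, only __lt__. The Task class, the tasks list, and observed_lengths are hypothetical names used for illustration.

```python
observed_lengths = []


class Task:
    """Hypothetical class: no __cmp__, and sort() gets no key= callback."""

    def __init__(self, priority):
        self.priority = priority

    def __lt__(self, other):
        # Runs as Python bytecode inside the C-level tasks.sort() call,
        # so another thread may be scheduled while the sort is underway.
        # Record what the list looks like from here, mid-sort.
        observed_lengths.append(len(tasks))
        return self.priority < other.priority


tasks = [Task(p) for p in (3, 1, 2)]
tasks.sort()
priorities = [t.priority for t in tasks]
```

The sort still comes out right, but __lt__ ran several times in the middle of it. On CPython the list is documented to appear empty for the duration of a sort, so every entry in observed_lengths is 0 there; any other thread that got to run at that point would see the same thing.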
This may seem like nit-picking and in a sense it is. But what I
really want to illustrate here is just how tricky it is to safely
use the GIL without locking. In CPython there is a huge number of
special cases where Python code can start running in what appear
to be simple atomic C-level operations like lst.sort(). Knowing
these special cases requires a great deal of up-to-date Python
knowledge (so you know all the various ways things can be customized),
and finding out whether they affect any particular operation you
want to do may require deep dives into the source code you're
dealing with. This is especially the case if the code uses things
like metaclasses or class decorators.
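To illustrate the class decorator case, here is a sketch using the standard library's functools.total_ordering. The Version class is hypothetical; the point is that its source mentions only __eq__ and __lt__, yet the decorated class ends up with more Python-level comparison methods than you can see by reading it.

```python
import functools
import types


@functools.total_ordering
class Version:
    """Hypothetical class; only __eq__ and __lt__ appear in its source."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return self.number == other.number

    def __lt__(self, other):
        return self.number < other.number


# Comparison methods that the decorator added behind the scenes,
# all of them ordinary Python functions.
added = sorted(
    name
    for name in ("__le__", "__gt__", "__ge__")
    if isinstance(vars(Version).get(name), types.FunctionType)
)
```

Here added comes out as ['__ge__', '__gt__', '__le__'], so an expression like Version(2) > Version(1) runs Python code that is not written anywhere in the class body.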
In practice things get worse, because what you've looked at is a
single point in time of your codebase; you only know that it's safe
today. Nothing fundamentally prevents someone from coming along later
and helpfully adding rich comparison operators to a class, which
invalidates your assumption that lst.sort() is an atomic operation
as far as the GIL is concerned and so needs no locking around it.
(If everyone involved is lucky, the class comments document that other code depends on the class not having rich comparison operators or whatever else you need it to not have.)
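One possible way to make such a dependency harder to miss than a comment is to check it at import time. This is only a sketch under assumptions: Point, LaterPoint, and uses_builtin_comparisons are hypothetical names, and the check is a heuristic that ignores metaclasses, __hash__, and whatever objects are stored inside each instance.

```python
RICH_COMPARISONS = ("__lt__", "__le__", "__eq__", "__ne__", "__gt__", "__ge__")


def uses_builtin_comparisons(cls, base):
    """Return True if cls still inherits every rich comparison from base.

    A heuristic only: it does not look at metaclasses, __hash__, or the
    elements stored inside each instance.
    """
    return all(
        getattr(cls, name) is getattr(base, name) for name in RICH_COMPARISONS
    )


class Point(tuple):
    """Hypothetical tuple subclass that inherits tuple's C-level ordering."""

    __slots__ = ()


# Fails loudly at import time if someone later adds, say, Point.__lt__.
assert uses_builtin_comparisons(Point, tuple)


class LaterPoint(Point):
    """Simulates a later, well-meaning change."""

    def __lt__(self, other):
        return tuple(self) < tuple(other)


still_safe = uses_builtin_comparisons(LaterPoint, tuple)
```

Here still_safe is False, which is the signal that code relying on a sort of these objects being uninterrupted needs another look.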
So what I've wound up feeling is that GIL safety is generally too complicated and tricky to use. Or perhaps it would be better to say that it's too fragile, since there are a vast number of ways to accidentally destroy it without realizing what you've done. If you actively want to use GIL safety to avoid explicit locking, it's probably going to be one of the trickier portions of your code and you should be very careful to document everything about it and to keep it as simple as possible (for example, using only primitive C-level types even if this requires contortions).
It's unfortunate that the GIL is this way, but it is (at least for now in CPython and thus probably for the future).
(In theory CPython could be augmented so that operations like
lst.sort() explicitly held the GIL for their entire duration so
that people wouldn't get surprised this way. But I suspect that the
CPython developers want people to use explicit locking, mutexes,
and so on, and not rely on hard-to-explain GIL guarantees that
constrain their implementation choices.)
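For comparison, here is roughly what the explicit locking approach looks like. This is a minimal sketch, not a recommendation of a particular design; SharedList and producer are hypothetical names, and it assumes all threads reach the list only through the wrapper.

```python
import threading


class SharedList:
    """Hypothetical wrapper: every access to the list takes one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def sort(self):
        # Holds up even if the items define Python-level comparisons,
        # because no other thread going through this wrapper can touch
        # _items until the lock is released.
        with self._lock:
            self._items.sort()

    def snapshot(self):
        with self._lock:
            return list(self._items)


shared = SharedList()


def producer(start):
    for value in range(start, start + 100):
        shared.append(value)


threads = [threading.Thread(target=producer, args=(n * 100,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

shared.sort()
result = shared.snapshot()
```

The correctness of this version no longer depends on what the stored objects do in their comparison methods. One caveat is that threading.Lock is not reentrant, so a comparison method that called back into the same SharedList would deadlock.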