2024-06-16
Understanding a Python closure oddity
Recently, Glyph pointed out a Python oddity on the Fediverse and I had to stare at it for a bit to understand what was going on, partly because some of my thinking is in Go these days, and Go has a different issue in similar code. So let's start with the code:
    def loop():
        for number in range(10):
            def closure():
                return number
            yield closure

    eagerly = [each() for each in loop()]
    lazily = [each() for each in list(loop())]
The oddity is that 'eagerly' and 'lazily' wind up different; the question is why.
The first thing that is going on in this Python code is that while 'number' is only used in the 'for' loop, it is an ordinary function local variable. We could set it before the loop and look at it after the loop if we wanted to, and if we did, it would be '9' at the end of the 'for' loop. The consequence is that every closure returned in the 'for' loop is using the same 'number' local variable.
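As a minimal sketch of the 'ordinary local variable' point (the function name 'show_loop_variable' and the starting value of -1 are made up for illustration, they are not from Glyph's example):

```python
def show_loop_variable():
    # 'number' is a plain local variable; we can set it before the loop ...
    number = -1
    for number in range(10):
        pass
    # ... and it is still around afterward, holding the last value it was given.
    return number

final_value = show_loop_variable()
print(final_value)  # 9
```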
(In some languages and in some circumstances, each closure would close over a different instance of 'number'; see for example this Go 1.22 change.)
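Python has no per-iteration loop variables, but you can get a similar effect by hand. One common approach, sketched here under the assumption that you want each function to remember the value from its own iteration, is to bind the value through a default argument. The name 'loop_with_own_values' is hypothetical, and strictly speaking the inner function no longer closes over 'number' at all; it has its own parameter that happens to have the same name.

```python
def loop_with_own_values():
    for number in range(10):
        # The default value is evaluated when this 'def' statement runs,
        # so each function captures the 'number' of its own iteration.
        def closure(number=number):
            return number
        yield closure

# Collect every function first, then call them, as 'lazily' does.
results = [each() for each in list(loop_with_own_values())]
print(results)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```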
Since all of the closures are using the same 'number' local variable, what matters for what value they return is when they are called. When you call any of them, it will return the value of 'number' that is in effect in the 'loop' function as of that moment. And if you call any of them after the 'loop' function has finished, 'number' has the value of '9'.
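If you want to see the sharing directly, CPython exposes what a function has closed over through its '__closure__' attribute, which is a tuple of cell objects. The following is only a sketch for poking at this; the variable names 'first', 'second', and so on are mine.

```python
def loop():
    for number in range(10):
        def closure():
            return number
        yield closure

g = loop()
first = next(g)
second = next(g)

# Each pass through the loop creates a new function object ...
distinct_functions = first is not second
# ... but both functions refer to the very same cell for 'number'.
same_cell = first.__closure__[0] is second.__closure__[0]
# The cell holds whatever 'number' currently is in the paused generator.
current = first.__closure__[0].cell_contents

print(distinct_functions, same_cell, current)  # True True 1
```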
This also means that if you call a single 'each' function more than once, the value it returns can be different. For example:
    >>> g = loop()
    >>> each0 = g.__next__()
    >>> each0()
    0
    >>> each1 = g.__next__()
    >>> each0()
    1
(What the 'loop()' call actually returns is a generator. I'm directly calling its magic method to be explicit, rather than using the more general 'next()'.)
And in a way this is the difference between 'eagerly' and 'lazily'. For 'eagerly', the list comprehension iterates through the results of 'loop()' and immediately calls each version of 'each' that it obtains, which gets the value of 'number' that is in effect right then. For 'lazily', the 'list(loop())' first collects all of the 'each' closures, which ends the 'for' loop in the 'loop' function and means 'number' is now '9'. Only then does the list comprehension call all of the 'each' closures, which all return the final value of 'number'.
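To make the ordering visible, here is the same computation with the two list comprehensions unrolled into explicit steps. This is a sketch that should behave the same as the original code; the intermediate name 'closures' is mine.

```python
def loop():
    for number in range(10):
        def closure():
            return number
        yield closure

# 'eagerly': the generator is paused at its 'yield' while we call each
# closure, so 'number' still has the value from that iteration.
eagerly = []
for each in loop():
    eagerly.append(each())

# 'lazily': list() runs the generator to completion first, so the 'for'
# loop inside loop() has finished before any closure is called.
closures = list(loop())
lazily = [each() for each in closures]

print(eagerly)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(lazily)   # [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
```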
The 'eagerly' and 'lazily' names may be a bit confusing (they were to me). What they refer to is whether we eagerly or lazily call each closure as it is returned by 'loop()'. In 'eagerly', we call the closures immediately; in 'lazily', we call them only later, after the 'for' loop is done and 'number' has taken on its final value. As Glyph said on the Fediverse, there is another level of eagerness or laziness, which is how aggressively we iterate the generator from 'loop()', and this is actually backward from the names; in 'eagerly' we lazily iterate the generator, while in 'lazily' we eagerly iterate the generator (that's what the 'list()' does).
(I'm writing this entry partly for myself, because someday I may run into an issue like this in my own Python code. If you only use a generator with code patterns like the 'eagerly' case, an issue like this could lurk undetected for some time.)