Python's philosophy of what closures close over
A commentator on my last entry on closures said:
I don't not fully understand why closures could not capture bindings upon being defined, though: [...]
As your first example shows, this is what many people intuitively expect, it seems. So, do you know why this behavior was not chosen?
I think the real answer is 'because how Python does it is how languages usually do it'. Let me elaborate on that.
When you add closures to a language, you are faced with a design choice (or a philosophical question). Consider the following code:
def dog(a): def cat(b): return a + b return cat
Here a
is a 'free variable' in cat
, and when the closure is created
it must be bound to something. The question is what it binds to: does
it bind to the variable a
, or does it bind to the value of a
at
some point in time?
(Given a suitable choice of when a closure is created, I don't think that there's a right or a wrong answer here, just different ones.)
Like most languages with closures, Python has picked the first choice; free variables are bound to variables, not to values. There are pragmatic reasons for choosing this, including that it is most compatible with how global variables are treated in ordinary functions, but ultimately it is a choice. I don't think there's anything in Python's other semantics that require it.
Making this choice creates a second one, which is what a variable's
'lifetime' is; when does a
stop being a
and become another variable
(even one with the same name)? This is not quite the same as the scope
of a
, which is conventionally more or less when the name a
is
defined. To see the difference, consider the following pseudo-code:
nums := []; for (var j, i = 0; i < 100; i++) { j = derivefrom(i) nums.append(lambda x: j+x); }
In this code the scope of j
is the for
loop and the name is
undefined outside of it. However we have two choices for the lifetime of
j
. We could say that j
is the same variable for every pass through
the for loop body or we could say that each separate pass through
the loop body creates a logically separate version of j
(which we
could call j0
, j1
, and so on). In the latter
case each lambda binds to a separate version of j
, the version that
was live during its loop. In the former case all lambdas bind to a
single version (and the version has whatever value came back from
'derivefrom(99)
').
This example probably seems artificial. Now consider the same question recast as:
nums = (lambda x: x+i for i in range(0,100))
The scope of i
is the inside of the generator expression and it
is undefined outside of it, but does every separate invocation of
the generator's body use the same i
or does each invocation get a
logically separate i
? Again this is a design choice and I think that
you could answer either way. Python chooses to say 'every invocation of
the generator's body uses the same i
'. What happens to the lambdas
then follows logically.
(If nothing else, this makes the interpreter's implementation simpler.)
PS: with very careful use of scopes you can make what I'm calling
variable lifetime here into scope lifetime, even with each loop body
getting its own version of j
. However, it requires a somewhat
convoluted definition of how loops and loop index variables work. (And
that is well beyond the scope of this entry.)
Sidebar: a pragmatic problem with 'bind to value' in Python
The problem is that def
is an executable statement. This means that the closure is actually
created in dog
not when the line 'return cat
' is executed but
when 'def cat ...'
occurs in the source. You can see this directly
by disassembling the bytecode in a version of dog
where there's a
statement between the definition of cat
and the return
.
This means that if closures bound to values instead of variables you might well have to move inner function definitions down in your function so that they were intermixed with actual code, because you would need to insure that they were only defined (and the closure created) after all of the variables they needed had been set up. And (as I saw mentioned somewhere) you couldn't have mutually recursive closures without awkward hacks.
|
|