Wandering Thoughts archives

2011-10-07

Python's philosophy of what closures close over

A commentator on my last entry on closures said:

I don't fully understand why closures could not capture bindings upon being defined, though: [...]

As your first example shows, this is what many people intuitively expect, it seems. So, do you know why this behavior was not chosen?

I think the real answer is 'because how Python does it is how languages usually do it'. Let me elaborate on that.

When you add closures to a language, you are faced with a design choice (or a philosophical question). Consider the following code:

def dog(a):
  def cat(b):
    return a + b
  return cat

Here a is a 'free variable' in cat, and when the closure is created it must be bound to something. The question is what it binds to: does it bind to the variable a, or does it bind to the value of a at some point in time?

(Given a suitable choice of when a closure is created, I don't think that there's a right or a wrong answer here, just different ones.)

Like most languages with closures, Python has picked the first choice; free variables are bound to variables, not to values. There are pragmatic reasons for choosing this, including that it is most compatible with how global variables are treated in ordinary functions, but ultimately it is a choice. I don't think there's anything in Python's other semantics that requires it.
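To make the consequences of this concrete, here's a minimal sketch (my example, not the commentator's) where a is rebound after cat is defined; because cat binds to the variable a itself, it sees the rebinding:

def dog(a):
  def cat(b):
    return a + b
  a = a * 10      # rebind a after cat has been defined
  return cat

print dog(1)(5)   # prints 15, not 6; cat reads a's value at call time

(If closures bound to the value of a at definition time instead, this would print 6.)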

Making this choice creates a second one, which is what a variable's 'lifetime' is; when does a stop being a and become another variable (even one with the same name)? This is not quite the same as the scope of a, which is conventionally more or less where the name a is defined. To see the difference, consider the following pseudo-code:

nums := [];
for (var j, i = 0; i < 100; i++) {
  j = derivefrom(i)
  nums.append(lambda x: j+x);
}

In this code the scope of j is the for loop and the name is undefined outside of it. However, we have two choices for the lifetime of j. We could say that j is the same variable for every pass through the for loop body, or we could say that each separate pass through the loop body creates a logically separate version of j (which we could call j0, j1, and so on). In the latter case each lambda binds to a separate version of j, the version that was live during its pass through the loop. In the former case all lambdas bind to a single version (and that version has whatever value came back from 'derivefrom(99)').

This example probably seems artificial. Now consider the same question recast as:

nums = (lambda x: x+i for i in range(0,100))

The scope of i is the inside of the generator expression and it is undefined outside of it, but does every separate invocation of the generator's body use the same i or does each invocation get a logically separate i? Again this is a design choice and I think that you could answer either way. Python chooses to say 'every invocation of the generator's body uses the same i'. What happens to the lambdas then follows logically.

(If nothing else, this makes the interpreter's implementation simpler.)
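You can see the outcome of this choice directly (a quick sketch; every lambda winds up seeing i's final value):

adders = list(nums)
print adders[0](0)    # prints 99, not 0
print adders[50](0)   # also prints 99; all the lambdas share the same i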

PS: with very careful use of scopes you can make what I'm calling variable lifetime here into scope lifetime, even with each loop body getting its own version of j. However, it requires a somewhat convoluted definition of how loops and loop index variables work. (And that is well beyond the scope of this entry.)

Sidebar: a pragmatic problem with 'bind to value' in Python

The problem is that def is an executable statement. This means that the closure is actually created in dog not when the line 'return cat' is executed but when the 'def cat ...' statement is executed. You can see this directly by disassembling the bytecode in a version of dog where there's a statement between the definition of cat and the return.
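For example, here's a sketch (the exact opcode names vary between CPython versions):

import dis

def dog(a):
  def cat(b):
    return a + b
  x = a         # a statement between the definition and the return
  return cat

dis.dis(dog)
# The LOAD_CLOSURE and MAKE_CLOSURE opcodes that actually build the
# closure appear at the 'def cat' line, before the bytecode for 'x = a'
# and well before 'return cat'.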

This means that if closures bound to values instead of variables you might well have to move inner function definitions down in your function so that they were intermixed with actual code, because you would need to ensure that they were only defined (and the closure created) after all of the variables they needed had been set up. And (as I saw mentioned somewhere) you couldn't have mutually recursive closures without awkward hacks.

WhatClosuresClose written at 02:42:01

2011-10-03

On CPython, cell objects, and closures

A commentator on Understanding a tricky bit of Python generators noticed an odd-looking thing with generator expressions:

adders = list(lambda x: x+i for i in range(0, 100))
print adders[17](42)

This prints '141'. On the one hand this is quite ordinary (it's just like other cases of lambdas), but on the other hand things start looking quite peculiar when you disassemble the generated bytecode for the lambda expressions and look at the nominal local variables. If we look at adders[0].__closure__[0] we see what is printed as something like <cell at 0xb720e1dc: int object at 0x97ced74>. The commentator noted:

So when the generator yields its second element, the second lambda function that is constructed, though it is a distinct object itself, uses the same closure cell as the first one does. [...]

I confess I don't understand why the CPython interpreter is doing this. [...] But instead, the same closure cell object is being used each time, though I haven't been able to find where, if anywhere, this closure cell object is stored inside the generator.

To understand what was going on I wound up reading the CPython source code and then thinking about the problem a bit. So let's backtrack to talk about a simpler example:

def mkadder(i):
  def adder(x):
    return x+i
  print i
  return adder

In a normal function, one that does not create a closure, i would be a local variable and so would be stored in the local variable array in the function's stack frame. Such local variables do not have an independent existence from the stack frame itself; when you set or retrieve their value, you are actually doing 'set local variable slot N' or 'get local variable slot N'. The local variable names are retained only for debugging purposes.

However, this presents a problem when we create a closure; because the closure needs to access i, we need to keep i alive after the function has exited. If i were still stored in the stack frame we would have to keep the entire stack frame alive. Even if we dropped all local variables that were not referred to by the closure, we'd still be hauling around an extra stack frame that we don't really need (or possibly several stack frames, because you can nest closure-creating functions).

So what CPython does is that it silently converts i from a local variable into an independent cell object. A cell object is nothing more than a holder for a reference to another object; it is essentially a variable without a name. Because the cell object has an independent existence from mkadder's stack frame it can be incorporated into the adder closure by itself (more specifically, it can be referred to by the closure).
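You can poke at this from the interpreter (a sketch; the addresses in the reprs will vary):

f = mkadder(10)
print f.__closure__                   # a 1-tuple holding the cell for i
print f.func_code.co_freevars         # ('i',): names adder gets from mkadder
print mkadder.func_code.co_cellvars   # ('i',): names mkadder turns into cells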

You might ask why CPython goes through all of this extra work to add a layer of indirection to the closure's reference to i. After all, why not simply capture i's current binding? The answer is that this would be inconsistent with the treatment of global variables and it would introduce oddities even for local variables. For example, consider the following code:

def mkpadder(i, j):
  def adder(x):
    return x+i
  a = adder
  i = i + j
  return a

t = mkpadder(1, 10)
print t(1)

If closures immediately captured the binding of their closed-over variables this would print 2, while a slight variant version that had 'return adder' instead would print 12. Or maybe both versions should print 2, because the closure would be created the moment that adder is defined in mkpadder instead of when it must be materialized (in either 'a = adder' or 'return adder').

So what is happening in the generator is that the loop variable i is a cell object instead of a local variable, because it is incorporated into the lambda closure. Every lambda closure gets a reference to the same cell object, and as the for loop runs the value of the cell object (ie, the object it points to) keeps changing until at the end it has the value 99. The repr() of cell objects contains both their actual address (which is constant) and some information about the object that they currently point to (which may change).
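A quick sketch of verifying this (the reprs will vary):

adders = list(lambda x: x+i for i in range(0, 100))
print adders[0].__closure__[0] is adders[99].__closure__[0]   # True: one shared cell
print adders[17](42)   # prints 141: 42 plus the cell's final value of 99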

You can see the difference between cell objects and local variables in the bytecode. Cell variables use LOAD_DEREF and STORE_DEREF bytecodes; local variables use LOAD_FAST and STORE_FAST.
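For instance, disassembling adder gives something like this (a sketch of CPython 2 bytecode for 'return x+i'):

import dis
dis.dis(mkadder(0))
#   LOAD_FAST   'x'     x is an ordinary local variable
#   LOAD_DEREF  'i'     i is read indirectly through its cell
#   BINARY_ADD
#   RETURN_VALUE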

(Trivia: a long time ago I confidently asserted that the reason you couldn't write to closed-over variables in a closure was that the CPython bytecode had no operation to store things into such variables, although it could read from them. As we can see from the existence of STORE_DEREF, I was completely wrong. CPython just doesn't do it for its own reasons.)

Sidebar: why CPython doesn't use a real namespace for this

Given that cell objects are a lot of work essentially to fake a namespace, one plausible alternate approach is for CPython to create a real namespace for closed over variables instead. There are a couple of reasons to avoid this.

First, it wouldn't necessarily be one namespace. A single function can create multiple closures, each of which can use a different subset of the function's local variables. Plus you have the issue of closures that themselves create more closures that use a subset of the closed-over variables.

Second, it would be less efficient than the current approach. Right now, pointers to the cell objects for any particular chunk of code are stored in an array (basically just like local variables) and are set and retrieved by index into the array, instead of by name lookups in a namespace dictionary. This makes referring to closed-over variables only slightly slower than referring to local variables (it adds an extra dereference through the cell object). Changing to name lookups in a dictionary (as if they were global variables) would be considerably slower.

CPythonCellsClosures written at 22:11:25

2011-10-01

Understanding a tricky bit of Python generators

From this Python quiz's third question, consider the following code:

units = [1, 2]
tens = [10, 20]
nums = (a + b for a in units for b in tens)
units = [3, 4]
tens = [30, 40]
print nums.next()

This is a far more interesting quiz question than you might think, because what's going on is actually quite deep.

One of the things that gets people in Python is that it is in general what I've called a 'late binding' language; when you write an expression whose execution is deferred, the values of the variables it uses are not immediately captured. Instead they will be looked up when the code is actually executed. This shows up in the quiz's first question, for example. Under a straightforward interpretation of late binding you might expect the code here to print 33. Instead it prints 31; tens is late binding, referring to its current value, but units has been bound immediately.

You may now be going 'say what?' So let me make your day a little bit more surreal:

units = [1, 2]
tens = [10, 20]
nums = (a + b for a in units for b in tens)
units.pop(0)
tens = [30, 40]
print nums.next()

This prints 32.

To explain this, let me quote from the language specification:

Variables used in the generator expression are evaluated lazily when the __next__() method is called for the generator object (in the same fashion as normal generators). However, the leftmost for clause is immediately evaluated, [...]

However, 'evaluates' here does not mean what you might think. When Python 'evaluates' units in the 'for a in units' clause, it doesn't make a private copy of the list's value; instead, it creates an iterator object from the list. This iterator object is what the for loop actually loops over, and it internally has a reference to the original list that units was bound to. The first version of this question rebinds units but leaves the original list (now accessible only through the iterator) unaltered. nums.next() thus uses the first element of the original list as a. The second version mutates the original list by deleting the first element, so nums.next() winds up using '2' as a.

(In both cases the second for loop is un-evaluated until the generator begins operation, so it picks up the new binding for tens.)
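One way to think about it is that the generator expression behaves roughly like this hand-written generator (a sketch; the real version is an anonymous code object and its hidden argument is called '.0', not 'it'):

def genexpr(it):    # hypothetical name; CPython's version is nameless
  for a in it:      # 'it' was captured when the expression was evaluated
    for b in tens:  # 'tens' is looked up only when this line runs
      yield a + b

nums = genexpr(iter(units))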

I am in admiration of how deep this rabbit hole turned out to be once I actually started looking down it.

Sidebar: seeing this in the bytecode

To confirm what's happening, let's disassemble nums.gi_frame.f_code, the actual bytecode of the generator (I've somewhat simplified the bytecode disassembly syntax):

 0 SETUP_LOOP           (to 42)
 3 LOAD_FAST   '.0'
 6 FOR_ITER             (to 41)
 9 STORE_FAST  'a'
12 SETUP_LOOP           (to 38)
15 LOAD_GLOBAL 'tens'
18 GET_ITER
19 FOR_ITER             (to 37)
[...]

As we can see here, there is no reference to units; instead we do a load of a local variable called .0. If we check nums.gi_frame.f_locals['.0'], we'll see that this local variable is a 'listiterator' object. By contrast, tens is loaded explicitly as a global and then immediately turned into an iterator so that it can be looped over.

Note that you can get truly odd behavior by rebinding tens after you've called nums.next() once (or otherwise after you've invoked the generator once). This is because every time we go through the outer loop, the current binding of tens is re-captured into an iterator for the inner loop. Further rebinding of tens has a delayed effect; it takes effect on the next pass around the outer loop.

(Mutation of the current binding has an immediate effect, but is then lost if you've rebound tens as well.)
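Here's a sketch of the delayed effect in action:

units = [1, 2]
tens = [10, 20]
nums = (a + b for a in units for b in tens)
print nums.next()   # 11: a is 1 and the inner loop iterates the original tens
tens = [30, 40]
print nums.next()   # 21: still the first outer pass, so still the old iterator
print nums.next()   # 32: the outer loop advances and re-captures tens as [30, 40]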

PS: you can make similar crazy things happen in conventional for loops, because the same object-to-iterator transformation is happening.
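As a quick sketch, rebinding the list that a conventional for loop is iterating over has no effect, because the loop holds an iterator over the original object:

units = [1, 2, 3]
for a in units:
  print a,          # prints 1 2 3 despite the rebinding below
  units = [10, 20]  # rebinds the name; the loop's iterator is unaffected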

TrickyPythonBinding written at 01:30:19

