== Python's philosophy of what closures close over A commentator on [[my last entry on closures CPythonCellsClosures]] said: > I don't not fully understand why closures could not capture bindings > upon being defined, though: [...] > > As your first example shows, this is what many people intuitively > expect, it seems. So, do you know why this behavior was not chosen? I think the real answer is 'because how Python does it is how languages usually do it'. Let me elaborate on that. When you add closures to a language, you are faced with a design choice (or a philosophical question). Consider the following code: > def dog(a): > def cat(b): > return a + b > return cat Here _a_ is a 'free variable' in _cat_, and when the closure is created it must be bound to something. The question is what it binds to: does it bind to the *variable* _a_, or does it bind to the *value* of _a_ at some point in time? (Given a suitable choice of when a closure is created, I don't think that there's a right or a wrong answer here, just different ones.) Like most languages with closures, Python has picked the first choice; free variables are bound to variables, not to values. There are pragmatic reasons for choosing this, including that it is most compatible with how global variables are treated in ordinary functions, but ultimately it is a choice. I don't think there's anything in Python's other semantics that require it. Making this choice creates a second one, which is what a variable's 'lifetime' is; when does _a_ stop being _a_ and become another variable (even one with the same name)? This is not quite the same as the scope of _a_, which is conventionally more or less when the name _a_ is defined. To see the difference, consider the following pseudo-code: > nums := []; > for (var j, i = 0; i < 100; i++) { > j = derivefrom(i) > nums.append(lambda x: j+x); > } In this code the scope of _j_ is the _for_ loop and the name is undefined outside of it. However we have two choices for the lifetime of _j_. We could say that _j_ is the same variable for every pass through the for loop body or we could say that each separate pass through the loop body creates a logically separate version of _j_ (which we could call _j{{ST:sub:0}}_, _j{{ST:sub:1}}_, and so on). In the latter case each lambda binds to a separate version of _j_, the version that was live during its loop. In the former case all lambdas bind to a single version (and the version has whatever value came back from '_derivefrom(99)_'). This example probably seems artificial. Now consider the same question recast as: > _nums = (lambda x: x+i for i in range(0,100))_ The scope of _i_ is the inside of the generator expression and it is undefined outside of it, but does every separate invocation of the generator's body use the same _i_ or does each invocation get a logically separate _i_? Again this is a design choice and I think that you could answer either way. Python chooses to say 'every invocation of the generator's body uses the same _i_'. What happens to the lambdas then follows logically. (If nothing else, this makes the interpreter's implementation simpler.) PS: with very careful use of scopes you can make what I'm calling variable lifetime here into scope lifetime, even with each loop body getting its own version of _j_. However, it requires a somewhat convoluted definition of how loops and loop index variables work. (And that is well beyond the scope of this entry.) === Sidebar: a pragmatic problem with 'bind to value' in Python The problem is that [[_def_ is an executable statement FunctionDefinitionOrder]]. This means that the closure is actually created in _dog_ not when the line '_return cat_' is executed but when '_def cat ...'_ occurs in the source. You can see this directly by disassembling the bytecode in a version of _dog_ where there's a statement between the definition of _cat_ and the _return_. This means that if closures bound to values instead of variables you might well have to move inner function definitions down in your function so that they were intermixed with actual code, because you would need to insure that they were only defined (and the closure created) after all of the variables they needed had been set up. And (as I saw mentioned somewhere) you couldn't have mutually recursive closures without awkward hacks.