== Understanding a tricky bit of Python generators

From [[this Python quiz http://web.mit.edu/rwbarton/www/python.html]]'s third question, consider the following code:

> units = [1, 2]
> tens = [10, 20]
> nums = (a + b for a in units for b in tens)
> units = [3, 4]
> tens = [30, 40]
> print nums.next()

This is a far more interesting quiz question than you might think, because what's going on is actually quite deep.

One of the things that gets people in Python is that it is in general what [[I've called LateBindingSuper]] a 'late binding' language; when you write an expression whose execution is deferred, the values of the variables it uses are not immediately captured. Instead they will be looked up when the code is actually executed. This shows up in [[the quiz http://web.mit.edu/rwbarton/www/python.html]]'s first question, for example.

A straightforward interpretation of late binding might expect the code here to print 33. Instead it prints 31; _tens_ is late binding, referring to the current value, but _units_ has been bound immediately. You may now be going 'say what?' So let me make your day a little bit more surreal:

> units = [1, 2]
> tens = [10, 20]
> nums = (a + b for a in units for b in tens)
> units.pop(0)
> tens = [30, 40]
> print nums.next()

This prints _32_.

To explain this, let me quote from [[the language specification http://docs.python.org/reference/expressions.html#grammar-token-generator_expression]]:

> Variables used in the generator expression are evaluated lazily when
> the _``__next__()''_ method is called for generator object (in the
> same fashion as normal generators). However, ~~the leftmost for clause
> is immediately evaluated~~, [...]

However, 'evaluates' here does not mean what you might think. When Python 'evaluates' _units_ in the '_for a in units_' clause, it doesn't make a private copy of the list's value; instead, it creates an iterator object from the list.
This iterator object is what the for loop actually loops over, and it internally has a reference to the original list that _units_ was bound to. The first version of this question rebinds _units_ but leaves the original list (now accessible only through the iterator) unaltered. _nums.next()_ thus uses the first element of the original list as _a_. The second version mutates the original list by deleting the first element, so _nums.next()_ winds up using '2' as _a_.

(In both cases the second _for_ loop is un-evaluated until the generator begins operation, so it picks up the new binding for _tens_.)

I am in admiration of how deep this rabbit hole turned out to be once I actually started looking down it.

=== Sidebar: seeing this in the bytecode

To confirm what's happening, let's disassemble ((nums.gi_frame.f_code)), the actual bytecode of the generator (I've somewhat simplified the bytecode disassembly syntax):

>  0 SETUP_LOOP      (to 42)
>  3 LOAD_FAST       '.0'
>  6 FOR_ITER        (to 41)
>  9 STORE_FAST      'a'
> 12 SETUP_LOOP      (to 38)
> 15 LOAD_GLOBAL     'tens'
> 18 GET_ITER
> 19 FOR_ITER        (to 37)
> [...]

As we can see here, there is no reference to _units_; instead we do a load of a local variable called _.0_. If we check ((nums.gi_frame.f_locals['.0'])), we'll see that this local variable is a 'listiterator' object. By contrast, _tens_ is loaded explicitly as a global and then immediately turned into an iterator so that it can be looped over.

Note that you can get truly odd behavior by rebinding _tens_ after you've called _nums.next()_ once (or otherwise after you've invoked the generator once). This is because every time we go through the outer loop, the current binding of _tens_ is re-captured into an iterator for the inner loop. Further rebinding of _tens_ has a delayed effect; it takes effect on the next pass around the outer loop. (Mutation of the current binding has an immediate effect, but is then lost if you've rebound _tens_ as well.)

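
The delayed effect of rebinding _tens_ is easier to believe once you've watched it happen, so here is a minimal sketch. It makes some assumptions that aren't in the original code: it is written for Python 3 (so it uses _next(nums)_ instead of _nums.next()_), and it wraps everything in two hypothetical helper functions, _rebinding_demo_ and _mutation_demo_, so that it can be run as is. Inside a function _tens_ is a closure variable instead of a global, but as far as I know it is looked up just as late.

```python
def rebinding_demo():
    """Rebind tens after the generator has already started running."""
    units = [1, 2]
    tens = [10, 20]
    nums = (a + b for a in units for b in tens)

    results = [next(nums)]      # a = 1; the inner loop starts iterating [10, 20]
    tens = [30, 40]             # rebind; the running inner iterator is unaffected
    results.append(next(nums))  # still a = 1, second element of the old list
    results.extend(nums)        # the next outer pass re-reads tens and sees [30, 40]
    return results


def mutation_demo():
    """Mutate the list the inner loop is walking, then rebind tens."""
    units = [1, 2]
    tens = [10, 20]
    nums = (a + b for a in units for b in tens)

    results = [next(nums)]      # a = 1, b = 10
    tens.append(50)             # mutation is seen immediately by the inner iterator
    tens = [30, 40]             # the rebinding only matters on the next outer pass
    results.extend(nums)
    return results


print(rebinding_demo())  # [11, 21, 32, 42]
print(mutation_demo())   # [11, 21, 51, 32, 42]
```

In the second function the appended 50 shows up while _a_ is still 1, and then vanishes for _a_ = 2 because by then _tens_ refers to a different list.
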
PS: you can make similar crazy things happen in conventional _for_ loops, because the same object-to-iterator transformation is happening.
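
As a rough illustration of the _for_ loop version, here is a small sketch in Python 3. The three function names are hypothetical helpers that I've made up for the example. The first one calls _iter()_ by hand, which is approximately what a _for_ statement does behind your back; the other two repeat the rebind versus mutate experiment inside an ordinary loop.

```python
def explicit_iterator_demo():
    """Create the iterator by hand, then rebind the name."""
    units = [1, 2]
    it = iter(units)   # the iterator keeps a reference to the original list
    units = [3, 4]     # rebinding the name does not touch that list
    return next(it)


def rebind_in_loop():
    """Rebind the loop's source name while the loop is running."""
    units = [1, 2, 3]
    seen = []
    for a in units:
        seen.append(a)
        units = [99]   # the loop keeps walking the original list
    return seen


def mutate_in_loop():
    """Mutate the list being looped over while the loop is running."""
    units = [1, 2, 3]
    seen = []
    for a in units:
        seen.append(a)
        if a == 1:
            units.pop(0)   # the list is now [2, 3]; the iterator is at index 1
    return seen


print(explicit_iterator_demo())  # 1
print(rebind_in_loop())          # [1, 2, 3]
print(mutate_in_loop())          # [1, 3]
```

The last result is the surprising one: deleting the first element shifts everything down by one, so the list iterator (which only remembers an index) skips over 2.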