== Explaining a piece of deep weirdness with Python's _exec_

In an update in [[yesterday's entry on scopes and what bytecodes they use ScopesAndOpcodes]] I said that the special ``*_NAME'' opcodes could wind up being used in functions but that it would take an entire entry to explain when and why, and I also included a trivia bonus that I need to explain. It will probably not surprise you to know that these two things turn out to be intimately connected.

To start with, here is an altered version of [[yesterday's bonus trivia ScopesAndOpcodes]] that adds a second oddity:

> def geta():
>     return a
>
> def exec_and_return(codeobj, x):
>     exec codeobj
>     return x, geta(), a
>
> a = 5
> co = compile("a = a + x; x = a", '', 'exec')
>
> print exec_and_return(co, 4)
> print a

This prints '_(9, 5, 9)_' and then '_5_'.

> ~~Wait, what?~~

Before I start trying to explain this, let's talk about all of the things that are wrong with this result. First off, if the _compile()_ code was simply inline in place of the _exec_ we would get an UnboundLocalError since _a_ is used before it's assigned to. Second, if the _a_ was instead being treated as just a plain global its global value should change and we can see that it doesn't, either while the function is running or after it finishes; at the same time, the global value of _a_ was clearly used to compute the result. Finally, something that looks a lot like a new local _a_ is visible in the function with the value that we expect (the same value as _x_). (You can also verify that _a_ appears in the dictionary returned by _locals()_.)

The first thing going on here is what happens inside the compiled code as _exec_ runs it. Recall that NAME opcodes essentially treat a variable as global until it is assigned to, at which point it becomes a local, and that _compile()_ generates NAME opcodes.
This means that in the compiled _co_ code object, the value of _a_ is first read from the global _a_ but then the assignment creates a local _a_ (in the stack frame that _exec_ uses to run the code); it is this local _a_ that is then assigned to _x_. This explains why the global _a_ never changes value; it is never written to, despite appearances. What the compiled code really means is something like '_al = a + x; x = al_'. (This is the entire explanation for yesterday's trivia contest.)

The second thing going on is that ~~CPython tries quite hard to let _exec_'d code create new local variables in functions~~. It does this by changing what bytecodes get generated for references to variables that aren't definitely locals. Under normal circumstances CPython decides that you clearly mean a global variable and compiles to bytecodes that use the ``*_GLOBAL'' family of opcodes to access things. However, if you use _exec_ in your function CPython decides that such references are unclear and compiles them to ``*_NAME'' opcodes instead, since you could also be trying to access new local variables that will be created inside the code run by _exec_.

(Since NAME opcodes look first at locals and then secondly at globals, this will still work if you are genuinely referring to a global variable. The example contains an instance of this in our call to _geta()_, as you can see by [[disassembling the bytecode http://docs.python.org/library/dis.html]] for ((exec_and_return)).)

But just compiling such references to NAME opcodes isn't enough to do the job by itself. Because NAME opcodes explicitly look at the frame's ((f_locals)) dictionary, you need a real dictionary that really does hold (some) local variables.
This means that ~~a function that uses _exec_ effectively has two sets of local variables~~; it has the known [[fast local variables WhyLocalVarsAreFast]], stored in the local variable array and accessed with FAST opcodes, and then also any additional variables that were materialized by _exec_'d code, stored in the frame locals dictionary and accessed with NAME opcodes (if they're accessed at all).

(Since I was just checking this: while _exec_ does wind up creating a new frame to run the compiled code in, as far as I can tell it does not create a new frame locals dictionary to go with it. Instead it directly reuses the frame locals dictionary of the current frame. The [[fast local variables]] are synchronized into the frame locals dictionary before it's used and _exec_ [[imperfectly ExecScopeHandlingBug]] copies changes to them back to the local variables array after the code finishes running.)

Frankly, all of this is confusing and arcane and just goes to show how much of an impact trying to support executing arbitrary code in the scope of a function has on your language (at least in the face of various sorts of important optimizations). We are up to one bug, conditional bytecode generation with unusual semantics, and several pieces of weirdness.
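
(As a hedged illustration of the NAME opcode behavior described earlier, here is a minimal sketch of just the compiled-code half of the story. It assumes Python 3, where _exec_ is a function rather than a statement, so it does not reproduce the function-level behavior of the example above. The ((fake_globals)) and ((fake_locals)) dictionaries are hypothetical stand-ins for the module globals and the frame locals dictionary.)

```python
# Python 3 sketch (an assumption; the entry itself is about the Python 2 exec statement).
import dis

# Same source string as the entry's example; code compiled in 'exec' mode
# accesses its variables with NAME opcodes.
co = compile("a = a + x; x = a", "<sketch>", "exec")
opnames = {ins.opname for ins in dis.get_instructions(co)}

# Hypothetical stand-ins for the global namespace and the frame locals dictionary.
fake_globals = {"a": 5}
fake_locals = {"x": 4}

# With separate dictionaries, NAME loads check the locals first and then the
# globals, while NAME stores always go into the locals dictionary.
exec(co, fake_globals, fake_locals)

# Which NAME opcodes the compiled code uses.
print(sorted(name for name in opnames if name.endswith("_NAME")))
# The new 'local' a, the updated x, and the untouched global a.
print(fake_locals["a"], fake_locals["x"], fake_globals["a"])
```

After this runs, the locals dictionary has gained an _a_ of 9 and _x_ has become 9, while the _a_ in the globals dictionary is still 5. This is the same 'read the global, write a local' pattern that the compiled code shows in the original example.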