2009-09-22
Some trivia about Python frame objects
Since I've been poking around in this area of CPython lately, here's some trivia associated with frame objects.
First, one might wonder if code executing at the module level literally
has a CPython frame struct with f_locals being the same as
f_globals, or if the C code just leaves f_locals null and fixes
things up behind the scenes when you look from Python code. The
answer turns out to be that CPython frame structures always have
a real f_locals (Python) dictionary, and in module level code
it is just another reference to the globals as you'd expect.
This surprised me because I had always assumed that f_locals
dictionaries for function-level frames were only materialized on
(rare) demand. Instead there's always a dictionary, although its
contents are allowed to get out of date with the real local variables as long as no Python code is looking at it.
(There are various interesting efficiency hacks to make creation and destruction of frame objects faster than you might expect.)
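The module-level claim can be checked from Python itself. The following is a minimal sketch rather than a look at the C struct: it runs a small piece of source as module-level code through exec() with a single namespace dictionary and compares the frame's two dictionaries. The names module_source and namespace are made up for illustration.

```python
import sys

# Source to run as module-level code; sys._getframe() returns the
# frame that is executing it.
module_source = """
import sys
frame = sys._getframe()
same_dict = frame.f_locals is frame.f_globals
"""

namespace = {}
exec(module_source, namespace)
del namespace["frame"]  # drop the frame reference to avoid a reference cycle

# True when the module-level f_locals is just another reference to f_globals.
print(namespace["same_dict"])
```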
Second and probably obviously: there's no way to directly update the
builtins namespace. This is both a language issue (there's no builtin
analog to the global keyword) and an issue with the bytecode
interpreter; in order to support direct updates
of the builtin namespace, the bytecode interpreter would need a new
STORE_BUILTIN opcode, as STORE_GLOBAL naturally only updates the
f_globals namespace.
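The bytecode side of this can be seen with the standard dis module. This is a small sketch; the function set_global and the global counter are hypothetical names used only for the demonstration.

```python
import dis

def set_global():
    # 'global' makes the assignment compile to STORE_GLOBAL.
    global counter
    counter = 1

# Collect the opcode names the compiler generated for the function.
opnames = [ins.opname for ins in dis.get_instructions(set_global)]

has_store_global = "STORE_GLOBAL" in opnames
# dis.opmap maps every known opcode name to its number.
has_store_builtin = "STORE_BUILTIN" in dis.opmap

print(has_store_global)
print(has_store_builtin)
```

The assignment compiles to STORE_GLOBAL, and the opcode table has no STORE_BUILTIN entry at all.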
Now, consider the following code:
eval = eval
def t():
    global file
    file = file
This peculiar do-nothing code actually does do something: it creates
module-level shadows of the eval and file builtins. This is one of
the least obvious cases in which the left hand side of an assignment can
be in a different namespace than the right hand side despite using the
exact same variable names. (Or not, if you've already executed these
statements once.)
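The shadowing can be observed by running similar code in a fresh namespace. This is a sketch under some assumptions: it uses len and max instead of eval and file (file is not a builtin in Python 3), and the names source and namespace are illustrative.

```python
try:
    import builtins                  # Python 3
except ImportError:
    import __builtin__ as builtins   # Python 2

source = """
len = len
def t():
    global max
    max = max
"""

namespace = {}
exec(source, namespace)

# The module-level assignment has already created a shadow of len,
# but max is only shadowed once t() actually runs.
len_shadowed = "len" in namespace
max_before_call = "max" in namespace
namespace["t"]()
max_after_call = "max" in namespace

print(len_shadowed, max_before_call, max_after_call)
# The shadows are the very same objects as the builtins.
print(namespace["len"] is builtins.len, namespace["max"] is builtins.max)
```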
(Previously I said that there were only two cases of such namespace differences; I was wrong then, or at least insufficiently perverse.)
2009-09-21
Exploring the frame object f_builtins member
As I noted in passing, Python frame objects also
have a vaguely mysterious f_builtins member. On one level, frame
objects have this member because they are more or less representations
of the CPython (code) frame structure, and the C-level code frame
structure has an f_builtins field. So, what is this field?
(Quite a lot of the Python internal objects work this way; they have the members that they do mostly because they're Python representations of C structures.)
We can say that Python searches for names in three namespaces, those
being the function locals, the (module) globals, and finally the
builtins. The f_builtins field points to the dictionary of this
frame's builtin namespace. Normally the builtins namespace is the same
as the __builtin__ module's namespace, but it doesn't have to be;
you can manipulate it under certain circumstances.
(It is technically inaccurate to say that CPython searches for names in three namespaces, because CPython actually knows in advance whether a particular name is a function local variable or not.)
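One way to see that the decision is made at compile time is to look at a function's code object, which records local variable names separately from the names that are looked up in globals and builtins. A minimal sketch, with a hypothetical function f:

```python
def f(x):
    y = x + 1
    return len(str(y))

code = f.__code__

# Names the compiler decided are function locals (arguments included).
local_names = set(code.co_varnames)
# Names that will be looked up in the globals and then the builtins.
other_names = set(code.co_names)

print(sorted(local_names))
print(sorted(other_names))
```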
The directly accessible way to do this is to use eval() or exec and specify
a 'globals' dictionary with a __builtins__ member. If present,
this becomes the builtins for the code, shows up in f_builtins
in frames, and so on. Any code frame with a non-standard value for
f_builtins is a 'restricted' frame, and various bits of the CPython
innards behave differently (usually they forbid various operations,
for example setting attributes on classes). In turn all of this
seems to be present to support the now-deprecated rexec.py module, which attempts to (you
guessed it) restrict what some untrusted Python code can do.
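Supplying your own builtins looks roughly like the following. This sketch is written for a current Python 3, so it only illustrates the namespace substitution and not the restricted-frame behaviour described above; custom_builtins and its answer entry are made up for the example.

```python
import sys

# A tiny stand-in for the builtins namespace.
custom_builtins = {"answer": 42}
namespace = {"__builtins__": custom_builtins, "sys": sys}

# Names not found in the globals are looked up in our custom builtins.
value = eval("answer", namespace)

# The frame that runs the expression reports our dictionary as f_builtins.
frame_builtins = eval("sys._getframe().f_builtins", namespace)

# The normal builtins are not visible to the evaluated code.
try:
    eval("len", namespace)
    len_visible = True
except NameError:
    len_visible = False

print(value)
print(frame_builtins is custom_builtins)
print(len_visible)
```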
Under extremely odd situations (I think you'd need to write a CPython
module in C), you can create a frame with a globals dictionary that does
not have a __builtins__ member. If this happens, CPython makes up a
very small builtins namespace for the new frame; currently it contains
only None, but this is probably considered implementation dependent.