The theoretical legality of shadowing builtins in Python
Here is a variant of an example I wrote a few years ago:
    eval = eval

    class A(object):
        file = file
This creates a module-level eval name binding that is the same as the builtin eval, and a class variable A.file that is the same as the builtin file. All of this works in CPython because of how CPython's bytecode handles names and scopes.
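The example above is Python 2 code (Python 3 has no file builtin). As a rough Python 3 sketch of the same idea, the following substitutes len for file and runs the module-level assignment through exec() in a fresh namespace; both the substitution and the module_ns name are my own choices for illustration, not part of the original example. It only shows what CPython does, not what the language reference requires.

```python
import builtins

# Module-level rebinding, run in a fresh namespace (module_ns is a
# hypothetical name) so it does not disturb the surrounding module.
module_ns = {}
exec("eval = eval", module_ns)

class A:
    # Class-level rebinding; `len` stands in for Python 2's `file`.
    len = len

# Both new bindings refer to the very same objects as the builtins.
print(module_ns["eval"] is builtins.eval)
print(A.len is builtins.len)
```

In CPython both lines print True: the right-hand side of each assignment ends up finding the builtin, and the left-hand side stores it under the same name in the module or class namespace.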
Which leads to the theoretical question: is this actually 'portable' in
the sense that this behavior is required by the Python specification?
(As a practical matter I think that any alternate Python interpreter will include this behavior, making it portable in practice; I believe that a certain amount of code out there in the world relies on it.)
I will cut to the chase: the real result of this exercise is that the Python language reference is essentially an informal document, not a standard. You can't use it for language lawyering, not only for the pragmatic reasons mentioned above but also because it's not an attempt at a complete formal specification of Python for implementors; it is more an attempt at some sort of semantic description for Python programmers (combined with a grammar). The rest of this entry is an illustration of that.
The place to look for the answer to our question is the Naming and Binding section of the Language Reference (there is also a Python 3 version). Having peered into the Python 2 version, as far as I can tell this behavior is ambiguous for module-level code but apparently not theoretically correct in class-level code. For class-level code, the two crucial sentences are:
Each assignment or import statement occurs within a block defined by a class or function definition or at the module level (the top-level code block).
If a name binding operation occurs anywhere within a code block, all uses of the name within the block are treated as references to the current block. [...]
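What the second sentence describes literally is what CPython does inside a function body. A minimal sketch of that case (the function name rebind_in_function is made up for illustration, and this is an observation about CPython, not a claim about what the reference mandates):

```python
def rebind_in_function():
    # The assignment makes `len` a local name for the whole function body,
    # so the right-hand side refers to the still-unbound local and never
    # reaches the builtin.
    len = len
    return len

try:
    rebind_in_function()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)
```

Here every use of the name within the block really is treated as a reference to the current block, so the call fails with UnboundLocalError.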
The second sentence is only correct in CPython for function code blocks; it's false for other blocks, as we can see in the example with class A. The case of module-level code is more ambiguous, because the same section contains a description of the global statement which says in part:
[...] Names are resolved in the top-level namespace by searching the global namespace, i.e. the namespace of the module containing the code block, and the builtins namespace, [...]
Although this is in a paragraph about global, it's sensible to read it as a general description of how names are resolved in the top-level (module) namespace. One reading of this combined with the 'name binding' sentence allows for module-level rebinding; in 'eval = eval', the right-hand eval may be a reference to the name in the module-level block scope, but the lookup rules for such references allow you to find the builtin eval. Another reading is that the two sentences contradict each other.
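The first reading matches what CPython does in practice. As a hedged sketch, the following runs module-level code in a fresh namespace with exec() (the variable names source, module_ns, had_global_before and has_global_after are all invented for this illustration) and records whether a module-level eval exists before and after the assignment:

```python
import builtins

source = """
had_global_before = 'eval' in globals()
eval = eval
has_global_after = 'eval' in globals()
"""

# Run the code as module-level code in its own namespace.
module_ns = {}
exec(source, module_ns)

# No module-level binding existed when the right-hand side was evaluated,
# so the lookup must have fallen through to the builtins namespace.
print(module_ns["had_global_before"])
print(module_ns["has_global_after"])
print(module_ns["eval"] is builtins.eval)
```

In CPython this prints False, True, True. That is consistent with the lookup falling through from the global namespace to the builtins namespace, but it says nothing about which reading of the reference is the intended one.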
By the way, this shows one of the problems with standards in practice: you have to read most actual standards for complex things extremely closely and carefully in order to get the right answers. Doing this is unnatural and hard, even more so than reading Unix manpages; mistakes are easy to make and the consequences potentially significant (and hard to test).
PS: given this view of the language reference, you might wonder why I want it to include a description of the attribute lookup order. My answer is that such a description is useful for a Python programmer, if only to put all of the pieces in one place. By contrast, painstaking and nitpicking descriptions of arcane bits of namespace oddness are not so useful.