2007-06-27
Why you can't use object.__new__ on everything
Here's an interesting error message, somewhat slimmed down:
>>> object.__new__(dict)
TypeError: object.__new__(dict) is not safe, use dict.__new__()
At first blush this seems a peculiar artificial limitation: object is
the root of the Python type hierarchy and thus its __new__ is the
root version of creating new objects, so why doesn't it work?
Like the last peculiarity, this is ultimately due
to how CPython is implemented. We can get an idea of the real problem
by knowing the exact limitation, which is that the C-level type whose
__new__ is being called must be the nearest C-level type in the
inheritance tree of the class you are asking it to create.
What is going on is that each C-level type creates its own objects
itself, because the core of an object is an opaque type-specific blob of
data that can only be set up by code that knows what's supposed to be
in it. So the C code for object.__new__ can only create things that are
object instances at their heart, which dict instances are not.
(The actual code could make things work by calling the C code for dict.__new__ instead, but that's not what you asked Python to do so it declines to be clever.)
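To see the limitation in action, here is a small sketch for Python 3. The class names Plain and MyDict are made up for illustration, and the exact wording of the error message may differ between CPython versions.

```python
class Plain(object):
    """A pure-Python class; its only C-level ancestor is object."""

class MyDict(dict):
    """A Python subclass of dict; its nearest C-level ancestor is dict."""

# object.__new__ can build this one, since it is an object at heart.
plain = object.__new__(Plain)

def try_object_new(cls):
    """Return the TypeError message from object.__new__(cls), or None."""
    try:
        object.__new__(cls)
    except TypeError as exc:
        return str(exc)
    return None

# Both dict itself and a Python subclass of dict are refused.
dict_error = try_object_new(dict)
subclass_error = try_object_new(MyDict)
print(dict_error)
print(subclass_error)

# Asking the right C-level type works, even for the subclass.
d = dict.__new__(MyDict)
d["key"] = "value"
print(type(d).__name__, d)
```

Note that the subclass is refused too: what matters is the C-level type underneath the class, not the class you name.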
The same issue is behind the 'multiple bases have instance lay-out conflict' error you get if you try to make a Python class that inherits from two C-level types. Because an object can only have one blob of that type-specific data, you can only inherit from a single C-level type.
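A quick way to provoke the conflict, again as a sketch for Python 3 (Mixin, DescribedDict and DictAndList are illustrative names, not anything from CPython):

```python
class Mixin(object):
    """Pure-Python class: brings no C-level data blob of its own."""
    def describe(self):
        return "%s with %d items" % (type(self).__name__, len(self))

# One C-level base plus a pure-Python base is fine.
class DescribedDict(dict, Mixin):
    pass

# Two unrelated C-level bases each want their own data layout.
try:
    class DictAndList(dict, list):
        pass
    conflict = None
except TypeError as exc:
    conflict = str(exc)

print(DescribedDict(a=1).describe())
print(conflict)
```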
2007-06-20
More on slot wrapper objects
Following up on the first installment:
Slot wrapper objects do not directly call the C functions that they
wrap up. Instead they check that they are being called with a self
argument that is an instance of the type that they are wrapping,
create a 'method-wrapper' object with the self memorized, and
call that.
We're not done yet, because there is another level of indirection: instead of directly calling the C function, the method-wrapper object calls a generic handler function for the particular sort of slot function that you are calling, and that generic handler calls the actual function. The generic handler handles all of the Python-level bookkeeping, because the slot functions themselves are only expecting to be called inside the guts of the interpreter.
(If you want to poke at a method-wrapper object, many of the attributes of a slot wrapper object are method-wrapper objects.)
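If you want to see the two kinds of objects from the Python side, something like the following should work in Python 3. It only shows what is observable (the type names and the self check); the internal call path may well differ between CPython versions.

```python
slot_wrapper = int.__add__    # looked up on the type: a slot wrapper
bound = (1).__add__           # looked up on an instance: a method-wrapper

print(repr(slot_wrapper))
print(type(slot_wrapper).__name__)
print(type(bound).__name__)

# The slot wrapper insists on a self of the type it wraps.
try:
    int.__add__("not an int", 1)
    self_check = None
except TypeError as exc:
    self_check = str(exc)
print(self_check)

# Both routes end up at the same C function.
total_via_type = int.__add__(1, 2)
total_via_instance = bound(2)

# An attribute of a slot wrapper that is itself a method-wrapper.
call_attr = int.__add__.__call__
print(type(call_attr).__name__)
```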
The C function that corresponds to __new__ must be handled specially
because __new__ is not called with an instance as the self
argument. This C function is just turned into a generic wrapped method
(which is a complicated subject in its own right) that gets the type
itself as self when it gets called.
(Just to confuse everyone, a 'built-in method' object is not the same as a 'method-wrapper' object, although they do more or less the same thing.)
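The difference for __new__ is visible from Python. This is a sketch for Python 3; the type names printed are what current CPython reports and are not guaranteed by the language.

```python
new_entry = vars(dict)["__new__"]
init_entry = vars(dict)["__init__"]

# __init__ is an ordinary slot wrapper, __new__ is a 'built-in method'.
print(type(init_entry).__name__)
print(type(new_entry).__name__)

# The 'self' that __new__ carries around is the type itself.
new_self = dict.__new__.__self__
print(new_self)

# For comparison, a method-wrapper is a different type again.
len_wrapper = {}.__len__
print(type(len_wrapper).__name__)
```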
People who want to see how this particular sausage is made can look
into Objects/typeobject.c and Objects/descrobject.c in the CPython
source. Start at the add_operators function.
2007-06-18
What are slot wrapper objects?
I got curious about what 'slot wrapper' objects are after I encountered them in FindingMethodProviderII, so I went digging in the CPython source. It turns out that the answer requires understanding the internal structure of CPython.
Python types written in C start with a big structure that defines
various details about them, including a large number of pointers to
functions that implement various basic operations such as deallocating
objects of this type, getting and setting attributes, comparison, and
so on. Many but not all of these correspond to Python-level canonical
methods like __repr__; however, when CPython actually calls these
core functions it doesn't go through a Python-level method lookup
and instead makes a direct call to the C function pointed to by, e.g.,
type->tp_repr.
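One visible consequence of going through the type's function pointer instead of an ordinary attribute lookup can be sketched as follows (Python 3; Thing is a made-up class, and this illustrates the effect without proving anything about the C code):

```python
class Thing(object):
    def __repr__(self):
        return "from the class"

t = Thing()
# Shadow __repr__ on the instance only.
t.__repr__ = lambda: "from the instance"

# repr() goes through the type's slot and never sees the instance attribute.
via_builtin = repr(t)
# An explicit attribute lookup does see it.
via_attribute = t.__repr__()

print(via_builtin)
print(via_attribute)
```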
As a service, the CPython core that registers a new type automatically
creates a __dict__ object and populates it with names for all of the
function slot pointers that correspond to the canonical methods, so that
you can actually do things like get and invoke the __init__ function
of a built-in type. These names can't point directly to the C functions;
instead they point to (you guessed it) special 'slot wrapper' objects,
which encapsulate all the information necessary to actually call the C
functions from Python.
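You can look at these automatically created entries yourself. A sketch for Python 3, using dict as the example type (the exact set of names depends on which slots a given type fills in):

```python
# Collect the names in dict's __dict__ that are slot wrappers.
wrappers = sorted(
    name for name, value in vars(dict).items()
    if type(value).__name__ == "wrapper_descriptor"
)
print(wrappers)

# Get and invoke the __init__ of a built-in type by hand.
init = vars(dict)["__init__"]
d = dict.__new__(dict)   # an empty, uninitialized dict
init(d, a=1)
print(d)
```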
The one exception to this is __new__, which is done differently for
reasons that I'll cover in a future entry.