How CPython implements __slots__ (part 2): access

April 28, 2011

In the first part I covered how the __slots__ instance attributes were stored (in an ad-hoc array of pointers that is glued on the end of instance objects), but that's only half of the puzzle. The other half is letting people access them, and that's what this entry is about.

As it happens, it's reasonably common in CPython for C-level types to want to give people Python-level access to fields in C structures, common enough that the CPython infrastructure for new-style types includes an entire set of machinery to support it. To use this machinery you specify the names, (C-level) types, offsets, and optionally some flags like 'read only' for all of the fields that you want people to see, and the type-registration process takes care of the rest for you. One example of where this is used is the __dictoffset__ attribute of types, which directly reads the tp_dictoffset field in the C-level type structure.

(I personally find this to be a very cool feature because it avoids a whole lot of potential code duplication in C-level stuff and makes it very easy to expose interesting fields.)
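You can poke at the __dictoffset__ example from Python itself. This is only a sketch: the Plain class is made up for the illustration, the exact exception raised on a write may differ between CPython versions (so it catches both likely ones), and actual offset values depend on your version and platform.

```python
import types

# The descriptor lives in type's own dictionary, not in each class.
desc = type.__dict__["__dictoffset__"]
print(type(desc).__name__)
print(isinstance(desc, types.MemberDescriptorType))

# Reading the attribute normally is the same as calling the descriptor by hand.
by_hand = desc.__get__(int, type)
print(by_hand == int.__dictoffset__)

# Plain is a hypothetical class, used only to try a write.
class Plain(object):
    pass

# The field is registered with the 'read only' flag, so writes are refused.
try:
    Plain.__dictoffset__ = 0
    readonly = False
except (AttributeError, TypeError):
    readonly = True
print(readonly)
```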

The Python-level manifestation of this machinery is member_descriptor objects. Like many things in CPython 2.x, they're complicated by some backwards compatibility issues with the older C API for CPython extension modules, so I am going to pass over the fine details and just say that attempts to read or write these fields wind up in a big switch statement that checks the type of the field and does the right magic to read or change it. There are all sorts of field types supported by this machinery; one of them is T_OBJECT_EX, where the field is a pointer to a Python object and an exception is raised if the pointer is NULL.

You know, that sounds a lot like how __slots__ attributes behave. That is not a coincidence.

__slots__ attributes are accessed through exactly this machinery. During the process of registering the type that is the new slot-using class, the core type machinery dynamically creates appropriate name and offset information for each attribute slot and registers them exactly as if they were conventional fields in the basic C structure of the type. This name and offset information is glued on the end of the basic C-level type structure, which is why type has a non-zero __itemsize__; each 'item' is a field member registration structure.
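All of this is visible from Python. The following is a small sketch with a made-up Point class; it assumes nothing beyond what's described above, namely that slot attributes show up as member_descriptor objects and that an unset (NULL) slot raises an exception on read.

```python
# Point is a hypothetical class, used only for illustration.
class Point(object):
    __slots__ = ("x", "y")

# Each slot is a member_descriptor in the class dictionary, the same kind
# of object that exposes C-level fields like type.__dictoffset__.
slot_x = Point.__dict__["x"]
print(type(slot_x).__name__)
same_kind = type(slot_x) is type(type.__dict__["__dictoffset__"])
print(same_kind)

p = Point()

# The slot starts out as a NULL pointer, so reading it raises an exception.
try:
    p.x
    unset_raises = False
except AttributeError:
    unset_raises = True
print(unset_raises)

# Ordinary attribute access goes through the descriptor.
p.x = 10
value = slot_x.__get__(p, Point)
print(value == p.x)

# Deleting the attribute puts the slot back to NULL.
del p.x
gone = not hasattr(p, "x")
print(gone)

# The per-slot registration structures are the 'items' of type.
print(type.__itemsize__ > 0)
```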

(As a trivia note, I believe that this makes type the only standard type where using __slots__ in a subclass has absolutely no effect. Since it has a non-zero itemsize you can't use a non-empty __slots__, and since it has a non-zero dictoffset an empty __slots__ will not stop your subclass from having a __dict__.)
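Both halves of that trivia note can be checked with a quick sketch. The class names here are invented for the example, and the wording of the error message is whatever your CPython version produces.

```python
# A non-empty __slots__ on a subclass of type is refused outright,
# because type has a non-zero itemsize.
try:
    class SlottedMeta(type):
        __slots__ = ("extra",)
    nonempty_allowed = True
except TypeError as exc:
    nonempty_allowed = False
    print("non-empty __slots__ refused: %s" % exc)

# An empty __slots__ is accepted but changes nothing, because type
# already has a non-zero dictoffset that the subclass inherits.
class EmptySlotsMeta(type):
    __slots__ = ()

Klass = EmptySlotsMeta("Klass", (object,), {})
Klass.anything = 42  # still lands in the class's __dict__

print(EmptySlotsMeta.__dictoffset__ != 0)
print("anything" in Klass.__dict__)
```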

PS: all of this is for Python 2. I haven't looked at the CPython source for Python 3.x.

Sidebar: where to find this in the CPython source

type itself is in Objects/typeobject.c. Member descriptors are one of the classes of descriptors in Objects/descrobject.c. Actually reading and writing structure members is done by code in Python/structmember.c.
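As a rough picture of what the structure member code is doing, here is a toy model in Python using ctypes. To be clear, this is not CPython's code: CPoint, ToyMember, and the type code table are all invented for the illustration, and a dictionary lookup stands in for the C switch statement. The idea it sketches is just 'name, type, offset, and flags are enough to read or write a field'.

```python
import ctypes

# Stand-in for a C structure whose fields we want to expose.
class CPoint(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_double)]

# Hypothetical type codes, loosely modelled on the idea of T_INT and T_DOUBLE.
T_INT, T_DOUBLE = "T_INT", "T_DOUBLE"
FIELD_TYPES = {T_INT: ctypes.c_int, T_DOUBLE: ctypes.c_double}

class ToyMember(object):
    """One field registration: a name, a type code, an offset, and a flag."""

    def __init__(self, name, typecode, offset, readonly=False):
        self.name = name
        self.typecode = typecode
        self.offset = offset
        self.readonly = readonly

    def _field(self, struct):
        # The 'switch': pick how to interpret the bytes based on the type
        # code, then view the structure's memory at the given offset.
        ctype = FIELD_TYPES[self.typecode]
        return ctype.from_buffer(struct, self.offset)

    def get(self, struct):
        return self._field(struct).value

    def set(self, struct, value):
        if self.readonly:
            raise AttributeError("readonly attribute: %s" % self.name)
        self._field(struct).value = value

# Register the fields by offset, much as a C type would list its members.
members = {
    "x": ToyMember("x", T_INT, CPoint.x.offset),
    "y": ToyMember("y", T_DOUBLE, CPoint.y.offset, readonly=True),
}

pt = CPoint(3, 2.5)
print(members["x"].get(pt), members["y"].get(pt))
members["x"].set(pt, 7)  # writes straight into the structure's memory
print(pt.x)
```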

The member list of a type is pointed to by its tp_members field; this is the basic raw form, not member descriptors. Member descriptors are not created immediately on type registration; instead this, and a lot of other type setup, is deferred until PyType_Ready() is called, which is done any time the code needs to look at various type information like, say, the attribute dictionary.

Written on 28 April 2011.