Nailing down new-style classes and types in Python

April 21, 2011

Since I keep confusing myself, it's time to write this stuff down once and for all to make sure I have it straight (even if some or all of it is in the official documentation).

One writes Python code to define classes; it's right there in the language syntax, where you write 'class A(object): ...'. Defining a class creates a type object for that class, which is an instance of type; this C-level object holds necessary information about the class and how it's actually implemented. This type object is what is bound to the class name; if you define a class A, 'type(A)' will then report <type 'type'>.
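As a quick sanity check, here is a minimal sketch of the above. The class A is the one from the text. The checks use 'is' instead of printed reprs so that the results are the same under Python 2 and Python 3 (Python 3 prints <class 'type'> where Python 2 prints <type 'type'>).

```python
class A(object):
    pass

obj = A()

# The name A is bound to a type object, which is an instance of type.
print(type(A) is type)        # True
print(isinstance(A, type))    # True

# An instance's type is its class.
print(type(obj) is A)         # True
```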

Classes have a class inheritance hierarchy, which is ultimately rooted at object (including for C-level classes). However, strictly speaking there is no type hierarchy as far as I know; all types are simply instances of type (including type itself). Further, the type non-hierarchy is of course unrelated to the class hierarchy. This means that isinstance(A, type) is True but issubclass(A, type) is both False and the wrong question (unless you really do have a subclass of type somewhere in your code).
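To make the isinstance() versus issubclass() distinction concrete, here is a small sketch. Meta is a hypothetical subclass of type that I have made up purely for illustration; it is the one case where asking issubclass(..., type) makes sense.

```python
class A(object):
    pass

class Meta(type):
    # A hypothetical subclass of type, ie a metaclass.
    pass

# A is an instance of type, but it does not inherit from type.
print(isinstance(A, type))       # True
print(issubclass(A, type))       # False

# The class hierarchy is rooted at object, even for type itself.
print(issubclass(A, object))     # True
print(issubclass(type, object))  # True

# type is an instance of itself, and so is object.
print(isinstance(type, type))    # True
print(isinstance(object, type))  # True

# Only a real subclass of type answers yes to the 'wrong question'.
print(issubclass(Meta, type))    # True
```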

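(Among other things I believe that this means that 'type(type(obj))' is 'type' for almost any Python object, since all objects have a type and all types are instances of type. The exception is an object whose class was created by a metaclass, ie a subclass of type; there 'type(type(obj))' is the metaclass, although the metaclass is itself still an instance of type.)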
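This is easy to test on a handful of objects. The sketch below also covers the one case I know of where the answer is not literally 'type': a class created through a metaclass. Meta and B are hypothetical names, and B is built by calling Meta directly so that the code runs on both Python 2 and Python 3.

```python
samples = [1, 'text', None, [], len, object(), type]

# For ordinary objects, the type of the type is always type.
print(all(type(type(x)) is type for x in samples))   # True

class Meta(type):
    # A hypothetical metaclass (a subclass of type).
    pass

# Create class B with Meta as its metaclass.
B = Meta('B', (object,), {})
b = B()

print(type(b) is B)               # True
print(type(type(b)) is Meta)      # True
print(type(type(b)) is type)      # False
# The metaclass is itself still an instance of type.
print(isinstance(Meta, type))     # True
```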

The Python documentation sometimes talks about a 'type hierarchy'. What it means is either 'the conceptual hierarchy of various built-in types', such as the various forms of numbers, mutable sequences, and so on, or 'the class inheritance hierarchy of built-in types', since a few are subclasses of others and everything is a subclass of object.

(Some languages really do have a hierarchy of all types, with real (abstract) types for things like 'all numeric types' or 'all mutable sequence types', but Python does not. You can see this by inspecting the __mro__ attribute on built-in types to see the classes involved in their method resolution order; the MRO of a type like int is just itself and object. Only a few built-in types are subclasses of other types.)
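Here is what that inspection looks like in practice, as a small sketch. I have picked bool as the example of a built-in type that is a subclass of another built-in type.

```python
# Most built-in types inherit directly from object.
print(int.__mro__ == (int, object))         # True
print(float.__mro__ == (float, object))     # True
print(list.__mro__ == (list, object))       # True

# bool is one of the few that subclasses another built-in type.
print(bool.__mro__ == (bool, int, object))  # True

# int and float have no common 'numeric' base class; they only share object.
shared = set(int.__mro__) & set(float.__mro__)
print(shared == set([object]))              # True
```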

PS: yes, almost all of this is in the Python documentation or is implied by it. Writing it down anyways helps me get it straight in my own head.

PPS: I believe that technically it would be possible for a sufficiently perverse extension module to create a valid new-style C-level class that was not a subclass of object. Don't do that; if you did, I expect that things would blow up sooner or later.

Sidebar: the real difference between classes and types

If you use repr() on user-defined classes and on built-in types (eg 'repr(A)' and 'repr(str)'), you'll notice that it reports them differently. This is a bit odd once you think about it, since they are both instances of type and so are using the same repr() function, yet one reports that it is a 'class' and the other reports that it is a 'type'.

In CPython, the difference between the two is whether the C-level type instance structure is flagged as having been allocated on the heap or not (this is the Py_TPFLAGS_HEAPTYPE flag). A heap-allocated type instance is a class as far as type.__repr__() is concerned; a statically allocated one is a type. All classes defined in Python are allocated on the heap, like all other Python-created objects, and so report as classes. Most 'types' defined in C-level extension modules are statically defined and so get called types, but I believe that with sufficient work you could create a C-level type that had a heap-allocated type instance and was reported as a class.

(It's easy enough to keep it from being garbage collected out from underneath your extension module; you just artificially increase its reference count.)
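You can look at the heap flag from Python code, at least in CPython, because type objects expose their flags as the __flags__ attribute. The sketch below is CPython-specific. It assumes that the heap type flag (Py_TPFLAGS_HEAPTYPE) is bit 9, which is the value I take from CPython's headers, and is_heap_type() is a hypothetical helper name of my own. Note that Python 3's repr() calls everything a 'class', so there the flag is the only visible sign of the difference.

```python
# Assumed value of Py_TPFLAGS_HEAPTYPE from CPython's object.h.
HEAPTYPE = 1 << 9

def is_heap_type(t):
    """Report whether a type object is flagged as heap-allocated (CPython only)."""
    return bool(t.__flags__ & HEAPTYPE)

class A(object):
    pass

# Classes defined in Python code are heap-allocated.
print(is_heap_type(A))      # True

# Built-in types such as str and int are statically allocated in CPython.
print(is_heap_type(str))    # False
print(is_heap_type(int))    # False
```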

Written on 21 April 2011.
