Reference counting and multiple inheritance in (C)Python
July 31, 2011
I recently stumbled over this comment on a LWN article about object oriented design patterns in the Linux kernel. Quoting a bit from the original article, Auders asked:
There are two answers. (I'm going to try to write this from the perspective of a C programmer.)
What causes this inheritance problem in C is because of how most C code handles data inheritance. As covered in the LWN series, typically you do inheritance by direct structure embedding; to inherit from struct A, you put struct A in your own struct. This means that your struct's lifetime is tied to the lifetime of the embedded struct A; when A's reference count goes to zero, you will be told to delete your entire structure. If you embed both struct A and struct B, each of them separately reference counted, then A can have its reference count go to zero before B and you will be told to delete your entire structure even though the embedded B is still alive and cannot be deleted.
Python does not do data inheritance by directly embedding structures. Instead each object has a single storage for all fields, regardless of where they come from, and all classes that you inherit from write their fields into it. This is part of what enables Python objects to have a single reference count which is manipulated by everything that takes or releases a reference to the object, regardless of which class's code and data it is working through. This works at the C level because all CPython objects start with a common structure that can be manipulated generically without needing to know what sort of object you're dealing with.
(You could do a single object-wide reference count in C if you wanted to, but it requires extra overhead and only makes sense if your struct A and struct B are purely virtual and must always be subclassed. Instead of directly embedding a reference count in each structure, you'd embed a pointer to the overall object's reference count (and set this in each embedded struct when creating your overall object). You also need to think about whether it does damage to have a dangling struct A that is not referenced from anywhere, because this is what happens when you drop the last reference to A before the overall object can be deleted.)
The other answer is that CPython also has this limit on multiple
inheritance but it's more carefully disguised. Because CPython cannot
do (C) structure embedding, it simply refuses to let a Python class
inherit from two different C-level classes, or in fact from two classes
(Python or C-level) that have incompatible object layouts at the C
level. Typical Python programs never notice because almost all (Python)
classes only inherit from a single C-level class, that being
(C-level classes effectively do not inherit from each other.)
Attempting to break this constraint gets you a series of odd error messages depending on what exactly you're trying to do. I've written about various specific manifestations of this before, such as here and here.
(Both answers are true, but the first answer is incomplete.)
Written on 31 July 2011.
* * *