An alternate pattern for polymorphism in C

January 3, 2013

As I mentioned in yesterday's entry, CPython (the C-based main implementation of Python) uses an interesting variant on struct-at-start based polymorphism. To put it simply, it uses #defines instead of a struct. This probably sounds odd, so let me show you the slightly simplified CPython 2.7.x code:

#define PyObject_HEAD           \
   Py_ssize_t ob_refcnt;        \
   struct _typeobject *ob_type;

#define PyObject_VAR_HEAD       \
   PyObject_HEAD                \
   Py_ssize_t ob_size;

typedef struct _object {
    PyObject_HEAD
} PyObject;

typedef struct {
    PyObject_VAR_HEAD
} PyVarObject;

/* A typical actual Python object */
typedef struct {
    PyObject_VAR_HEAD
    int ob_exports;
    Py_ssize_t ob_alloc;
    char *ob_bytes;
} PyByteArrayObject;

(The generic definitions are taken from Include/object.h in the CPython source; PyByteArrayObject comes from Include/bytearrayobject.h.)

The #defines are used to construct generic 'object' structs (the typedef'd PyObject and PyVarObject) for use in appropriate code, but actual Python objects use the #defines directly instead of embedding the object structs. Things are cast back and forth as necessary; in practice (and I believe perhaps in ANSI C theory) the memory layout of the start of a PyByteArrayObject is guaranteed to be the same as that of a PyVarObject.
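To make the layout claim concrete, here is a minimal sketch that checks the shared fields with offsetof and then reads ob_size through a generic pointer. The Demo* and DEMO_* names are hypothetical stand-ins for the CPython ones, and ptrdiff_t stands in for Py_ssize_t. One caveat worth stating: casting between distinct struct types like this can run afoul of C's strict-aliasing rules, so treat it as an illustration of the layout rather than as portable ISO C.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, ptrdiff_t, NULL */

/* Hypothetical stand-ins for the CPython names; ptrdiff_t replaces Py_ssize_t. */
struct demo_type;     /* opaque type descriptor, never defined here */

#define DEMO_HEAD                   \
    ptrdiff_t ob_refcnt;            \
    struct demo_type *ob_type;

#define DEMO_VAR_HEAD               \
    DEMO_HEAD                       \
    ptrdiff_t ob_size;

typedef struct { DEMO_HEAD } DemoObject;
typedef struct { DEMO_VAR_HEAD } DemoVarObject;

typedef struct {
    DEMO_VAR_HEAD
    int ob_exports;
    ptrdiff_t ob_alloc;
    char *ob_bytes;
} DemoByteArray;

/* Returns 1 if every shared header field sits at the same offset in each struct. */
static int demo_layouts_match(void)
{
    return offsetof(DemoByteArray, ob_refcnt) == offsetof(DemoObject, ob_refcnt)
        && offsetof(DemoByteArray, ob_type)   == offsetof(DemoObject, ob_type)
        && offsetof(DemoByteArray, ob_refcnt) == offsetof(DemoVarObject, ob_refcnt)
        && offsetof(DemoByteArray, ob_type)   == offsetof(DemoVarObject, ob_type)
        && offsetof(DemoByteArray, ob_size)   == offsetof(DemoVarObject, ob_size);
}

/* Generic code sees only a DemoVarObject pointer, whatever the real type is. */
static ptrdiff_t demo_generic_size(void *obj)
{
    assert(obj != NULL);
    return ((DemoVarObject *)obj)->ob_size;
}
```

Calling demo_generic_size() on a DemoByteArray works because the macro pastes the same field declarations, in the same order, at the start of every struct.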

There are a number of advantages to this #define-based approach. The one that's visible here is that references to these polymorphic fields in actual structs do not require levels and levels of indirection through names that exist merely as containers. If p is a pointer to a PyByteArrayObject, you can directly refer to p->ob_refcnt instead of having to refer to p->b.a.ob_refcnt, where b and a are arbitrary names assigned to the PyVarObject and PyObject structs embedded in the PyByteArrayObject. This goes well with CPP macros to manipulate the various fields (actual functions, even inline ones, would require some actual casting). In particular it means that a CPP macro to manipulate ob_refcnt doesn't have to care whether you're dealing with a PyObject or a PyVarObject; with explicit structs, the former case would need p->a.ob_refcnt while the latter would need p->b.a.ob_refcnt.
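The difference in field paths can be sketched side by side. Everything here is hypothetical illustration (the Flat*, Emb* and FLAT_INCREF names are mine, not CPython's, and the header is cut down to a single field): the flat version gets by with one uncast macro for every type, while the embedded version needs a different member path at each nesting depth.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t, NULL */

/* Flat layout: the header fields are pasted in by (hypothetical) macros. */
#define FLAT_HEAD       ptrdiff_t ob_refcnt;
#define FLAT_VAR_HEAD   FLAT_HEAD ptrdiff_t ob_size;

typedef struct { FLAT_HEAD } FlatObject;
typedef struct { FLAT_VAR_HEAD char *ob_bytes; } FlatBytes;

/* One uncast macro serves every flat type: the path is always p->ob_refcnt. */
#define FLAT_INCREF(p)  ((p)->ob_refcnt++)

/* Embedded layout: each level wraps the previous one in a named member. */
typedef struct { ptrdiff_t ob_refcnt; } EmbObject;
typedef struct { EmbObject a; ptrdiff_t ob_size; } EmbVarObject;
typedef struct { EmbVarObject b; char *ob_bytes; } EmbBytes;

/* Bumps refcounts on two different flat types with the same macro. */
static ptrdiff_t flat_demo(void)
{
    FlatObject o = { 0 };
    FlatBytes  s = { 0, 0, NULL };

    FLAT_INCREF(&o);
    FLAT_INCREF(&s);
    FLAT_INCREF(&s);
    assert(o.ob_refcnt == 1 && s.ob_refcnt == 2);
    return o.ob_refcnt + s.ob_refcnt;
}

/* The same field needs a different path at each level of embedding. */
static ptrdiff_t emb_demo(void)
{
    EmbObject    o = { 0 };
    EmbVarObject v = { { 0 }, 0 };
    EmbBytes     s = { { { 0 }, 0 }, NULL };

    o.ob_refcnt++;          /* depth 0 */
    v.a.ob_refcnt++;        /* depth 1 */
    s.b.a.ob_refcnt++;      /* depth 2 */
    return o.ob_refcnt + v.a.ob_refcnt + s.b.a.ob_refcnt;
}
```

A macro for the embedded layout could avoid the per-type paths by casting its argument to EmbObject *, but then it is doing exactly the casting that the flat layout lets you skip.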

(Some C compilers allow anonymous structs if the members are unique and this is now standardized in C11.)
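As a rough sketch of what the anonymous-struct version might look like under C11, the nested headers below have no tag and no member name, so their fields are reached directly. The AnonBytes name and the void * type pointer are assumptions for illustration. Note that standard C11 only treats an untagged struct written inline as anonymous; using a previously declared struct type as an unnamed member is a compiler extension (for example GCC's -fms-extensions), so sharing the header between types would still take a macro or repeated text.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t, NULL */

/* Hypothetical C11 rendering: nested anonymous structs instead of macros. */
typedef struct {
    struct {                      /* anonymous: no tag and no member name */
        struct {                  /* innermost level, playing the PyObject role */
            ptrdiff_t ob_refcnt;
            void *ob_type;        /* stand-in for the type pointer */
        };
        ptrdiff_t ob_size;        /* playing the PyVarObject role */
    };
    char *ob_bytes;
} AnonBytes;

/* Fields of the anonymous members are accessed as if declared directly. */
static ptrdiff_t anon_demo(void)
{
    AnonBytes s;

    s.ob_refcnt = 1;
    s.ob_type = NULL;
    s.ob_size = 4;
    s.ob_bytes = NULL;

    s.ob_refcnt++;                /* no s.b.a. prefix needed */
    assert(s.ob_refcnt == 2);
    return s.ob_refcnt + s.ob_size;
}
```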
