An alternate pattern for polymorphism in C
As I mentioned in yesterday's entry, CPython
(the C-based main implementation of Python) uses an interesting variant
on struct-at-start based polymorphism. To put it simply, it uses
#defines instead of a struct. This probably sounds odd, so let me
show you the slightly simplified CPython 2.7.x code:
#define PyObject_HEAD \
    Py_ssize_t ob_refcnt; \
    struct _typeobject *ob_type;

#define PyObject_VAR_HEAD \
    PyObject_HEAD \
    Py_ssize_t ob_size;

typedef struct _object {
    PyObject_HEAD
} PyObject;

typedef struct {
    PyObject_VAR_HEAD
} PyVarObject;

/* A typical actual Python object */
typedef struct {
    PyObject_VAR_HEAD
    int ob_exports;
    Py_ssize_t ob_alloc;
    char *ob_bytes;
} PyByteArrayObject;
(The macros and generic structs are taken from Include/object.h in the
CPython source; PyByteArrayObject is from Include/bytearrayobject.h.)
The #defines are used to construct generic 'object' structs (the
typedef'd PyObject and PyVarObject) for use in appropriate code, but
in actual Python objects the #defines are used directly instead of
the object structs being embedded in them. Things are cast back
and forth as necessary; in practice (and, I believe, perhaps in ANSI C
theory) the memory layout of the start of a PyByteArrayObject is
guaranteed to be the same as that of a PyVarObject.
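To make the layout claim concrete, here is a minimal sketch with hypothetical stand-ins for the CPython macros and types (OBJ_HEAD, VAR_HEAD, Obj, VarObj and ByteArrayObj are my names, long replaces Py_ssize_t, and the type pointer is simplified to a string). It checks the shared field offsets with offsetof() and reads a field through a cast pointer. This shows what happens in practice; it is not a claim about what the C standard's aliasing rules strictly permit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyObject_HEAD and PyObject_VAR_HEAD. */
#define OBJ_HEAD \
    long ob_refcnt; \
    const char *ob_type;

#define VAR_HEAD \
    OBJ_HEAD \
    long ob_size;

/* Generic 'object' structs built from the macros. */
typedef struct { OBJ_HEAD } Obj;
typedef struct { VAR_HEAD } VarObj;

/* A concrete object uses the macro directly instead of embedding VarObj. */
typedef struct {
    VAR_HEAD
    char *ob_bytes;
} ByteArrayObj;

/* Returns nonzero if the shared leading fields sit at the same offsets. */
int heads_match(void)
{
    return offsetof(ByteArrayObj, ob_refcnt) == offsetof(Obj, ob_refcnt)
        && offsetof(ByteArrayObj, ob_type) == offsetof(Obj, ob_type)
        && offsetof(ByteArrayObj, ob_size) == offsetof(VarObj, ob_size);
}

/* Generic code sees any variable-sized object through a VarObj pointer. */
long generic_size(void *p)
{
    VarObj *v = (VarObj *)p;
    assert(v->ob_refcnt > 0);   /* sanity check via the generic view */
    return v->ob_size;
}
```

Calling generic_size() with a pointer to a ByteArrayObj reads the same ob_size that the concrete struct stores, because both structs begin with the same sequence of members.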
There are a number of advantages of this #define-based approach. The
one that's visible here is that references to these polymorphic fields
in actual structs do not require levels and levels of indirection
through names that exist merely as containers. If p is a pointer
to a PyByteArrayObject, you can directly refer to p->ob_refcnt
instead of having to refer to p->b.a.ob_refcnt, where b and a
are arbitrary names assigned to the PyVarObject and PyObject structs
embedded in the PyByteArrayObject. This goes well with CPP macros to
manipulate the various fields (actual functions, even inline ones,
would require some actual casting). In particular it means that a CPP
macro to manipulate ob_refcnt doesn't have to care whether you're
dealing with a PyObject or a PyVarObject; with explicit structs, the
former case would need p->a.ob_refcnt while the latter would need
p->b.a.ob_refcnt.
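The difference can be sketched side by side. Everything below is illustrative: the macro and type names (OBJ_HEAD, REFCNT, INCREF, the E-prefixed embedded variants and the member names a and b) are assumptions of mine, not CPython's actual definitions. With the #define-based layout one macro serves every struct; with embedded structs the access path depends on nesting depth.

```c
#include <assert.h>
#include <stddef.h>

/* #define-based layout: the fields live directly in each struct. */
#define OBJ_HEAD long ob_refcnt; const char *ob_type;

typedef struct { OBJ_HEAD } Obj;
typedef struct { OBJ_HEAD long ob_size; } VarObj;

/* One macro works for any struct that starts with OBJ_HEAD. */
#define REFCNT(p) ((p)->ob_refcnt)
#define INCREF(p) ((void)++(p)->ob_refcnt)

/* Embedded-struct layout: the fields are reached through container names. */
typedef struct { long ob_refcnt; const char *ob_type; } EObj;
typedef struct { EObj a; long ob_size; } EVarObj;
typedef struct { EObj a; int flag; } EPlain;          /* embeds EObj as 'a' */
typedef struct { EVarObj b; char *ob_bytes; } EBytes; /* embeds EVarObj as 'b' */

/* Without casts, each nesting depth needs its own access path. */
#define E_REFCNT_PLAIN(p) ((p)->a.ob_refcnt)
#define E_REFCNT_BYTES(p) ((p)->b.a.ob_refcnt)

long flat_demo(void)
{
    Obj o = { 1, "object" };
    VarObj v = { 1, "var", 3 };
    INCREF(&o);              /* same macro for both struct types */
    INCREF(&v);
    INCREF(&v);
    return REFCNT(&o) * 10 + REFCNT(&v);
}

long embedded_demo(void)
{
    EPlain p = { { 1, "plain" }, 0 };
    EBytes e = { { { 1, "bytearray" }, 4 }, NULL };
    assert(e.b.ob_size == 4);
    return E_REFCNT_PLAIN(&p) * 10 + E_REFCNT_BYTES(&e);
}
```

In the embedded version, the usual way to get back to a single macro is to cast the pointer to the innermost struct type first, which is the casting the flat layout avoids.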
(Some C compilers allow anonymous structs as long as the member names
are unique, and this is now standardized in C11.)
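As a rough sketch of what the C11 feature looks like, assuming a C11-capable compiler: an untagged struct with no member name has its fields promoted into the enclosing struct. The names ANON_HEAD and AnonByteArray are hypothetical. Note that standard C11 only treats an untagged struct written inline as anonymous, so a macro is still used here to share the definition; embedding a previously typedef'd struct anonymously is, as far as I know, a compiler extension.

```c
#include <assert.h>
#include <stddef.h>

/* Expands to an untagged struct; used without a member name it is anonymous. */
#define ANON_HEAD struct { long ob_refcnt; const char *ob_type; }

typedef struct {
    ANON_HEAD;          /* anonymous struct: its members are accessed directly */
    long ob_size;
    char *ob_bytes;
} AnonByteArray;

long anon_demo(void)
{
    AnonByteArray b = { { 1, "bytearray" }, 5, NULL };
    b.ob_refcnt++;      /* no container name in the access path */
    assert(b.ob_size == 5);
    return b.ob_refcnt;
}
```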