Why I now believe that duck typed metaclasses are impossible in CPython
As I mentioned in my entry on fake versus real metaclasses, I've wound up a bit obsessed with the question
of whether it's possible to create a fully functional metaclass
that doesn't inherit from type. Call this a 'duck typed metaclass'
or if you want to be cute, a 'duck typed type' (DTT). As a result
of that earlier entry and some additional
exploration I now believe that it's impossible.
Let's go back to that earlier entry for a moment and look at the
fake metaclass M2:
    class M2(object):
        def __new__(self, name, bases, dct):
            print "M2", name
            return type(name, bases, dct)

    class C2(object):
        __metaclass__ = M2

    class C4(C2):
        pass
As we discovered, the problem is that C2 is not an instance of M2
and so (among other things) its subclass C4 will not invoke M2 when
it is being created. The real metaclass M1 avoided this problem by
instead using type.__new__() in its __new__ method. So why
not work around the problem by making M2 do so too, like this:
    class M2(object):
        def __new__(self, name, bases, dct):
            print "M2", name
            return type.__new__(self, name, bases, dct)
Here's why:
    TypeError: Error when calling the metaclass bases
        type.__new__(M2): M2 is not a subtype of type
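The refusal can be reproduced without a class statement at all. This is a minimal sketch, assuming Python 3 (the entry's own code is Python 2); M1 here is just a stand-in for a real metaclass, and the direct call to type.__new__ is for illustration only.

```python
# Python 3 sketch: type.__new__ only builds instances of type subclasses.
class M2(object):
    """Stand-in for the entry's M2; it does not inherit from type."""


class M1(type):
    """A real metaclass, for comparison."""


message = None
try:
    # Ask type to build an instance of M2 directly.
    type.__new__(M2, "C2", (object,), {})
except TypeError as exc:
    message = str(exc)

print(message)

# The same call is accepted when the first argument derives from type.
C1 = type.__new__(M1, "C1", (object,), {})
print(type(C1) is M1)
```

The check happens before any class is built, so there is nothing to patch up afterwards.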
I believe that this is an old friend in a new
guise. Instances of M2 would normally be based on the C-level
structure for object (since it is a subclass of object), which
is not compatible with the C-level type structure that instances
of type and its subclasses need to use. So type says 'you cannot
do this' and walks away.
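One rough way to see the mismatch, offered as an illustration rather than a proof, is to compare __basicsize__, which reports the size of the C-level instance structure for a class. This sketch assumes Python 3 and CPython; the exact numbers vary by version and platform, so none are shown.

```python
# Python 3 / CPython sketch: instances of type need a much larger C struct
# than instances of a plain object subclass.
class M2(object):
    """Stand-in for the entry's M2; a plain object subclass."""


# Sizes are in bytes and are version and platform dependent.
print("object:", object.__basicsize__)
print("M2:    ", M2.__basicsize__)
print("type:  ", type.__basicsize__)

type_is_larger = type.__basicsize__ > M2.__basicsize__
print(type_is_larger)
```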
Given that we need C2 to be an instance of M2 so that things work
right for subclasses of C2 and we can't use type, we can try brute
force and fakery:
    class M2(object):
        def __new__(self, name, bases, dct):
            print "M2", name
            # object.__new__ must be told which class to instantiate.
            r = super(M2, self).__new__(self)
            r.__dict__.update(dct)
            r.__bases__ = bases
            return r
This looks like it works in that C4 will now get created by M2.
However this is an illusion and I'll give you two examples of the
ensuing problems, each equally fatal.
Our first problem is creating instances of C2, ie the actual
objects that we will want to use in code. Instance creation is
fundamentally done by calling C2(), which means that M2 needs a
__call__ special method (so that C2, an instance of M2, becomes
callable). We'll try a version that delegates all of the work to type:
        def __call__(self, *args, **kwargs):
            print "M2 call", self, args, kwargs
            return type.__call__(self, *args, **kwargs)
Unsurprisingly but unfortunately this doesn't work:
    TypeError: descriptor '__call__' requires a 'type' object but received a 'M2'
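For anyone who wants to reproduce this on a current interpreter, here is a minimal sketch assuming Python 3, where the metaclass is given with the `metaclass=` keyword instead of `__metaclass__`. The M2 below mirrors the brute force version from this entry; call_error is just a name I picked to hold the message.

```python
# Python 3 sketch: delegating __call__ to type fails for a non-type "class".
class M2(object):
    def __new__(cls, name, bases, dct):
        # Build a plain M2 instance and decorate it to look like a class.
        r = super().__new__(cls)
        r.__dict__.update(dct)
        r.__bases__ = bases
        return r

    def __call__(self, *args, **kwargs):
        # self is C2, which is an instance of M2, not of type.
        return type.__call__(self, *args, **kwargs)


class C2(metaclass=M2):
    pass


call_error = None
try:
    C2()
except TypeError as exc:
    call_error = str(exc)

print(call_error)
```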
Okay, fine, we'll try more or less the same trick as before (which is now very dodgy, but ignore that for now):
        def __call__(self, *args, **kwargs):
            print "M2 call", self, args, kwargs
            r = super(M2, self).__new__(self)
            r.__init__(*args, **kwargs)
            return r
You can probably guess what's coming:
    TypeError: object.__new__(X): X is not a type object (M2)
We are now well and truly up the creek because classes are the only
thing in CPython that can have instances. Classes are instances of
type and as we've seen we can't create something that is both an
instance of M2 (so that M2 is a real metaclass instead of a fake
one) and an instance of type. Classes without instances are obviously
not actually functional.
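The same dead end shows up on a current interpreter. This is a sketch assuming Python 3; Real is a class added purely for contrast and is not part of the original example.

```python
# Python 3 sketch: object.__new__ insists that its argument is a real class.
class M2(object):
    def __new__(cls, name, bases, dct):
        r = super().__new__(cls)
        r.__dict__.update(dct)
        r.__bases__ = bases
        return r

    def __call__(self, *args, **kwargs):
        # This ends up calling object.__new__(self), where self is C2.
        r = super().__new__(self)
        r.__init__(*args, **kwargs)
        return r


class C2(metaclass=M2):
    pass


class Real(object):
    """An ordinary class, for comparison only."""


new_error = None
try:
    C2()
except TypeError as exc:
    new_error = str(exc)

print(new_error)

# With a genuine class the same primitive works fine.
instance = object.__new__(Real)
print(isinstance(instance, Real))
```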
The other problem is that despite how it appears C4 is not actually
a subclass of C2 because of course classes are the only thing
in CPython that can have subclasses. Specifically, attribute lookups
on even C4 itself will not look at attributes on C2:
    >>> C2.dog = 10
    >>> C4.dog
    AttributeError: 'M2' object has no attribute 'dog'
The __bases__ attribute that M2.__new__ glued on C4 (and C2)
is purely decorative. Again, looking attributes up through the chain of
bases (and the entire method resolution order)
is something that happens through code that is specific to instances of
type. I believe that much of it lives under the C-level function that
is type.__getattribute__, but some of it may be even more magically
intertwined into the guts of the CPython interpreter than that. And as
we've seen, we can't call type.__getattribute__ ourselves unless we
have something that is an instance of type.
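Here is the whole illusion in one place, as a sketch assuming Python 3 (so `metaclass=M2` rather than `__metaclass__`). The created list is my own addition to record which names M2 was asked to build.

```python
# Python 3 sketch: M2 does get called for C4, but C4 inherits nothing from C2.
created = []


class M2(object):
    def __new__(cls, name, bases, dct):
        created.append(name)
        r = super().__new__(cls)
        r.__dict__.update(dct)
        r.__bases__ = bases
        return r


class C2(metaclass=M2):
    pass


class C4(C2):
    pass


print(created)                  # M2 was invoked for both names
print(C4.__bases__ == (C2,))    # the glued-on attribute is there...

C2.dog = 10
lookup_error = None
try:
    C4.dog                      # ...but lookups never consult it
except AttributeError as exc:
    lookup_error = str(exc)

print(lookup_error)
```

C2 and C4 are ordinary M2 instances, so C4.dog goes through the normal instance lookup: C4's own __dict__, then M2 and its bases. C2 is never in that path.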
Note that there are literally no attributes we can set on non-type
instances that will change this. On actual instances of type, things
like __bases__ and __mro__ are not actual attributes but are
instead essentially descriptors that look up and manipulate fields
in the C-level type struct. The actual code that does things like
attribute lookups uses the C-level struct fields directly, which is one
reason it requires genuine type instances; only genuine instances even
have those struct fields at the right places in memory.
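You can poke at this from Python. The following sketch assumes Python 3 on CPython; A, B, and Fake are throwaway names for illustration. It fetches the descriptors out of type.__dict__ and applies them by hand.

```python
# Python 3 / CPython sketch: __bases__ and __mro__ live on type as descriptors.
class A(object):
    pass


class B(A):
    pass


bases_descr = type.__dict__["__bases__"]
mro_descr = type.__dict__["__mro__"]

# B's own __dict__ has no "__bases__" entry; the value comes from the struct.
bases_in_dict = "__bases__" in B.__dict__
real_bases = bases_descr.__get__(B, type)
real_mro = mro_descr.__get__(B, type)
print(bases_in_dict, real_bases, real_mro)


class Fake(object):
    pass


fake = Fake()
fake.__bases__ = (A,)   # an ordinary instance attribute, nothing more

descr_error = None
try:
    # The descriptor refuses anything that is not a genuine type instance.
    bases_descr.__get__(fake, Fake)
except TypeError as exc:
    descr_error = str(exc)

print(descr_error)
```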
(Note that attribute inheritance in subclasses is far from the only
attribute lookup problem we have. Consider accessing C2.afunction
and what you'd get back.)
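A sketch of that kind of problem, assuming Python 3. In Python 3 a plain function comes back unchanged even from a real class, so I've used a classmethod here instead, where the difference is still visible; Fake, Real, and make are illustrative names of my own.

```python
# Python 3 sketch: attributes on a fake class skip the descriptor protocol.
class M2(object):
    def __new__(cls, name, bases, dct):
        r = super().__new__(cls)
        r.__dict__.update(dct)
        r.__bases__ = bases
        return r


class Fake(metaclass=M2):
    @classmethod
    def make(cls):
        return cls


class Real(object):
    @classmethod
    def make(cls):
        return cls


# On the fake class the raw classmethod object comes straight out of the
# instance __dict__; nothing binds it.
fake_attr = Fake.make
# On a real class, type's lookup runs the descriptor and binds it to Real.
real_attr = Real.make

print(isinstance(fake_attr, classmethod))
print(isinstance(real_attr, classmethod))
print(Real.make() is Real)
```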
Either problem is fatal, never mind both of them at once (and note
that our M2.__call__ is nowhere near a complete emulation of
what type.__call__ actually does). Thus as far as I can tell
there is absolutely no way to create a fully functional duck typed
metaclass in CPython. To make one you'd need access to the methods
and other machinery of type and type reserves that machinery
for things that are instances of type (for good reason).
I don't think that there's anything in general Python semantics that
requires this, so another Python implementation might allow or support
enough to enable duck typed metaclasses. What blocks us in CPython is
how CPython implements type, object, and various core functionality
such as creating instances and doing attribute lookups.
(I tried this with PyPy and it failed with a different set of errors
depending on which bits of type I was trying to use. I don't have
convenient access to any other Python implementations.)