An operational explanation of Python metaclasses (part 1)

September 12, 2011

All of the explanations of metaclasses that I've read have started out by talking about the whole background and theory of operation of metaclasses. This approach doesn't work for me; by the time they get out of the background, I'm either asleep or my eyes have glazed over. So I'm going to tackle metaclasses from the other end, covering what you can do with them.

Part of the reason that metaclasses are complicated and confusing is that they can be used to do a number of mostly unrelated things. So to start out, let's talk about the classical and most common use of metaclasses: modifying a class as it's being created. This is more or less how things like Django's form and model definitions work, and it's what I did in my metaclass for namespaces.

(This is sort of like the kinds of things that you can do with Lisp macros, although nowhere near as advanced.)

There are two spots where a metaclass can meddle in the creation of a class. A metaclass's __new__ is called before the class type object exists, is expected to return the newly created class object, and normally works by manipulating the 'class dictionary' of the class to be. A metaclass's __init__ is called after the class exists but before it has been completely finalized, and pretty much can only work by manipulating the new class object.

(This is just like __new__ versus __init__ on conventional classes, except that the 'object' you are dealing with is a class definition and the arguments to both functions come in a very specific form.)
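To make the ordering concrete, here is a minimal sketch that records when each hook runs. It is written in Python 3 syntax (the metaclass= keyword and zero-argument super()), which is an assumption on my part; TraceMeta and the calls list are made-up names for illustration.

```python
calls = []  # records the order in which the two hooks run

class TraceMeta(type):
    def __new__(meta, cname, bases, cdict):
        # The class does not exist yet; we only have its name, its
        # bases, and the dictionary built from the class body.
        names = sorted(k for k in cdict if not k.startswith("__"))
        calls.append(("__new__", cname, names))
        return super().__new__(meta, cname, bases, cdict)

    def __init__(cls, cname, bases, cdict):
        # The class object now exists and arrives as cls.
        calls.append(("__init__", cls.__name__, isinstance(cls, TraceMeta)))
        super().__init__(cname, bases, cdict)

class Example(metaclass=TraceMeta):
    x = 1

    def method(self):
        return self.x
```

After the class statement finishes, calls holds one __new__ entry followed by one __init__ entry, and the __init__ entry shows that cls is already an instance of the metaclass.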

Most metaclasses use __new__ instead of __init__. In general, most sophisticated changes are easier to do in __new__ because you don't have to worry about normal class magic getting in the way (for example, a function getting automatically converted to an unbound method when you try to retrieve it to modify it). In addition, because some things about a class are frozen at the moment that its class object is created, changing them can only be done in __new__; the obvious example is creating, modifying, or removing __slots__. You can add things to the class in __init__, and it may be clearer to do so there because you can simply set attributes directly.
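As a rough illustration of that split, the sketch below sets __slots__ in __new__ (where it still takes effect) and adds a plain attribute in __init__. The fields convention, SlotsMeta, and field_count are hypothetical names of mine, and the code assumes Python 3 syntax.

```python
class SlotsMeta(type):
    def __new__(meta, cname, bases, cdict):
        # Hypothetical convention: the class body lists its attribute
        # names in 'fields'; we turn that into __slots__ before the
        # class object is created, which is the only time it matters.
        cdict = dict(cdict)
        cdict["__slots__"] = tuple(cdict.pop("fields", ()))
        return super().__new__(meta, cname, bases, cdict)

    def __init__(cls, cname, bases, cdict):
        super().__init__(cname, bases, cdict)
        # Ordinary attributes can simply be set on the finished class.
        cls.field_count = len(cls.__slots__)

class Point(metaclass=SlotsMeta):
    fields = ("x", "y")

p = Point()
p.x = 1
try:
    p.z = 3          # not in __slots__, so this is rejected
    rejected = False
except AttributeError:
    rejected = True
```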

(Properties do not have to be created in __new__ as far as I can see.)
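A small sketch of adding properties from __init__, assuming Python 3 syntax. The readonly convention and the names PropMeta and Account are invented for this example.

```python
class PropMeta(type):
    def __init__(cls, cname, bases, cdict):
        super().__init__(cname, bases, cdict)
        # Hypothetical convention: every name listed in 'readonly'
        # becomes a read-only property backed by '_' + name.
        for name in cdict.get("readonly", ()):
            def getter(self, _attr="_" + name):
                return getattr(self, _attr)
            setattr(cls, name, property(getter))

class Account(metaclass=PropMeta):
    readonly = ("balance",)

    def __init__(self, balance):
        self._balance = balance

acct = Account(10)
try:
    acct.balance = 5     # the property has no setter
    writable = True
except AttributeError:
    writable = False
```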

Also, __new__ is free to return an existing class object. In theory you could use this to implement 'singleton classes'; in practice, I can't think of much use of this outside of something like Django, where the 'classes' are actually a little domain specific language to define things and where you might want two definitions of the same thing to result in the same actual class object (especially if you track state through the class object in the background).
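Here is one way this might look: a sketch in which a second definition with the same name gets back the class object created by the first. The registry keyed on the class name is purely an assumption for illustration (a real system would want a better key), and the code uses Python 3 syntax.

```python
class RegistryMeta(type):
    _seen = {}  # hypothetical registry: class name -> class object

    def __new__(meta, cname, bases, cdict):
        if cname in meta._seen:
            # Hand back the existing class; the new body is discarded.
            return meta._seen[cname]
        cls = super().__new__(meta, cname, bases, cdict)
        meta._seen[cname] = cls
        return cls

class Person(metaclass=RegistryMeta):
    kind = "first"

first = Person

class Person(metaclass=RegistryMeta):
    kind = "second"
```

Note that the metaclass's __init__ still runs on whatever __new__ returns if it is an instance of the metaclass, so an __init__ that is not safe to repeat needs care here.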

The mechanics

__new__ and __init__ are called slightly differently; the signatures are:

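class MiniMeta(type):
  # meta is the metaclass; the class object does not exist yet.
  def __new__(meta, cname, bases, cdict):
    return type.__new__(meta, cname, bases, cdict)

  # cls is the newly created class object.
  def __init__(cls, cname, bases, cdict):
    return type.__init__(cls, cname, bases, cdict)

# Python 2 spelling; Python 3 uses 'class Example(metaclass=MiniMeta):'.
class Example(object):
  __metaclass__ = MiniMeta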

(In real code you should use super() here.)

cname is the name of the class as in 'class Foo', bases is a tuple of the class's base classes, and cdict is what will be the class dictionary (or in the case of __init__, what has already been turned into the class dictionary). In __new__, meta is your metaclass itself; in __init__, cls is the class object for the new class.
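One way to see what these arguments look like is to call the metaclass by hand with the same three arguments that a class statement would supply. This is only a sketch; Base and the contents of the dictionary are made up, and it uses Python 3's zero-argument super().

```python
class MiniMeta(type):
    def __new__(meta, cname, bases, cdict):
        return super().__new__(meta, cname, bases, cdict)

class Base(object):
    pass

# Roughly what 'class Example(Base): x = 1' does with MiniMeta as
# the metaclass: name, tuple of bases, class dictionary.
Example = MiniMeta("Example", (Base,), {"x": 1})
```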

__new__ should return a newly created class object. Normally your __new__ function will manipulate cdict and then use super() to continue creating the class, returning the result; if you're going to create the class before manipulating it, you might as well use __init__. The only thing __init__ can usefully manipulate is cls, since the other arguments have already been used to construct it.
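The usual shape of such a __new__ might be sketched as follows: pull declarations out of cdict, stash them somewhere, and pass the modified dictionary on. This is loosely in the spirit of declarative form definitions but is not how any particular framework does it; Field, FormMeta, and _fields are hypothetical, and the syntax is Python 3.

```python
class Field(object):
    """Hypothetical marker for a declared field."""
    def __init__(self, label):
        self.label = label

class FormMeta(type):
    def __new__(meta, cname, bases, cdict):
        # Collect Field instances from the class body ...
        fields = {k: v for k, v in cdict.items() if isinstance(v, Field)}
        # ... remove them from the dictionary and store them in _fields.
        newdict = {k: v for k, v in cdict.items() if k not in fields}
        newdict["_fields"] = fields
        return super().__new__(meta, cname, bases, newdict)

class LoginForm(metaclass=FormMeta):
    user = Field("User name")
    password = Field("Password")

    def names(self):
        return sorted(self._fields)
```

Because the work happens on a plain dictionary before the class exists, there is no attribute lookup magic to get in the way.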

(Technically __new__ can return anything it wants to, including an existing class or even a non-class object, but doing so is a great way to confuse everyone who ever reads your code.)

For reasons beyond the scope of this margin, your metaclass really must descend from type. Subclassing object instead by accident will cause all sorts of interesting failures with obscure error messages, like TypeError: 'MiniMeta' object is not callable.
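A quick way to see this kind of failure for yourself, sketched in Python 3 syntax. The exact error message varies between Python versions, so the example only checks that a TypeError is raised; BadMeta and Broken are made-up names.

```python
class BadMeta(object):   # the mistake: this should subclass type
    pass

try:
    class Broken(metaclass=BadMeta):
        pass
    failed = False
except TypeError as exc:
    # The wording differs by version, so we just keep it for inspection.
    failed = True
    message = str(exc)
```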


Last modified: Mon Sep 12 01:46:43 2011