Wandering Thoughts archives

2013-10-09

An interesting bug with module garbage collection in Python

In response to my entry on what happens when modules are destroyed, @eevee shared the issue that started it all:

@thatcks the confusion arose when the dev did `module_maker().some_function_returning_a_global()` and got None :)

In a subsequent exchange of tweets, we sorted out why this happens. What it boils down to is that a module is not the same thing as the module's namespace, and functions only hold a reference to the module namespace, not to the module itself.

(Functions have a __module__ attribute, but this is a string, not a reference to the module itself.)
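Here is a minimal sketch of that distinction. It builds a throwaway module by hand with types.ModuleType; the names demo_mod, ANSWER, and get_answer are made up for illustration and don't come from the original code.

```python
import types

# Hypothetical throwaway module, built by hand instead of imported.
mod = types.ModuleType("demo_mod")
exec(
    "ANSWER = 42\n"
    "def get_answer():\n"
    "    return ANSWER\n",
    mod.__dict__,
)

func = mod.get_answer

# The function's globals are the module's namespace dictionary itself,
# not a copy of it and not the module object.
same_namespace = func.__globals__ is mod.__dict__

# __module__ is only the module's name, stored as a plain string.
module_attr = func.__module__
```

Nothing on the function points back at the module object; the only link is to the dictionary.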

So here's what is going on. When this chunk of code runs, module_maker() loads and returns a module as an anonymous object, and then the interpreter uses that anonymous module object to look up the function. Since the function does not hold a reference to the module itself, the module object is unreferenced after the lookup has finished and is thus immediately garbage collected. This garbage collection destroys the contents of the module namespace dictionary, but the dictionary itself is not garbage collected because the function holds a reference to it and the interpreter holds a reference to the function. Then the function's code runs and uses its reference to the dictionary to look up a (module) global, which finds the name and a None value for it.
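The shape of the problem can be sketched as follows. This module_maker() is a stand-in that I've assumed for illustration (the real one presumably loaded a module from somewhere), and temp_mod and SETTING are made-up names.

```python
import types

def module_maker():
    # Stand-in for whatever really built the module; this is an assumption.
    mod = types.ModuleType("temp_mod")
    exec(
        "SETTING = 'still here'\n"
        "def some_function_returning_a_global():\n"
        "    return SETTING\n",
        mod.__dict__,
    )
    return mod

# The module object is anonymous here: once the attribute lookup is done,
# nothing refers to it any more, only to the function and its globals.
result = module_maker().some_function_returning_a_global()

# On affected Pythons the namespace had been cleared by this point, so the
# global lookup found None. On Python 3.4 and later the value survives.
```

On a Python with the fix, result is the original string; the None only shows up on older interpreters.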

(You would get even more comedy if the module function tried to call another module-level function or create an instance of a module-level class; this would produce mysterious "TypeError: 'NoneType' object is not callable" errors, since the appropriate name is now bound to None instead of a callable thing.)
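You can imitate this failure on any Python by doing the damage by hand. The sketch below is a simulation, not the real garbage collection path: it rebinds one module-level name to None itself, and sim_mod, helper, and caller are hypothetical names.

```python
import types

# Hypothetical module with one function that calls another.
mod = types.ModuleType("sim_mod")
exec(
    "def helper():\n"
    "    return 'ok'\n"
    "def caller():\n"
    "    return helper()\n",
    mod.__dict__,
)
caller = mod.caller

# Simulate the namespace being trashed: the name stays, the value is None.
mod.__dict__["helper"] = None

try:
    caller()
    message = None
except TypeError as exc:
    # caller() looked up 'helper' in its globals and got None back.
    message = str(exc)
```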

The workaround is straightforward; you just have to store the module object in a local variable before looking up the function, so that a reference to it persists over the function call and the module isn't garbage collected.
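As a sketch, again with an assumed stand-in for module_maker() and made-up names for the module and its global:

```python
import types

def module_maker():
    # Stand-in for the real module loader; this is an assumption.
    mod = types.ModuleType("temp_mod")
    exec(
        "SETTING = 'still here'\n"
        "def some_function_returning_a_global():\n"
        "    return SETTING\n",
        mod.__dict__,
    )
    return mod

# Bind the module to a name first so it stays alive across the call.
mod = module_maker()
result = mod.some_function_returning_a_global()
```

The variable has to stay in scope for as long as anything from the module might still be called, not just for the lookup.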

The good news is that this weird behavior did wind up being accepted as a Python bug; it's issue 18214 and is fixed in the forthcoming Python 3.4. Given the views of the Python developers, it will probably never be fixed in Python 2 and will thus leave people with years of having to work around it.

(It's hopefully obvious why this is a bug. Given that modules and module namespaces are separate things and that a module's namespace can outlive it for various reasons, a module being garbage collected should not result in its namespace dictionary getting trashed. This sort of systematic destruction of module namespaces should only happen when it's really necessary, namely during interpreter shutdown.)

python/ModuleGCBug written at 00:23:10

