Wandering Thoughts archives

2005-10-03

Some important notes on getting all objects in Python

It turns out that I'm wrong about several things I mentioned in GetAllObjects, although the code there is still useful and as correct as you can reasonably get. However, it does have a few limitations and may miss objects under some circumstances.

First, gc.get_objects does return all container objects. Specifically, it returns all objects that can participate in reference cycles; this necessarily includes all container objects (dicts, tuples, and lists), but also includes other types as well. (My code that seemed to say otherwise was in error; I didn't do a proper breadth-first traversal of the list.)
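As a minimal sketch of the distinction (assuming a reasonably modern CPython, where gc.is_tracked is available), the following checks a few objects against the gc.get_objects list. The names Node and is_in_gc_list are made up for illustration.

```python
import gc

class Node:
    """A plain class; its instances can take part in reference cycles."""

def is_in_gc_list(obj):
    # Compare by identity, since == could match an unrelated equal object.
    return any(o is obj for o in gc.get_objects())

a_list = [1, 2, 3]
a_node = Node()
an_int = 12345678901234567890
a_str = "not a container"

# Lists and class instances should show up; ints and strings should not,
# because they can never be part of a reference cycle.
for name, obj in [("list", a_list), ("instance", a_node),
                  ("int", an_int), ("str", a_str)]:
    print(name, is_in_gc_list(obj), gc.is_tracked(obj))
```

Objects like the int and the string only become visible once you expand the list through gc.get_referents.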

Second, expanding gc.get_objects may not get all objects. The main way this can happen is that gc.get_objects can't see objects that are only referred to from C code, for example if a compiled extension module is holding on to an object for later use without creating a visible name binding. (One example of this is the signal module, which holds an internal reference to any function set as a signal handler.)

If you need a completely accurate count, you need to use a debug build of Python. This keeps an internal list of all live dynamically allocated Python objects and makes it available via some additional functions in the sys module. (Naturally this slows the interpreter down and makes it use more memory.)
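As a hedged sketch: on CPython builds that trace references, one of these additional functions is sys.getobjects, which does not exist on ordinary builds. (On recent CPython versions, as far as I know, this needs a build configured with --with-trace-refs rather than just a debug build.) The helper name count_heap_objects is made up for illustration, and it simply gives up on a normal interpreter.

```python
import sys

def count_heap_objects():
    """Return the number of live heap objects, or None if unavailable.

    sys.getobjects only exists on reference-tracing builds of CPython,
    so it is looked up defensively here.
    """
    getobjects = getattr(sys, "getobjects", None)
    if getobjects is None:
        return None
    # A limit of 0 asks for all objects rather than the first N.
    return len(getobjects(0))

count = count_heap_objects()
if count is None:
    print("this interpreter does not keep a list of all heap objects")
else:
    print("live heap objects:", count)
```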

Even this has an omission: it lists only 'heap' objects, those that have been dynamically allocated. Python has a certain number of 'static' objects, such as type objects in the C code (instead of being created, their names just get registered with the Python interpreter). There are also static plain objects, for example True, False, and None.

However, many of these static objects will appear on the expanded gc.get_objects list. This is because they are referred to by live objects and gc.get_referents is happy to include them in its results. (This may not be too useful for object usage counting, since you can't get rid of static objects anyways.)
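Here is a minimal sketch of what 'expanding' gc.get_objects means, written iteratively. It is an illustration and not the exact code from GetAllObjects; the name expand_gc_objects is made up.

```python
import gc

def expand_gc_objects():
    """Return the gc-tracked objects plus everything reachable from them."""
    start = gc.get_objects()
    found = []
    pending = list(start)
    # Track objects by id(); 'found' keeps them alive so ids stay unique.
    seen = set()
    # Skip our own bookkeeping containers if something refers to them.
    seen.update(id(x) for x in (start, found, pending, seen))
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        found.append(obj)
        # get_referents reports untracked objects too (ints, strings,
        # None, and so on), which is how they end up on the list.
        pending.extend(gc.get_referents(obj))
    return found

marker = 987654321987654321   # an int; never on gc.get_objects itself
holder = [marker]             # a list; tracked by the collector
everything = expand_gc_objects()
print(len(gc.get_objects()), "tracked;", len(everything), "after expanding")
print("None found:", any(o is None for o in everything))
```

The static None turns up because plenty of live dictionaries and other containers refer to it, while the int is found only through the list that holds it.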

I owe a debt of thanks to Martin v. Löwis, who graciously took the time to correct my misconceptions and errors, and explain things to me. (Any remaining errors are of course my fault.)

(The charm of blogging is that I get to make mistakes like this in public. On the upside, I now know a bunch more about the insides of the CPython implementation than I used to.)

python/GetAllObjectsII written at 02:29:43;

