Minimizing object churn to optimize Python code
November 5, 2005
One of the interesting things about writing code in high-level, garbage-collected languages is how your optimization concerns can change. One optimization that applies in all GC'd languages, Python included, is minimizing the creation of new objects and maximizing the reuse of old ones: the fewer new objects you create, the fewer cycles garbage collection eats, the less memory gets fragmented, and often the less memory you use overall.
Minimizing object churn is behind several well-known Python (anti-)patterns, such as 'avoid repeated string concatenation to build a string'.
(The simple approach to accumulating one big string from a bunch of
function calls is to concatenate each value onto the result string
as you get it. But this churns objects (and copies data) at
every concatenation; the right pattern is to put all the function
results in a list and then join the list into the final string in one step.)
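As a rough illustration of the string-building pattern, here is a minimal sketch of both approaches. The names `build_with_concat`, `build_with_join`, and `producers` are made up for this example; `producers` is assumed to be a list of zero-argument functions that each return a string.

```python
def build_with_concat(producers):
    """Churning version: grow the result one concatenation at a time."""
    result = ""
    for produce in producers:
        # Each += can build a new string object and copy everything
        # accumulated so far into it.
        result += produce()
    return result


def build_with_join(producers):
    """Low-churn version: collect the pieces, then join them once."""
    # The list only holds references to the strings the functions return.
    parts = [produce() for produce in producers]
    # A single join builds the final string in one step.
    return "".join(parts)


if __name__ == "__main__":
    producers = [lambda: "spam", lambda: " and ", lambda: "eggs"]
    print(build_with_join(producers))
```

Both functions return the same string; the difference is only in how many intermediate string objects get created along the way.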
A lot of object churn avoidance can be summed up as "leave objects alone if at all possible". This is especially important for large immutable objects like strings; any time you fiddle with a string you wind up with a new object. This means that you want to push fiddling with objects as close to their creation as possible, because this maximizes the chance of sharing the same object afterwards.
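One way to picture "fiddle close to creation" is normalizing strings once, when they are read, instead of every time they are used. The sketch below is only an illustration; the function names and the strip-and-lowercase normalization are assumptions for the example.

```python
def count_matches_churning(raw_lines, wanted):
    """Churning version: re-normalizes every line on every call."""
    wanted = wanted.lower()
    # strip() and lower() can each create a new string per line,
    # and all of them are thrown away after the comparison.
    return sum(1 for line in raw_lines if line.strip().lower() == wanted)


def load_names(raw_lines):
    """Normalize once, right where the strings come into the program."""
    return [line.strip().lower() for line in raw_lines]


def count_matches(names, wanted):
    """Low-churn version: assumes names and wanted are already normalized."""
    # Only comparisons happen here; no new strings are created.
    return sum(1 for name in names if name == wanted)


if __name__ == "__main__":
    raw = ["  Alice\n", "BOB\n", "alice\n"]
    names = load_names(raw)
    print(count_matches(names, "alice"))
```

After `load_names` has run, every later lookup reuses the same normalized string objects rather than rebuilding them.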
When you write code, look at it with an eye to how many objects it churns through, and how necessary those are. Some techniques include:
Sidebar: saving string memory with