An object identity gotcha in Python

June 2, 2006

Consider the problem of implementing a modified LIFO stack that has an additional O(1) operation called min, that gives back the smallest item in the stack. For now, let's pretend that Python arrays are O(1) (or that someone has implemented a native queue class); in many respects they are as far as Python code is concerned.

The simple implementation is to keep two stacks, the main stack and one that you only push new minimum items onto. Then push and pop are something like this (omitting error checking and so on):

def push(self, itm):
  if not self.minstk or \
     itm < self.minstk[-1]:

def pop(self):
  top = self.stk.pop()
  if top is self.minstk[-1]:
  return top

Unfortunately, this code has a small bug: it detonates if you push the current minimum object onto the stack a second time. This might be a pernicious bug that lingers unnoticed for some time, since normal Python usage for something like this will probably have entirely distinct objects.

(The quick fix is to use '<=' in push instead of just '<', which does grow the minstk stack a bit more in some circumstances.)

The really tricky bit about this is that Python will sometimes give you duplicate items behind your back. For example, all of the small integers from -5 to 99 are currently interned by Python; any use of one (including one you get through calculations) give you back the same object.

This obviously only happens for immutable objects, but when it happens is implementation defined (and thus can change from Python version to Python version). It's definitely something to bear in mind when writing generic code that uses object identity.

Written on 02 June 2006.
« Link: a Unix sysadmin rosetta stone
The fundamental problem of spam »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Jun 2 02:39:06 2006
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.