Wandering Thoughts archives

2006-06-02

An object identity gotcha in Python

Consider the problem of implementing a modified LIFO stack that has an additional O(1) operation called min, that gives back the smallest item in the stack. For now, let's pretend that Python arrays are O(1) (or that someone has implemented a native queue class); in many respects they are as far as Python code is concerned.

The simple implementation is to keep two stacks, the main stack and one that you only push new minimum items onto. Then push and pop are something like this (omitting error checking and so on):

def push(self, itm):
  self.stk.append(itm)
  if not self.minstk or \
     itm < self.minstk[-1]:
    self.minstk.append(itm)

def pop(self):
  top = self.stk.pop()
  if top is self.minstk[-1]:
    self.minstk.pop()
  return top

Unfortunately, this code has a small bug: it detonates if you push the current minimum object onto the stack a second time. This might be a pernicious bug that lingers unnoticed for some time, since normal Python usage for something like this will probably have entirely distinct objects.

(The quick fix is to use '<=' in push instead of just '<', which does grow the minstk stack a bit more in some circumstances.)

The really tricky bit about this is that Python will sometimes give you duplicate items behind your back. For example, all of the small integers from -5 to 99 are currently interned by Python; any use of one (including one you get through calculations) give you back the same object.

This obviously only happens for immutable objects, but when it happens is implementation defined (and thus can change from Python version to Python version). It's definitely something to bear in mind when writing generic code that uses object identity.

python/ObjectIdentityGotcha written at 02:39:06; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.