An object identity gotcha in Python
Consider the problem of implementing a modified LIFO stack that has an
additional O(1) operation called min
, that gives back the smallest
item in the stack. For now, let's pretend that Python arrays are O(1)
(or that someone has implemented a native queue class); in many respects
they are as far as Python code is concerned.
The simple implementation is to keep two stacks, the main stack and one that you only push new minimum items onto. Then push and pop are something like this (omitting error checking and so on):
def push(self, itm): self.stk.append(itm) if not self.minstk or \ itm < self.minstk[-1]: self.minstk.append(itm) def pop(self): top = self.stk.pop() if top is self.minstk[-1]: self.minstk.pop() return top
Unfortunately, this code has a small bug: it detonates if you push the current minimum object onto the stack a second time. This might be a pernicious bug that lingers unnoticed for some time, since normal Python usage for something like this will probably have entirely distinct objects.
(The quick fix is to use '<=
' in push
instead of just '<
',
which does grow the minstk stack a bit more in some circumstances.)
The really tricky bit about this is that Python will sometimes give you duplicate items behind your back. For example, all of the small integers from -5 to 99 are currently interned by Python; any use of one (including one you get through calculations) give you back the same object.
This obviously only happens for immutable objects, but when it happens is implementation defined (and thus can change from Python version to Python version). It's definitely something to bear in mind when writing generic code that uses object identity.
|
|