Clever large integers

August 10, 2007

Where bitmap is a Python long, consider the innocent looking Python expression:

bitmap = bitmap & ~(1L << bit)

This is the perfectly normal way of clearing a bit in a bitmap, so it looks like there's nothing peculiar going on here.

Except that Python longs are of indefinite (and theoretically infinite) length, and what makes this expression work in normal integers is that the two pieces have the same (finite) size. So Python is clearly not just doing straight binary operations here; the implementation of ~ and & for longs has to be more complicated than just manipulating bit representations.

All of this goes to illustrate that sometimes features aren't as simple as they look.

As it turns out, this cleverness has a cost, at least in Python. If you already know that the bit is set, clearing it with bitmap - (1L << bit) starts out being about twice as fast as the AND approach (when operating on bit 0) and gets better from there. I'm ambivalent about whether or not you want to check the bit's status ahead of setting or clearing it; while it may avoid extra object allocations and deallocations, it's relatively expensive and I haven't completely convinced myself that it actually avoids all that much object churn.

(Checking plus subtraction is usually slower than the AND approach alone; it gets slightly faster when we start operating around bit 2048 and up. Of course, large bits is exactly where you probably want a faster representation of your bitmaps than a large integer, which should speed everything up.)

Written on 10 August 2007.
« A gotcha with the format of dump archives
On the performance of Python longs being used as bitmaps »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Aug 10 23:21:47 2007
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.