Doing unsigned 32-bit integer math in Python

August 5, 2010

Modern versions of Python helpfully have arbitrary-precision integers. This is often useful (and they are intelligent), but there are occasional situations where I really do want to operate with fixed-size integers, rollover and all. My most recent need was computing a hash that was specified in terms of unsigned 32-bit operations (well, it was specified in C code that was doing such operations).

(It's common for hashes, checksums, and so on to be specified with this sort of arithmetic. If you need to duplicate the hash or checksum in pure Python, you're going to be faced with this.)

Embarrassingly, every time I run into this I go through the same exercise of working out how to do it and digging up my old code and convincing myself that it works. So I'm going to write this down once and hope that it sticks (and if not, I can look it up here later).

In pure Python for unsigned 32-bit arithmetic, it suffices to mask numbers with 0xffffffff after every potentially overflowing operation. In my recent case, I wound up defining some convenience functions:

# 32-bit mask. No trailing L, so this is valid in both Python 2 and Python 3.
M32 = 0xffffffff
def m32(n):
    # Keep only the low 32 bits of n.
    return n & M32
def madd(a, b):
    return m32(a+b)
def msub(a, b):
    # Masking maps a negative difference back into 0..2**32-1.
    return m32(a-b)
def mls(a, b):
    return m32(a<<b)

(There is no need for an equivalent function for >>, as it can't overflow.)
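As a sketch of how helpers like these get used, here is Bob Jenkins' one-at-a-time hash, picked purely as an example of a hash that is normally written with C unsigned 32-bit operations. It is an assumption for illustration, not the hash discussed above, and one_at_a_time is just a name chosen here. The helpers are repeated so the block runs on its own.

```python
M32 = 0xffffffff

def m32(n):
    return n & M32

def madd(a, b):
    return m32(a + b)

def mls(a, b):
    return m32(a << b)

def one_at_a_time(data):
    """Jenkins one-at-a-time hash over a bytes-like object."""
    h = 0
    # bytearray() yields integers on both Python 2 and Python 3.
    for byte in bytearray(data):
        h = madd(h, byte)          # hash += key[i]
        h = madd(h, mls(h, 10))    # hash += hash << 10
        h ^= h >> 6                # >> and ^ cannot overflow, so no mask
    h = madd(h, mls(h, 3))
    h ^= h >> 11
    h = madd(h, mls(h, 15))
    return h

print(hex(one_at_a_time(b"a")))    # prints 0xca2e9442 on Python 3
```

Every addition and left shift goes through a masking helper, while the right shifts and XORs are left bare because they can never produce a value wider than their inputs.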

I don't know if there's an equivalent version for signed 32-bit integer math, rollover and all; so far I haven't needed one.
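For what it's worth, one approach that appears to work for the signed case is to mask to 32 bits and then reinterpret the top bit as the sign, which gives two's complement wraparound. This is only a sketch under that assumption; s32 and sadd are hypothetical names, and C itself leaves signed overflow undefined, so this matches what typical two's complement hardware does rather than anything the C standard promises.

```python
M32 = 0xffffffff

def s32(n):
    """Wrap n into the signed 32-bit range -2**31 .. 2**31-1."""
    n &= M32                  # keep the low 32 bits (always non-negative)
    if n >= 0x80000000:       # top bit set: the value is negative
        n -= 0x100000000
    return n

def sadd(a, b):
    return s32(a + b)

print(sadd(0x7fffffff, 1))    # prints -2147483648
print(s32(0xffffffff))        # prints -1
```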

If you have NumPy available, the simpler version is just:

from numpy import uint32
from numpy import int32

NumPy also provides signed and unsigned 8-bit, 16-bit, and 64-bit integers under the obvious names.

Note that you don't want to mix these types with regular Python integers if you want things to come out just right; you need to make sure that all numbers involved in your code are turned into the appropriate NumPy type. But if you do a bunch of computation with only a few original numbers, this may well be the easiest, most convenient approach. If you're copying code from another language, it will also likely keep your Python code looking as much like the original as possible instead of littering it with madd() and msub() calls.
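A minimal sketch of the NumPy approach, with every operand wrapped in uint32 as described above. The values are made up purely to force a wraparound. NumPy may emit a RuntimeWarning when a fixed-size scalar overflows, so the sketch uses np.errstate to silence it on the assumption that the wraparound is intended.

```python
import numpy as np
from numpy import uint32

# Hypothetical values, chosen only to trigger overflow.
a = uint32(0xffffffff)
b = uint32(1)

# Wrap constants in uint32 too; the result type of mixing a NumPy scalar
# with a plain Python int has differed between NumPy releases.
with np.errstate(over="ignore"):
    total = a + b               # wraps around to 0
    product = a * uint32(2)     # wraps around to 0xfffffffe

print(int(total), hex(int(product)), total.dtype)
```

Both results stay uint32, so they can be fed straight into the next step of a calculation without any explicit masking.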
