Why I like Python's large integer support

July 18, 2007

The simple explanation of why I like Python's large integer support (somewhat misleadingly called longs for historical reasons) is that they make it convenient to deal with arbitrarily large binary numbers. These crop up surprisingly often, sometimes because a large bunch of bits is the natural representation of things like IPv6 addresses, and sometimes because they make algorithms clearer and simpler.

For example, take the job of generating a random password that's drawn from a given alphabet. The simple approach to doing this is:

```from math import log, ceil
```
```def genpass(nchars, alphabet):
alen = len(alphabet)
bits = int(ceil(log(alen, 2) * nchars))
rnum = getnbits(bits)
pw = ''
while nchars:
# extract a character's worth
# of randomness.
idx = rnum % alen
rnum = rnum // alen
pw += alphabet[idx]
nchars -= 1
return pw
```

This assumes we have a `getnbits()` routine that returns a random number with (at least) `bits` bits of randomness.

If Python did not have large integers, `getnbits()` would have to return a much more complicated data structure to represent all those bits, and extracting a character's worth of randomness would be much more complicated. Because it does have large integers, I can express the algorithm simply and directly and thus be pretty confidant that this is really generating random passwords.

(This is in fact my new random password generator, shorn of the bits that specify the alphabet, let me say how many characters I want, and print the result.)

Sidebar: an implementation of `getnbits()`

This requires `/dev/urandom`, which is present on most modern Unixes, and omits error checking in the interests of simplicity:

```def getnbits(nbits):
fp = open("/dev/urandom", "r")
bytes = (nbits + 7) // 8
fp.close()
base = 0L
for c in list(buf):
base = (base << 8L) + ord(c)
return base
```

Again we can see how simple this is once we have arbitrarily large integers.

(Using `list(buf)` is strictly speaking unnecessary but makes it clear that we are deliberately using that string as a list of characters.)

If we wanted to be even trickier, the whole `for` loop could be rewritten as `reduce(lambda a, b: (a << 8L) + ord(b), buf, 0L)`. This probably serves as a good illustration of why Python 3K is trying to kill `reduce` off.

From 68.167.148.211 at 2007-07-19 09:17:20:
Written on 18 July 2007.

Search:
Atom Syndication: Recent Comments.

Last modified: Wed Jul 18 23:14:43 2007
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.