Exploring a surprise with equality in Python

July 8, 2014

Taken from @hackedy's tweet, here's an interesting Python surprise:

>>> {1: "one", True: "two"}
{1: 'two'}
>>> {0: "one", False: "two"}
{0: 'two'}

There are two things happening here to create this surprise. The starting point is this:

>>> print hash(1), hash(True)
1 1

At one level, Python has made True have the same hash value as 1. Actually that's not quite right, so let me show you the real story:

>>> isinstance(True, int)
True

Python has literally made bool, the type that True and False are instances of, a subclass of int. They don't merely look like numbers, they are numbers. As numbers, their hash value is their integer value of 1 or 0, and of course they also compare equal to literal 1 or 0. Since they hash to the same value and compare equal, we run into the issue with 'custom' equalities and hashes in dictionaries, where Python considers the two different objects to be the same key and everything gets confusing.
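As a quick check of the reasoning above, here is a small sketch (the variable name d is just for illustration) that walks through the three facts involved: bool is a subclass of int, the hashes match, and the values compare equal, so a dictionary sees one key.

```python
# bool is a subclass of int, so True and False really are integers.
assert issubclass(bool, int)
assert isinstance(True, int) and isinstance(False, int)

# They compare equal to 1 and 0 and hash to the same values.
assert True == 1 and False == 0
assert hash(True) == hash(1) and hash(False) == hash(0)

# Same hash plus equal comparison means "same key" to a dictionary.
d = {}
d[1] = "one"
d[True] = "two"   # lands on the existing key 1 instead of adding a new one
assert len(d) == 1
assert d[1] == "two" and d[True] == "two"
```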

(That True and False hash to the same thing as 1 and 0 is probably not a deliberate choice. The internal bool type doesn't have a custom hash function; it just tacitly inherits the hash function of its parent, ie int. I believe that Python could change this if it wanted to, which would make the surprise here go away, although as long as True == 1 this would break the usual rule that objects which compare equal must have the same hash value.)
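One way to see the inheritance for yourself is to look at where bool's __hash__ comes from. This is only a sketch and it leans on CPython implementation details (what appears in a type's own namespace is not something the language promises), so treat the results as an observation about CPython rather than a guarantee.

```python
# Does bool define __hash__ in its own namespace? (CPython detail, assumed here.)
has_own_hash = "__hash__" in vars(bool)

# Looking up __hash__ on bool should find the very same object as on int,
# because the lookup falls through to the parent class.
inherits_int_hash = bool.__hash__ is int.__hash__

print(has_own_hash, inherits_int_hash)
```

On the CPython versions I would expect people to try this on, it reports that bool has no __hash__ of its own and that the one it uses is int's.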

The other thing is what happens when you create a dictionary with literal syntax, which is that Python generates bytecode that stores each initial value into the dictionary one after another in the order that you wrote them. It turns out that when you do a redundant store into a dictionary (ie you store something for a key that already exists), Python only replaces the value, not both the key and the value. This is why the result is not {True: 'two'}; only the value got overwritten in the second store.

(This decision is a sensible one because it may avoid object churn and the overhead associated with it. If Python replaced the key as well it would at least be changing more reference counts on the key objects. And under normal circumstances you're never going to notice the difference unless you're going out of your way to look.)
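The 'first key wins, last value wins' behavior can be seen directly by doing the stores by hand and then checking the type of the key that the dictionary kept. This is a minimal sketch with made-up variable names; it does the stores explicitly rather than relying on how any particular Python version compiles a dictionary literal.

```python
# Store under 1 first, then under True: the original int key is kept.
d = {}
d[1] = "one"
d[True] = "two"
kept_key = list(d)[0]
assert type(kept_key) is int      # the key is still the int 1, not True
assert repr(d) == "{1: 'two'}"    # but the value is from the second store

# Reverse the order and the bool key is the one that survives.
e = {}
e[True] = "one"
e[1] = "two"
assert type(list(e)[0]) is bool
assert repr(e) == "{True: 'two'}"
```

Which object ends up as the key depends purely on which store came first, which is the 'going out of your way to look' case.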

PS: It turns out that @hackedy beat me to discovering that bools are ints. Also the class documentation for bool says this explicitly (and notes that bool is one of the rare Python classes that can't be subclassed).
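The 'can't be subclassed' part is easy to check. In this small sketch, MyBool is a hypothetical name and the exact wording of the error message is up to the interpreter, so the code only records that a TypeError happened.

```python
# Attempting to subclass bool fails at class creation time.
try:
    class MyBool(bool):
        pass
except TypeError as exc:
    subclass_error = str(exc)   # message text varies by Python version
else:
    subclass_error = None

print(subclass_error)
```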

Written on 08 July 2014.