Exploring a surprise with equality in Python
Taken from @hackedy's tweet, here's an interesting Python surprise:
>>> {1: "one", True: "two"}
{1: 'two'}
>>> {0: "one", False: "two"}
{0: 'two'}
There are two things happening here to create this surprise. The starting point is this:
>>> print hash(1), hash(True)
1 1
At one level, Python has made True have the same hash value as 1. Actually that's not quite right, so let me show you the real story:
>>> isinstance(True, int)
True
Python has literally made bool, the type that True and False are instances of, be a subclass of int. They don't merely look like numbers, they are numbers. As numbers their hash identity is their literal value of 1 or 0, and of course they also compare equal to literal 1 or 0. Since they hash to the same identity and compare equal, we run into the issue with 'custom' equalities and hashes in dictionaries where Python considers the two different objects to be the same key and everything gets confusing.
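To make the chain of reasoning concrete, here is a small sketch that checks each link in turn. It is written for Python 3 (the session above is Python 2), and the variable names are just my own labels.

```python
# bool is a subclass of int, so True and False really are integers.
is_subclass = issubclass(bool, int)
equal_to_one = (True == 1)
same_hash = (hash(True) == hash(1))
print(is_subclass, equal_to_one, same_hash)  # True True True

# Equal values with equal hashes collapse into a single dictionary key.
d = {1: "one", True: "two"}
print(len(d))       # 1

# Because they are numbers, arithmetic on them works too.
print(True + True)  # 2
```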
(That True and False hash to the same thing as 1 and 0 is probably not a deliberate choice. The internal bool type doesn't have a custom hash function; it just tacitly inherits the hash function of its parent, ie int. I believe that Python could change this if it wanted to, which would make the surprise here go away.)
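One way to peek at the inheritance is to look at which class actually defines __hash__. This is only a sketch and it assumes CPython on Python 3; other implementations may lay out their type dictionaries differently. On CPython I would expect the first line to report that bool does not define its own __hash__ while int does, and the second to report that both names refer to the same function.

```python
# Assumption: CPython. vars(cls) shows only what the class itself defines,
# not what it inherits from its parents.
bool_defines_hash = "__hash__" in vars(bool)
int_defines_hash = "__hash__" in vars(int)
print(bool_defines_hash, int_defines_hash)

# Looking up __hash__ on bool falls through to the one defined on int.
shares_int_hash = bool.__hash__ is int.__hash__
print(shares_int_hash)
```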
The other thing is what happens when you create a dictionary with literal syntax, which is that Python generates bytecode that stores each initial value into the dictionary one after another in the order that you wrote them. It turns out that when you do a redundant store into a dictionary (ie you store something for a key that already exists), Python only replaces the value, not both the key and the value. This is why the result is not {True: 'two'}; only the value got overwritten in the second store.
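You can see the same behavior without literal syntax by doing the two stores by hand. This is a sketch for Python 3; it mimics the order of stores described above rather than showing the actual bytecode.

```python
d = {}
d[1] = "one"      # first store: creates the key 1
d[True] = "two"   # second store: True == 1, so only the value is replaced

(key,) = d.keys()
print(d)          # {1: 'two'}
print(type(key))  # <class 'int'>

# Writing the entries in the other order keeps True as the key instead.
r = {True: "one", 1: "two"}
print(r)          # {True: 'two'}
```

Whichever object gets stored first stays as the key, and later equal keys only change the value.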
(This decision is a sensible one because it may avoid object churn and the overhead associated with it. If Python replaced the key as well it would at least be changing more reference counts on the key objects. And under normal circumstances you're never going to notice the difference unless you're going out of your way to look.)
PS: It turns out that @hackedy beat me to discovering that bools are ints.
Also the class documentation for bool says this explicitly (and notes that bool is one of the rare Python classes that can't be subclassed).