Wandering Thoughts archives

2014-07-09

What the differences are between Python bools and ints

I mentioned in the previous entry that Python's bool class is actually a subclass of int (and the bool docstring will tell you this if you bother to read it with help() before, say, diving into the CPython source code like a system programmer). Since I was just looking at this, I might as well write down the low-level differences between ints and bools. Bools have:

  • a custom __repr__ that reports True or False instead of the numeric value; this is also used as the custom __str__ for bool.

    (The code is careful to intern these strings so that no matter how many times you repr() or str() a boolean, only one copy of the literal 'True' or 'False' string will exist.)

  • a __new__ that returns either the global True object or the global False object depending on the truth value of what it's given.

  • custom functions for &, |, and ^ that return a bool (True or False) instead of a plain int when both arguments are bools. Note that eg 'True & 1' falls through to the standard int bitwise operation and results in an int object, even though 1 compares equal to True.

That's it.
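
As a quick check of the three differences above, here is a small sketch written for Python 3 (the variable names are just mine for illustration):

```python
# repr() and str() both report the name, not the numeric value.
assert repr(True) == "True" and str(False) == "False"
assert int(True) == 1 and int(False) == 0

# bool() never makes a new object; it hands back one of the two globals.
assert bool(17) is True
assert bool("") is False

# &, | and ^ only stay in bool-land when both operands are bools.
both_bools = True & False   # handled by bool's own function
mixed = True & 1            # handled by int's bitwise and
print(type(both_bools).__name__, type(mixed).__name__)   # bool int

# Everything else is inherited from int, so ordinary arithmetic gives ints.
total = True + True
print(total, type(total).__name__)                       # 2 int
```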

I'm not quite sure how bool blocks being subclassed and I'm not curious enough right now to work it out.
Update: see the comments for the explanation.
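
Whatever the mechanism, the block itself is easy to see from Python. My understanding (not checked against the source here) is that CPython only lets you subclass types that set the Py_TPFLAGS_BASETYPE flag and that bool doesn't set it. A minimal sketch, where MyBool is a made-up name:

```python
try:
    class MyBool(bool):      # hypothetical subclass; the class statement itself fails
        pass
except TypeError as exc:
    subclass_error = exc     # keep a reference; 'exc' goes away after the except block
    print("cannot subclass bool:", exc)
else:
    subclass_error = None
    print("unexpectedly managed to subclass bool")
```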

The global True and False objects are of course distinct from what is in effect the global 0 and 1 objects that are all but certain to exist. This means that their id() is different (at least in CPython), since the id() is the memory address of their C-level object struct.

(In modern versions of both CPython 2 and CPython 3 it turns out that global 0 and 1 objects are guaranteed to exist, because 'small integers' from -5 through 256 are actually preallocated as the interpreter is initializing itself.)
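
Here is a small sketch of the identity side of this, for Python 3. The last part leans on the small integer cache, which is a CPython implementation detail rather than anything the language promises, so I print it instead of relying on it:

```python
one = 1
zero = 0

# Equal in value, but not the same object.
print(True == one, True is one)                      # True False
print(id(True) != id(one), id(False) != id(zero))    # True True

# CPython detail: small ints built at runtime come out of the preallocated
# cache, so two separately computed 100s should be the very same object.
a = int("100")
b = int("100")
print(a is b)
```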

BoolVsInt written at 00:27:09

2014-07-08

Exploring a surprise with equality in Python

Taken from @hackedy's tweet, here's an interesting Python surprise:

>>> {1: "one", True: "two"}
{1: 'two'}
>>> {0: "one", False: "two"}
{0: 'two'}

There are two things happening here to create this surprise. The starting point is this:

>>> print hash(1), hash(True)
1 1

At one level, Python has made True have the same hash value as 1. Actually that's not quite right, so let me show you the real story:

>>> isinstance(True, int)
True

Python has literally made bool, the type that True and False are instances of, be a subclass of int. They don't merely look like numbers, they are numbers. As numbers their hash identity is their literal value of 1 or 0, and of course they also compare equal to literal 1 or 0. Since they hash to the same identity and compare equal, we run into the issue with 'custom' equalities and hashes in dictionaries where Python considers the two different objects to be the same key and everything gets confusing.
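
This isn't special to bools; any two objects that hash the same and compare equal collapse into one dictionary key. A short Python 3 sketch (the dictionary names are mine) that shows the same thing happening with 1 and 1.0:

```python
print(issubclass(bool, int))               # True
print(hash(True) == hash(1), True == 1)    # True True
print(hash(1.0) == hash(1), 1.0 == 1)      # True True

# Same hash and equal means 'same key' as far as a dict is concerned.
float_case = {1: "int", 1.0: "float"}
print(float_case)                          # {1: 'float'}

bool_case = {1: "one", True: "two"}
print(bool_case)                           # {1: 'two'}
```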

(That True and False hash to the same thing as 1 and 0 is probably not a deliberate choice. The internal bool type doesn't have a custom hash function; it just tacitly inherits the hash function of its parent, ie int. I believe that Python could change this if it wanted to, which would make the surprise here go away.)
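
One rough way to poke at this from Python 3 rather than from the C source is to look at whether bool's own class dictionary carries a __hash__ entry, and to check that hashing a bool agrees with calling int's hash function on it directly. I'm assuming here that the class dictionary is a fair reflection of what the C type defines itself:

```python
# Does each type have its own __hash__ entry, as opposed to inheriting one?
print("__hash__" in vars(bool), "__hash__" in vars(int))

# Hashing a bool gives the same answer as explicitly using int's hash on it.
same_hash = all(hash(b) == int.__hash__(b) for b in (True, False))
print(same_hash)
```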

The other thing is what happens when you create a dictionary with literal syntax, which is that Python generates bytecode that stores each initial value into the dictionary one after another in the order that you wrote them. It turns out that when you do a redundant store into a dictionary (ie you store something for a key that already exists), Python only replaces the value, not both the key and the value. This is why the result is not '{True: 'two'}'; only the value got overwritten in the second store.

(This decision is a sensible one because it may avoid object churn and the overhead associated with it. If Python replaced the key as well it would at least be changing more reference counts on the key objects. And under normal circumstances you're never going to notice the difference unless you're going out of your way to look.)
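
You can see the 'only the value is replaced' behavior by doing the two stores by hand and then looking at the type of the key that survives. A small Python 3 sketch, with names of my own choosing:

```python
d = {}
d[1] = "one"       # first store creates the entry, with the int 1 as key
d[True] = "two"    # redundant store: same key as far as the dict can tell

(key,) = d.keys()
print(d, type(key).__name__)    # {1: 'two'} int

# Flip the order and the bool is the key that sticks around instead.
d2 = {True: "one", 1: "two"}
print(d2)                       # {True: 'two'}
```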

PS: It turns out that @hackedy beat me to discovering that bools are ints. Also the class documentation for bool says this explicitly (and notes that bool is one of the rare Python classes that can't be subclassed).

EqualityDictSurprise written at 23:57:23

2014-07-05

Another reason to use frameworks like Django

The traditional reason to use web app frameworks like Django is that doing so saves you time and perhaps gives you a more solid and polished result, possibly with useful extra features like Django's admin interface. But it has recently struck me that in many situations there is another interesting reason for using frameworks (or a defence of doing so instead of writing your own code).

Let's start by assuming that your application really needs at least some of the functionality you're using from the framework. For example, perhaps you're using the ORM and database functionality because that's what the framework makes easiest (this is our reason) but you really need the URL routing and HTML form handling and validation. Regardless of whether or not you used a framework, your application needs some code somewhere that does all of this necessary work. With a framework, the code mostly lives in the framework and you call the framework; without a framework, you would have to write your own code for it (and you use it directly). The practical reality is that the code for the functionality your application genuinely needs has to come from somewhere, either from a framework (if you use one) or from your own work and code.

If you write your own code, what are the odds that it will be as well documented and as solid as the code in a framework? Which will likely be easier for a co-worker to pick up later, custom code that you wrote from scratch or code that calls a standard framework in a standard or relatively standard way? If you only need a little bit of functionality and thus only need to write a little bit of code, this can certainly work out. But if you need a lot of functionality, so much that you're duplicating a lot of what a framework does, well, I am not so optimistic, because in effect what you're really doing is creating a custom one-off framework.

This suggests an obvious way to balance out whether or not to use a framework (or from some perspectives, to inflict either a framework or your own collection of code on your co-workers). To maximize the benefits of using a framework you should be writing as little of your own code as possible, talking to the framework in its standard way, and the framework needs to be well documented, because all of this plays to the strengths of the framework over your own code. If the framework is hard to pick up, your code to deal with it is complex, and replacing it would take only a modest amount of custom code, well, the case for your own code is strong.

(I'm not sure this way of thinking has anything to say about the ever popular arguments over minimal frameworks versus big frameworks with 'batteries included' and good PR. A big framework might be worse because it requires you to learn more before you can start using the corner of it you need, or it might be better because you need less custom code to connect various minimal components together. It certainly feels like how much of the framework you need ought to matter, but I'm not sure this intuition is correct.)

FrameworkUsageReason written at 01:31:39
