2014-07-09
What the differences are between Python bools and ints
I mentioned in the previous entry that Python's
bool class is actually a subclass of int (and the bool docstring
will tell you this if you bother to read it with help() before,
say, diving into the CPython source code like a system programmer). Since I was just looking at this,
I might as well write down the low-level differences between ints and
bools. Bools have:
- a custom __repr__ that reports True or False instead of the numeric value; this is also used as the custom __str__ for bool.
  (The code is careful to intern these strings so that no matter how many times you repr() or str() a boolean, only one copy of the literal 'True' or 'False' string will exist.)
- a __new__ that returns either the global True object or the global False object depending on the truth value of what it's given.
- custom functions for &, |, and ^ that implement boolean algebra instead of the standard bitwise operations if both arguments are either True or False. Note that eg 'True & 1' results in a bitwise operation and an int object, even though 1 is strongly equal to True.
That's it.
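As a quick illustration of those three differences, here is a small sketch that should behave the same in CPython 2 and 3. The variable names are just mine, picked for the example.

```python
# 1. repr() and str() report the name, not the numeric value.
names = (repr(True), str(False))        # ('True', 'False')
as_plain_int = int.__repr__(True)       # '1', what int's own repr would say

# 2. bool() (ie __new__) hands back one of the two global objects.
same_true = bool("anything") is True    # True
same_false = bool(0) is False           # True

# 3. &, | and ^ stay boolean only when both sides are bools.
both_bools = True & False               # False, a bool
mixed = True & 1                        # 1, a plain int
mixed_or = True | 2                     # 3, ordinary bitwise or

print(names, as_plain_int)
print(same_true, same_false)
print(type(both_bools).__name__, type(mixed).__name__, mixed_or)
```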
I'm not quite sure how bool blocks being subclassed and I'm not
curious enough right now to work it out.
Update: see the comments for the explanation.
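For what it's worth, the blocking itself is easy to observe from Python, whatever the mechanism is. A minimal sketch follows, with MyBool being a made-up name. As far as I know CPython decides this with the Py_TPFLAGS_BASETYPE type flag, which bool does not set, but take that part as my understanding rather than something checked here.

```python
# Trying to subclass bool fails when the class statement runs.
try:
    class MyBool(bool):     # hypothetical subclass, never actually created
        pass
    subclass_error = None
except TypeError as exc:
    subclass_error = exc

print(subclass_error)

# int itself is happily subclassable, so this is specific to bool.
class MyInt(int):           # hypothetical name again
    pass

print(MyInt(3) + 1)
```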
The global True and False objects are of course distinct from
what is in effect the global 0 and 1 objects that are all but
certain to exist. This means that their
id() is different (at least in CPython), since the id() is the
memory address of their C-level object struct.
(In modern versions of both CPython 2 and CPython 3 it turns out
that global 0 and 1 objects are guaranteed to exist, because
'small integers' from -5 up to and including 256 are actually
preallocated as the interpreter is initializing itself.)
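A small sketch of the distinction, assuming CPython. The equality and identity checks on True should hold anywhere, but the last line leans on the small integer cache, which is an implementation detail and not something the language promises.

```python
one = 1

equal = (True == one)               # True: they compare equal as numbers
same_object = (True is one)         # False: two distinct objects
same_id = (id(True) == id(one))     # False, for the same reason

print(equal, same_object, same_id)

# CPython-specific: small integers built at runtime come out of the
# preallocated cache, so two separate conversions give the same object.
a = int("256")
b = int("256")
print(a is b)
```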
2014-07-08
Exploring a surprise with equality in Python
Taken from @hackedy's tweet, here's an interesting Python surprise:
>>> {1: "one", True: "two"}
{1: 'two'}
>>> {0: "one", False: "two"}
{0: 'two'}
There are two things happening here to create this surprise. The starting point is this:
>>> print hash(1), hash(True)
1 1
At one level, Python has made True have the same hash value as 1. Actually that's not quite right,
so let me show you the real story:
>>> isinstance(True, int)
True
Python has literally made bool, the type that True and False
are instances of, be a subclass of int. They not merely look like
numbers, they are numbers. As numbers their hash identity is their literal value of 1 or 0, and
of course they also compare equal to literal 1 or 0. Since they
hash to the same identity and compare equal, we run into the issue
with 'custom' equalities and hashes in dictionaries
where Python considers the two different objects to be the same key
and everything gets confusing.
(That True and False hash to the same thing as 1 and 0 is
probably not a deliberate choice. The internal bool type doesn't
have a custom hash function; it just tacitly inherits the hash
function of its parent, ie int. I believe that Python could change
this if it wanted to, which would make the surprise here go away.)
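The two ingredients can be checked directly. This is only a sketch of the behaviour described above; looking in vars(bool) is my own way of showing that bool defines no __hash__ of its own and so falls back to int's.

```python
# Ingredient one: same hash as the matching integers.
same_hashes = (hash(True) == hash(1), hash(False) == hash(0))

# Ingredient two: equal to the matching integers.
same_values = (True == 1, False == 0)

# bool has no __hash__ entry of its own; lookup falls through to int.
bool_defines_hash = "__hash__" in vars(bool)

# Put together, a dictionary treats 1 and True as the same key.
d = {}
d[1] = "one"
found = True in d           # True
looked_up = d[True]         # 'one'

print(same_hashes, same_values, bool_defines_hash)
print(found, looked_up)
```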
The other thing is what happens when you create a dictionary with
literal syntax, which is that Python generates bytecode that stores
each initial value into the dictionary one after another in the
order that you wrote them. It turns out that when you do a redundant
store into a dictionary (ie you store something for a key that
already exists), Python only replaces the value, not both the key
and the value. This is why the result is not '{True: 'two'}';
only the value got overwritten in the second store.
(This decision is a sensible one because it may avoid object churn and the overhead associated with it. If Python replaced the key as well it would at least be changing more reference counts on the key objects. And under normal circumstances you're never going to notice the difference unless you're going out of your way to look.)
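Here is a sketch of the same effect written out as explicit stores, which is roughly what the literal syntax amounts to. The first key stored is the one that survives, so swapping the order swaps which object you see as the key.

```python
# Equivalent to {1: "one", True: "two"}, one store at a time.
d = {}
d[1] = "one"
d[True] = "two"             # redundant store: only the value changes

key = list(d)[0]
print(d)                    # {1: 'two'}
print(type(key).__name__)   # int, the original key object was kept

# Reversing the order keeps True as the key instead.
d2 = {True: "one", 1: "two"}
key2 = list(d2)[0]
print(d2)                   # {True: 'two'}
```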
PS: It turns out that @hackedy beat me to discovering that bools
are ints.
Also the class documentation for bool says this explicitly (and
notes that bool is one of the rare Python classes that can't be
subclassed).
2014-07-05
Another reason to use frameworks like Django
The traditional reason to use web app frameworks like Django is that doing so saves you time and perhaps gives you a more solid and polished result, possibly with useful extra features like Django's admin interface. But it has recently struck me that in many situations there is another interesting reason for using frameworks (or a defence of doing so instead of writing your own code).
Let's start by assuming that your application really needs at least some of the functionality you're using from the framework. For example, perhaps you're using the ORM and database functionality because that's what the framework makes easiest (this is our reason) but you really need the URL routing and HTML form handling and validation. Regardless of whether or not you used a framework, your application needs some code somewhere that does all of this necessary work. With a framework, the code mostly lives in the framework and you call the framework; without a framework, you would have to write your own code for it (and you use it directly). The practical reality is that the code for the functionality your application genuinely needs has to come from somewhere, either from a framework (if you use one) or from your own work and code.
If you write your own code, what are the odds that it will be as well documented and as solid as the code in a framework? Which will likely be easier for a co-worker to pick up later, custom code that you wrote from scratch or code that calls a standard framework in a standard or relatively standard way? If you only need a little bit of functionality and thus only need to write a little bit of code, this can certainly work out. But if you need a lot of functionality, so much that you're duplicating a lot of what a framework does, well, I am not so optimistic, because in effect what you're really doing is creating a custom one-off framework.
This suggests an obvious way to balance out whether or not to use a framework (or from some perspectives, to inflict either a framework or your own collection of code on your co-workers). To maximize the benefits of using a framework you should be writing as little of your own code as possible, talking to the framework in its standard way, and the framework needs to be well documented, because all of this plays to the strengths of the framework over your own code. If the framework is hard to pick up, your code to deal with it is complex, and replacing it would only be a modest amount of custom code, well, the case for your own code is strong.
(I'm not sure this way of thinking has anything to say about the ever popular arguments over minimal frameworks versus big frameworks with 'batteries included' and good PR. A big framework might be worse because it requires you to learn more before you can start using the corner of it you need, or it might be better because you need less custom code to connect various minimal components together. It certainly feels like how much of the framework you need ought to matter, but I'm not sure this intuition is correct.)