Wandering Thoughts archives

2011-06-12

Noting some wandering thoughts on the occasion of an anniversary

I'm a peculiar person, and so although I wrote my first entries here on June 11th, I consider June 12th to be WanderingThoughts' real anniversary because it's when I decided to do it again and thus made the first day not a one-off thing. That was six years ago, which in some ways really boggles me.

I'm not someone for anniversaries; I wrote something for the first anniversary, but never bothered noting the others (or other significant numbers, like writing 1,000 entries or 2,000 entries).

I'm a different person than I was six years ago, but then almost everyone is. I don't know how much WanderingThoughts has contributed to the changes, though; most of what I think of as the big changes have happened outside it and have had nothing to do with it. Still, I probably write better and clearer, I'm pretty sure that I both type and write faster than I used to, and I've probably benefited from thinking through various things in the process of writing entries.

And it's been fun (well, enjoyable). That's really why I've done it for six years and counting, that and the fact that I'm terribly stubborn and very set in my habits once they've been established.

Consider this a marker, dropped in the stream of time. I was here, now, looking both forward and backwards.

(If nothing else, looking backwards is one benefit of blogs.)

AnniversarySix written at 20:46:18; Add Comment

Why [], {}, and () are cheap in Python

Here's a trick question: are the following two lines semantically equivalent?

a = []
a = list()

In a normal Python environment they certainly have the same effect, but the answer is that no, they are not semantically equivalent because there is always the possibility that the name list has been rebound in the current namespace. As a result, when Python executes the second line it's obliged to go through the bother of looking up list's name to determine that it actually is the built-in. By contrast, [] here can only ever mean one thing in Python; its semantics are fixed and so CPython is free to fast-path things based on that.

(And it does. [] compiles straight down to a specific bytecode operation, BUILD_LIST. list() compiles to a lookup and a function call.)

That's half of the reason that [] is a cheap operation. The other half is that CPython keeps a stash of preallocated list objects ready to be reused. As a result, CPython needs to do no memory allocations if it needs an empty list when this stash is not empty.

(For a non-empty list, it needs to allocate and zero the memory block for the array of pointers to list items.)

Dictionaries have similar optimizations, although the dictionary setup code has slightly more work to do in order to give you a valid empty dictionary.

Tuple creation has even more optimization. CPython keeps a much larger stash of various sized tuples so that it can almost certainly immediately give you, eg, a 4-element tuple without any fuss. Since tuples are immutable, () can be a very special case:

>>> () is ()
True

In other words, there is only one empty tuple in CPython. Once something creates it, it will live forever. Giving you an empty tuple is as simple as increasing this one empty tuple's reference count and returning it.

(This entry was sparked by a recent comment on this entry; it caused me to get curious about just how efficient things like '[]' were. I had assumed that CPython had optimized them, but why assume when I could go find out.)

Sidebar: how much this matters

; python -m timeit 'a = ()'
10000000 loops, best of 3: 0.0315 usec per loop
; python -m timeit 'a = []'
10000000 loops, best of 3: 0.0573 usec per loop
; python -m timeit 'a = list()'
1000000 loops, best of 3: 0.198 usec per loop

And here's some more, with some interesting results:

; python -m timeit -s 'l = list' 'a = l()'
10000000 loops, best of 3: 0.176 usec per loop
; python -m timeit -s 'class A(object): pass' 'a = A()'
10000000 loops, best of 3: 0.16 usec per loop

If you give the A class a do-nothing __init__, it slows down to around 0.38 usec per loop. I have no idea why 'A()' is so fast without an __init__; before I measured, I would have expected it to be clearly slower than list().

(This is all from a 64-bit Fedora 14 machine with Python 2.7.)

PS: this is where Zed Shaw gets very irritated with me, so don't bet the farm on these timing results.

python/CheapListDictTupleCreation written at 01:13:05; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.