[] and () are cheap in Python
Here's a trick question: are the following two lines semantically equivalent?
a = []
a = list()
In a normal Python environment they certainly have the same effect, but
the answer is that no, they are not semantically equivalent because
there is always the possibility that the name
list has been rebound
in the current namespace. As a result, when Python executes the second
line it's obliged to go through the bother of looking up list
to determine that it actually is the built-in. By contrast, []
can only ever mean one thing in Python; its semantics are fixed and so
CPython is free to fast-path things based on that.
(And it does. [] compiles straight down to a specific bytecode;
list() compiles to a lookup and a function call.)
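
A minimal sketch of both points, assuming a reasonably modern CPython 3 (the timings in this entry are from Python 2.7, and exact opcode names differ between versions). The helper names opnames and shadowed are made up for illustration; the rebinding inside shadowed is deliberately silly.

```python
import dis


def opnames(source):
    """Compile one statement and return the names of its bytecode instructions."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


def shadowed():
    # Hypothetical rebinding, for illustration only: inside this function
    # the name 'list' no longer refers to the built-in type.
    list = lambda: "not a list"
    # The [] literal is unaffected; the call goes to whatever 'list' is now.
    return [], list()


literal_ops = opnames("a = []")      # expect a BUILD_LIST instruction
call_ops = opnames("a = list()")     # expect a name lookup plus a call

print(literal_ops)
print(call_ops)
print(shadowed())
```

The literal never needs a name lookup, which is why rebinding list cannot change what [] means.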
That's half of the reason that [] is a cheap operation. The other half
is that CPython keeps a stash of preallocated list objects ready to be
reused. As a result, if CPython needs an empty list and this stash is not
empty, it does not have to allocate any memory at all.
(For a non-empty list, it needs to allocate and zero the memory block for the array of pointers to list items.)
Dictionaries have similar optimizations, although the dictionary setup code has slightly more work to do in order to give you a valid empty dictionary.
Tuple creation has even more optimization. CPython keeps a much larger
stash of tuples of various sizes so that it can almost
certainly immediately give you, e.g., a 4-element tuple without any fuss.
Since tuples are immutable,
() can be a very special case:
>>> () is ()
True
In other words, there is only one empty tuple in CPython. Once something creates it, it will live forever. Giving you an empty tuple is as simple as increasing this one empty tuple's reference count and returning it.
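
Here is a small sketch of the same check done through variables rather than literals (newer Python 3 versions warn about using 'is' directly on a literal). This is CPython-specific behaviour, not a language guarantee, and same_object is a made-up helper name.

```python
def same_object(make):
    """Return True if two separate calls to make() yield the very same object."""
    first = make()
    second = make()
    # Both results are still alive here, so identity is a fair comparison.
    return first is second


# tuple() with no arguments produces an empty tuple; list() an empty list.
empty_tuple_shared = same_object(tuple)
empty_list_shared = same_object(list)

print("empty tuples shared:", empty_tuple_shared)
print("empty lists shared:", empty_list_shared)
```

Lists are mutable, so each empty list has to be a distinct object even if its memory came out of the stash.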
(This entry was sparked by a recent comment on this entry; it caused me to get curious about just how
efficient things like '[]' were. I had assumed that CPython had
optimized them, but why assume when I could go find out.)
Sidebar: how much this matters
; python -m timeit 'a = ()'
10000000 loops, best of 3: 0.0315 usec per loop
; python -m timeit 'a = []'
10000000 loops, best of 3: 0.0573 usec per loop
; python -m timeit 'a = list()'
1000000 loops, best of 3: 0.198 usec per loop
And here's some more, with some interesting results:
; python -m timeit -s 'l = list' 'a = l()'
10000000 loops, best of 3: 0.176 usec per loop
; python -m timeit -s 'class A(object): pass' 'a = A()'
10000000 loops, best of 3: 0.16 usec per loop
If you give the A class a do-nothing __init__, it slows down to
around 0.38 usec per loop. I have no idea why 'A()' is so fast without
__init__; before I measured, I would have expected it to be
clearly slower than list().
(This is all from a 64-bit Fedora 14 machine with Python 2.7.)
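
If you want to rerun these measurements on your own machine, here is a rough sketch using the timeit module from inside Python instead of the command line. The STATEMENTS table and the usec_per_loop helper are my own names, the loop count is kept deliberately small, and the numbers you get will not match the ones above.

```python
import timeit

# Statement to time -> setup code, mirroring the command lines above.
STATEMENTS = {
    "a = ()": "pass",
    "a = []": "pass",
    "a = list()": "pass",
    "a = l()": "l = list",
    "a = A()": "class A(object): pass",
}


def usec_per_loop(stmt, setup="pass", number=200000, repeat=3):
    """Best-of-'repeat' time for one execution of stmt, in microseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


results = {stmt: usec_per_loop(stmt, setup) for stmt, setup in STATEMENTS.items()}

for stmt, usec in results.items():
    print("%-12s %.4f usec per loop" % (stmt, usec))
```

Taking the minimum of several repeats is the same "best of 3" approach that the command line tool reports.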
PS: this is where Zed Shaw gets very irritated with me, so don't bet the farm on these timing results.