The importance of understanding Python's implementation
Joel Spolsky likes to talk about leaky abstractions, how the lower level details underneath abstractions inevitably leak through and affect your work with those abstractions. High level languages like Python are no exception, because how their abstractions are implemented has significant effects on how fast different sorts of code will run.
For instance, let's take lists and tuples, both of which CPython implements as arrays of objects. This has a number of consequences:
- indexing a list element is cheap, and all list elements are equally cheap to access.
- you always know the length of lists.
- popping the last element off is cheap.
- popping the first element is expensive, because you have to slide everything down.
- similarly, adding an element at the end is cheap but adding it at the front is expensive.
- concatenating lists is not cheap, because it requires copying one list onto the end of the other.
Other languages use different approaches, with different consequences; for example, Lisp uses singly-linked lists, making the front of the list the cheapest place to access and to add and remove elements. This matters when implementing things on top of lists; a Python stack should have the top of the stack at the end of the list, while a Lisp stack should have it at the start.
A Python stack implemented the 'wrong' way will have significantly worse performance, and there is very little in the language documentation to tell you which is wrong and which is right; you need to know something about how Python implements lists. (Perhaps you can guess it from Python having array indexing of lists.)
(My conscious awareness of this issue owes a significant debt to the discussion in the comments on my 'Minimizing object churn' entry.)
|
|