My view of the right way to copy lists in Python
I recently wound up reading Python: copying a list the right way (it's old, but it made
it onto places like Hacker News recently); in it, the author argues that
the clearer way to copy a list in Python is not 'new = old[:]' but
'new = list(old)'. In thinking about it since then, I've wound up more or
less disagreeing with this.
It's true that the effects of old[:]
are not immediately obvious
(although this idiom has the advantage that it looks clearly magical;
if you don't understand it, you know that you don't understand
it). But I maintain that list(old)
is also magical in an important
but less obvious way: it's not obvious that list()
returns a
copy of the list.
One of the understandings of list()
that I think you can reasonably
have is that it's a function that turns things into lists, if the thing
is plausibly list-like in the first place (otherwise it fails). In this
view, if you call list()
on something that's already a list it's
perfectly sensible for list()
to return it as-is; you asked list()
to make it into a list, and list()
is saying 'here you go, it already
is'. In fact you may want this behavior.
If you are just learning Python, it is far from obvious that this
understanding of list() is in fact wrong and that list(old) is guaranteed
to make a copy of old. Further, there is no straightforward semantic
reason that list() returns a copy; it does so ultimately because that's
how it's documented to work. If 'list()' is an 'intuitive' way of making
copies of lists, it's a misleading intuition. If you relied on it,
you've just gotten lucky.
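To make the documented behavior concrete, here is a minimal sketch (the variable names are just for illustration) showing that both spellings hand back a new list that is equal to the original but is not the same object:

```python
old = [1, 2, 3]

by_call = list(old)   # documented to build a new list from its argument
by_slice = old[:]     # a full slice of a list also builds a new list

# Both results compare equal to the original...
same_contents = (by_call == old) and (by_slice == old)

# ...but neither one is the original object.
call_is_new = by_call is not old
slice_is_new = by_slice is not old

# Changing a copy leaves the original alone.
by_call.append(4)
length_of_old = len(old)
```

Nothing in the spelling of either expression tells you this; you have to already know it.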
(Note that there are list()-like things which do not make copies.)
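As an illustration (these may not be the particular things the note above has in mind), some other built-in types hand back their argument unchanged when it is already an instance of the type. The identity results below are what CPython does for tuples and strings; treat them as an implementation behavior rather than a language guarantee.

```python
a_tuple = (1, 2, 3)
a_str = "hello"
a_list = [1, 2, 3]

# In CPython, these calls return the very object they were given.
tuple_same = tuple(a_tuple) is a_tuple
str_same = str(a_str) is a_str

# list() is the odd one out: it always builds a new list.
list_same = list(a_list) is a_list
```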
I'm not going to claim that old[:] is any more intuitive a way of making
copies than list(old). I am just going to claim that it is not any less
intuitive; in either case, you really have to learn this as an idiom. I
personally prefer old[:], both because it looks more clearly magical and
because I think it signals more strongly that a copy is being made (since
slicing generally has to make new things).
(Note that if you have an arbitrary indexable object, there is no strong
semantic requirement that old[:] not return old (and in fact tuples
behave this way). There probably is a cultural expectation that it not
do so if old is a mutable object, but that's because 'old[:]' has become
the general Python idiom for 'make me a copy of this indexable thing'.)
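A small sketch of that point: the tuple result is CPython behavior, and the Lazy class is a hypothetical one made up purely for this example to show that nothing stops an indexable object from returning itself for a full slice.

```python
class Lazy:
    """Hypothetical indexable class, invented for this example."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        # obj[:] arrives here as slice(None, None, None).
        if index == slice(None, None, None):
            return self          # perfectly legal: no copy is made
        return self._items[index]


a_tuple = (1, 2, 3)
a_list = [1, 2, 3]
lazy = Lazy([1, 2, 3])

tuple_slice_same = a_tuple[:] is a_tuple   # True in CPython
list_slice_same = a_list[:] is a_list      # False: lists copy
lazy_slice_same = lazy[:] is lazy          # True: our class chose not to copy
```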
Sidebar: a slight dive into the semantics of list()
If you know Python at a decent level, some of what I wrote above
is making you frown. This is because list()
is not actually a
function; instead it is a class aka a type. Once you know that
list()
is actually a type, you have a much stronger case for
expecting list(old)
to return a copy; all classes/types return
new instances when called (unless they go well out of their way to
do otherwise).
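Both halves of this can be sketched briefly. The Passthrough class below is hypothetical, written only to show what 'going out of their way' looks like: it overrides __new__ so that calling the class with an existing instance returns that instance instead of a new one.

```python
# list is a class, so calling it is instantiating it.
list_is_a_type = isinstance(list, type)

old = [1, 2, 3]
new_instance = list(old) is not old   # ordinary class behavior: a new object


class Passthrough:
    """Hypothetical class that returns existing instances as-is."""

    def __new__(cls, value=None):
        if isinstance(value, cls):
            return value              # deliberately skip making a new object
        self = super().__new__(cls)
        self.value = value
        return self


p = Passthrough(42)
returns_existing = Passthrough(p) is p
makes_new_otherwise = Passthrough(42) is not p
```

The default is the list() behavior; getting the Passthrough behavior takes deliberate extra code.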
Since this behavior is strongly embedded in what classes do and how they operate, I expect that people do assume as a matter of course that all classes return new instances even if they're called (validly) with existing instances of the class as initializers. The times when people are okay with some variant of singletons are when this creates almost no semantic difference, ie when the class instances are immutable once created.
(And now I think I have wandered far enough down this particular rathole for one entry.)