My view of the right way to copy lists in Python

November 11, 2011

I recently wound up reading 'Python: copying a list the right way' (it's old, but it recently made the rounds on places like Hacker News). In it, the author argues that the clearer way to copy a list in Python is not 'new = old[:]' but 'new = list(old)'. Having thought about it since then, I've wound up more or less disagreeing with this.

It's true that the effects of old[:] are not immediately obvious (although this idiom has the advantage that it looks clearly magical; if you don't understand it, you know that you don't understand it). But I maintain that list(old) is also magical in an important but less obvious way: it's not obvious that list() returns a copy of the list.

One of the understandings of list() that I think you can reasonably have is that it's a function that turns things into lists, if the thing is plausibly list-like in the first place (otherwise it fails). In this view, if you call list() on something that's already a list it's perfectly sensible for list() to return it as-is; you asked list() to make it into a list, and list() is saying 'here you go, it already is'. In fact you may want this behavior.
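To make that mental model concrete, here is a minimal sketch of a helper that behaves the way a newcomer might guess list() does. The name as_list is purely hypothetical (I made it up for illustration; nothing like it is built into Python):

```python
def as_list(thing):
    """Hypothetical 'make this a list' helper that returns lists untouched."""
    if isinstance(thing, list):
        # Already a list, so hand back the very same object (no copy).
        return thing
    # Anything else iterable gets converted; non-iterables raise TypeError.
    return list(thing)


old = [1, 2, 3]
same = as_list(old)
same.append(4)  # this also changes old, because no copy was made
```

With a helper like this, 'same' and 'old' are one object, which is exactly the behavior that the real list() does not have.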

If you are just learning Python, it is far from obvious that this understanding of list() is in fact wrong and that list(old) is guaranteed to make a copy of old. Further, there is no straightforward semantic reason that list() returns a copy; it does so ultimately because that's how it's documented to work. If list() is an 'intuitive' way of making copies of lists, it's a misleading intuition. If you relied on it without knowing the documented behavior, you just got lucky.
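For the record, here is a short sketch of what actually happens with both idioms (the variable names are just for illustration):

```python
old = [1, 2, 3]

by_call = list(old)   # documented to build a new list from old's items
by_slice = old[:]     # a full slice of a list is also a new list

# Changing the copies leaves the original alone.
by_call.append(4)
by_slice.append(5)

print(old, by_call, by_slice)
```

Both spellings give you a distinct list object; neither one makes that obvious just from reading it.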

(Note that there are list()-like things which do not make copies.)
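One illustration of this, assuming CPython (I'm relying on implementation behavior here, not on anything the language promises for every Python): calling tuple(), str(), or frozenset() on an exact instance of that type hands back the same object instead of a copy.

```python
t = (1, 2, 3)
s = "hello"
fs = frozenset({1, 2})
l = [1, 2, 3]

# On CPython these constructors return their argument unchanged when it is
# already an exact instance of the type (an implementation detail).
same_tuple = tuple(t) is t
same_str = str(s) is s
same_frozenset = frozenset(fs) is fs

# list() is the odd one out: it always builds a new list.
same_list = list(l) is l

print(same_tuple, same_str, same_frozenset, same_list)
```

All of these types are immutable, so you can't normally tell the difference, but it shows that 'calling the type on an instance' does not automatically mean 'copy'.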

I'm not going to claim that old[:] is any more intuitive a way of making copies than list(old). I am just going to claim that it is not any less intuitive; in either case, you really have to learn this as an idiom. I personally prefer old[:], both because it looks more clearly magical and because I think it signals more strongly that a copy is being made (since slicing generally has to make new things).

(Note that if you have an arbitrary indexable object, there is no strong semantic requirement that old[:] not return old (and in fact tuples behave this way). There probably is a cultural expectation that it not do so if old is a mutable object, but that's because 'old[:]' has become the general Python idiom for 'make me a copy of this indexable thing'.)
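A quick sketch of the tuple case, again assuming CPython (returning the same tuple is something the implementation is free to do, not something I'd count on as a language guarantee):

```python
t = (1, 2, 3)
l = [1, 2, 3]

# A full slice of an (immutable) tuple can simply be the tuple itself.
tuple_slice_is_same = t[:] is t

# A full slice of a (mutable) list is always a new list object.
list_slice_is_same = l[:] is l

print(tuple_slice_is_same, list_slice_is_same)
```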

Sidebar: a slight dive into the semantics of list()

If you know Python at a decent level, some of what I wrote above is making you frown. This is because list() is not actually a function; instead, list is a class (aka a type). Once you know that list is actually a type, you have a much stronger case for expecting list(old) to return a copy; all classes/types return new instances when called (unless they go well out of their way to do otherwise).
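Here is a minimal sketch of both halves of that claim. The Reused class is hypothetical (my own invention for illustration); it shows roughly what 'going well out of their way' looks like, by overriding __new__ to hand back an existing instance:

```python
# list is a class, so calling it normally creates a new instance.
list_is_a_type = isinstance(list, type)
old = [1, 2, 3]
new = list(old)


class Reused:
    """Hypothetical class that returns an existing instance when given one."""

    def __new__(cls, value):
        if isinstance(value, cls):
            # Deliberately skip creating a new object.
            return value
        return super().__new__(cls)

    def __init__(self, value):
        # __init__ still runs on whatever __new__ returned, so avoid
        # re-initializing an instance that was passed in as itself.
        if value is not self:
            self.value = value


first = Reused(10)
second = Reused(first)  # same object as first, not a new instance
```

The extra care needed in __init__ is a hint at why ordinary classes don't do this.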

Since this behavior is strongly embedded in what classes do and how they operate, I expect that people do assume as a matter of course that all classes return new instances, even if they're called (validly) with existing instances of the class as initializers. The times when people are okay with some variant of singletons are when this creates almost no semantic difference, i.e. when the class instances are immutable once created.

(And now I think I have wandered far enough down this particular rathole for one entry.)
