My view of the right way to copy lists in Python

November 11, 2011

I recently wound up reading 'Python: copying a list the right way' (it's old, but it recently showed up on places like Hacker News); in it, its author argues that the clearer way to copy a list in Python is not 'new = old[:]' but 'new = list(old)'. In thinking about it since then, I've wound up more or less disagreeing with this.

It's true that the effects of old[:] are not immediately obvious (although this idiom has the advantage that it looks clearly magical; if you don't understand it, you know that you don't understand it). But I maintain that list(old) is also magical in an important but less obvious way: it's not obvious that list() returns a copy of the list.

One of the understandings of list() that I think you can reasonably have is that it's a function that turns things into lists, if the thing is plausibly list-like in the first place (otherwise it fails). In this view, if you call list() on something that's already a list it's perfectly sensible for list() to return it as-is; you asked list() to make it into a list, and list() is saying 'here you go, it already is'. In fact you may want this behavior.
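To make this reading concrete, here is a minimal sketch of a function that behaves the way this view imagines. The name as_list is made up for illustration; it is not part of Python, and it is deliberately not how the real list() behaves.

```python
def as_list(thing):
    """Hypothetical helper: hand back thing unchanged if it is already a list."""
    if isinstance(thing, list):
        return thing          # already a list, so return it as-is (no copy)
    return list(thing)        # otherwise build a new list from the iterable


old = [1, 2, 3]
same = as_list(old)
same.append(4)                # no copy was made, so this changes old too
print(same is old, old)       # True [1, 2, 3, 4]

converted = as_list((1, 2))   # a tuple really does get converted
print(converted)              # [1, 2]
```

A helper like this is what you would want if you only care that the result is a list and would rather avoid an unnecessary copy.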

If you are just learning Python, it is far from obvious that this understanding of list() is in fact wrong and that list() is guaranteed to make a copy of old. Further, there is no straightforward semantic reason that list() returns a copy; it does so ultimately because that's how it's documented to work. If list() is an 'intuitive' way of making copies of lists, it's a misleading intuition. If you relied on it, you've just gotten lucky.
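For reference, a short sketch of what both idioms actually do. The variable names are only for illustration; the one thing worth noticing beyond 'it is a copy' is that both copies are shallow.

```python
old = [1, 2, [3, 4]]

by_call = list(old)
by_slice = old[:]

# Both results are equal to old but are distinct list objects.
print(by_call == old, by_call is old)     # True False
print(by_slice == old, by_slice is old)   # True False

# Changing the top level of a copy leaves the original alone...
by_call.append(5)
print(old)                                # [1, 2, [3, 4]]

# ...but both copies are shallow, so nested objects are still shared.
by_slice[2].append(99)
print(old)                                # [1, 2, [3, 4, 99]]
```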

(Note that there are list()-like things which do not make copies.)
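A few candidates that fit this description, as a hedged sketch. The tuple() and frozenset() results below are what CPython does as an implementation detail for exact instances of those types, not something the language promises; the iter() result follows from the iterator protocol.

```python
t = (1, 2, 3)
fs = frozenset(t)
it = iter([1, 2, 3])

# In CPython, calling an immutable built-in type on an exact instance of
# itself hands back the same object rather than a copy. This is an
# implementation detail, not a language guarantee.
print(tuple(t) is t)
print(frozenset(fs) is fs)

# iter() on an iterator returns that same iterator, because an iterator's
# __iter__() is required to return the iterator itself.
print(iter(it) is it)         # True
```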

I'm not going to claim that old[:] is any more intuitive a way of making copies than list(old). I am just going to claim that it is not any less intuitive; in either case, you really have to learn this as an idiom. I personally prefer old[:], both because it looks more clearly magical and because I think it signals more strongly that a copy is being made (since slicing generally has to make new things).

(Note that if you have an arbitrary indexable object, there is no strong semantic requirement that old[:] not return old (and in fact tuples behave this way). There probably is a cultural expectation that it not do so if old is a mutable object, but that's because 'old[:]' has become the general Python idiom for 'make me a copy of this indexable thing'.)
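A small sketch of both halves of this note. The tuple result is CPython behavior rather than a documented guarantee, and LazySeq is a made-up class that exists only to show that nothing stops an indexable object from returning itself for a full slice.

```python
t = (1, 2, 3)
l = [1, 2, 3]
print(t[:] is t)   # True in CPython: the tuple itself comes back
print(l[:] is l)   # False: lists always give you a new list


class LazySeq:
    """Hypothetical indexable wrapper that skips copying on a full slice."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            if index == slice(None, None, None):
                return self                      # seq[:] returns seq itself
            return LazySeq(self._items[index])   # partial slices make a new object
        return self._items[index]


seq = LazySeq([10, 20, 30])
print(seq[:] is seq)    # True
print(seq[1])           # 20
```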

Sidebar: a slight dive into the semantics of list()

If you know Python at a decent level, some of what I wrote above is making you frown. This is because list() is not actually a function; instead it is a class aka a type. Once you know that list() is actually a type, you have a much stronger case for expecting list(old) to return a copy; all classes/types return new instances when called (unless they go well out of their way to do otherwise).
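This is easy to check interactively. In the sketch below, Box is a made-up class used only to show the ordinary 'calling a class makes a new instance' behavior next to list.

```python
# list is a class, i.e. an instance of type, not a plain function.
print(type(list))               # <class 'type'>
print(isinstance(list, type))   # True


class Box:
    """Hypothetical ordinary class, used only for comparison."""

    def __init__(self, contents):
        self.contents = contents


# Calling an ordinary class always produces a fresh instance...
a = Box(1)
b = Box(1)
print(a is b)                   # False

# ...and list behaves the same way when called with an existing list.
old = [1, 2, 3]
new = list(old)
print(type(new) is list, new is old)   # True False
```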

Since this behavior is strongly embedded in what classes do and how they operate, I expect that people do assume as a matter of course that all classes return new instances even if they're called (validly) with existing instances of the class as initializers. The times when people are okay with some variant of singletons are when this creates almost no semantic difference, ie when the class instances are immutable once created.
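As a sketch of what 'going well out of their way' can look like, here is a made-up class that overrides __new__ to hand back an existing instance instead of building a new one. Celsius and its degrees attribute are hypothetical, and the class is only meant to be treated as immutable; nothing here enforces that.

```python
class Celsius:
    """Hypothetical value type that reuses an existing instance when given one."""

    __slots__ = ("degrees",)

    def __new__(cls, value):
        if type(value) is cls:
            return value                  # already a Celsius: return it as-is
        self = super().__new__(cls)
        self.degrees = float(value)       # set once here, then left alone
        return self


a = Celsius(21.5)
b = Celsius(a)            # called with an existing instance
c = Celsius(21.5)         # called with a plain number

print(b is a)             # True: no new instance was made
print(c is a)             # False: a separate instance with the same value
print(c.degrees)          # 21.5
```

Because instances are never changed after creation, code using Celsius cannot easily tell whether it got a new object or an old one, which is the 'almost no semantic difference' case.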

(And now I think I have wandered far enough down this particular rathole for one entry.)


Comments on this page:

From 213.152.246.251 at 2011-11-11 03:11:54:

Personally I disagree. I'm pretty new to Python, but to me, list() looks like an int() or float() conversion. If a list goes in I'd fully expect a list to come out, just as I'd expect int(5) to be a numeric 5.

On the other hand [:] is pretty confusing. Once I'd read what it did it's now not much of a problem, but speaking as a beginner, list() is definitely the more obvious in my mind.

From 82.247.112.90 at 2011-11-11 04:06:29:

Chris, I think that your post summarizes the situation extremely well. I completely adhere to your remarks. Only one thing makes me uneasy with the `list(old)` version: `list` would ideally be spelled `List`, like most class names (if PEP 8 is followed): this would make more explicit the fact that `list` is a class and not a function. I would be happy with `List(old)`, while `list(old)` indeed raises the question of whether `list` is a function or a class.

By cks at 2011-11-15 12:23:54:

I agree that it's clear that list() gives you a list back; as you say, it's just like an int() or float() conversion. Where I think it's not clear is whether or not you're sure to get a different list back.

(You are, but that's because it's documented that way. People who are just starting probably do not read 'help(list)' carefully.)

Written on 11 November 2011.

Last modified: Fri Nov 11 01:01:10 2011