2009-03-09
What list methods don't make sense for heterogeneous lists
If one is going to use lists for heterogeneous data (per yesterday's entry), it makes sense to ask what list methods don't make sense any more. Opinions will probably differ, but here is my take on it.
First, I think that we can skip all methods that are common between tuples and lists; if tuples have them, they are presumably considered fine for heterogeneous data. Looking at what remains, I see:
.sort() clearly makes no sense; there is no real ordering among heterogeneous elements.
.reverse() doesn't make much sense to me, because if you have heterogeneous data I tend to think that the order of the elements is important.
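As a rough illustration of the .sort() point, here is a minimal sketch that assumes Python 3, where the interpreter itself refuses to order unrelated types. The record and its contents are made up for the example; older Python 2 versions would instead produce an arbitrary but consistent ordering.

```python
# A hypothetical heterogeneous record: a name, an age, and a list of tags.
record = ["alice", 30, ["admin", "staff"]]

try:
    record.sort()
    outcome = "sorted"
except TypeError:
    # Python 3 has no default ordering between str, int and list.
    outcome = "TypeError"

print(outcome)
```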
I'm unlikely to use .index(), .insert(), .remove(), or list multiplication, but I'm not sure they'll never make sense for some ways of building and manipulating heterogeneous lists. The same is true for .append() and .extend(), and I actually use them in situations where I accumulate elements instead of creating the list in one big bang.
In thinking about this, I've come to the obvious realization that there are two sorts of heterogeneous lists. Sometimes, nominally heterogeneous lists actually contain conceptually homogeneous data that is just most conveniently represented in different Python types (or, to put it the other way, that you have not bothered to create a class to encapsulate). For instance, in processing a language you might have a list of parser nodes or lexer tokens that have varying representations; at a mechanical level the list is heterogeneous, but at a conceptual level everything in it is the same sort of thing.
With this sort of conceptually homogeneous list, you can use all of the list methods (even .sort(), with a custom comparison function) and have them all make sense, even though in some sense you are mingling apples and oranges.
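A sketch of what this might look like, assuming Python 3 and an invented token representation (bare strings for punctuation, tuples for tokens that carry a value). Python 3 dropped comparison functions for sorting, so this uses a key function instead; token_kind is a hypothetical helper, not anything from a real lexer.

```python
# Hypothetical lexer tokens: mechanically a mix of str and tuple,
# conceptually all "tokens".
tokens = [("NUMBER", 42), "(", ("IDENT", "x"), ")", ("NUMBER", 7)]

def token_kind(tok):
    """Return the kind name for either representation."""
    return tok if isinstance(tok, str) else tok[0]

# Sorting works because the key maps every token to a plain string.
by_kind = sorted(tokens, key=token_kind)
print(by_kind)
```

Because the key function gives every element a comparable stand-in, the sort is well defined even though the underlying Python types differ.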
Sidebar: finding all of the list-only methods
Here is yet another appreciation of Python's introspection abilities. I decided that I wanted to know the methods that lists didn't share with tuples, so:
    t = tuple()
    l = list()
    s1 = set(dir(t))
    s2 = set(dir(l))
    # Names that lists have and tuples don't.
    l2 = list(s2 - s1)
    l2.sort()
    print(l2)
I used actual instances as a precaution, but some experimentation shows that I didn't need to; you get the same result if you take the dir() of list and tuple directly.
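The equivalence can be checked with a short sketch like the following, assuming Python 3. The exact set of list-only names depends on the Python version, so the output is deliberately not shown here.

```python
# List-only attribute names, computed from the types themselves...
from_types = sorted(set(dir(list)) - set(dir(tuple)))

# ...and from empty instances, as in the code above.
from_instances = sorted(set(dir(list())) - set(dir(tuple())))

same = from_types == from_instances
print(same)
print(from_types)
```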
(Updated: fixed the code to actually sort the list, as pointed out by a commentator. Whoops.)