The periodic strangeness of idiomatic Python

July 29, 2012

Suppose that you want to do something N times, for whatever reason. In C, the straightforward and idiomatic way to do this is a for loop; 'for (i = 0; i < times; i++) { .... }'. Since Python doesn't have this form of a for loop, the Python equivalent is a while loop. However, many people would probably say that this isn't idiomatic Python. What I think of as the idiomatic Python way to do 'do something N times' is:

for _ in range(0, times):
  ....

(Some people will use xrange() instead of range() here.)
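For comparison, here is a minimal sketch of the two forms side by side. It is written for Python 3, where range() is lazy in the way xrange() was in Python 2; the function names and the `action` callback are made up for illustration and aren't from any real code.

```python
def repeat_with_while(times, action):
    """C-style counting: initialize, test, and increment on separate lines."""
    i = 0
    while i < times:
        action()
        i += 1


def repeat_with_for(times, action):
    """The range() idiom: the loop variable exists but is never read."""
    for _ in range(0, times):
        action()


calls = []
repeat_with_while(3, lambda: calls.append("while"))
repeat_with_for(3, lambda: calls.append("for"))
print(calls)
```

Both functions call `action` exactly `times` times; the difference is only in how much of the counting is visible.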

This is certainly what instantly popped into my head when I ran into this situation recently and at first I didn't think any more of it. But once I began actually looking at this it started getting stranger and stranger, less like a clear language idiom and much more like a convention. Let me run down a number of the ways that this is strange:

  • It's a rather indirect way of expressing 'do something N times'. The C for loop is pretty direct by contrast.

    (With that said, I'm not sure a while loop would be that much more direct. The directness advantage that C has is that all parts of the for loop's control are there in one chunk; a while loop spreads them out in three different lines.)

  • We're doing things in this odd way partly to use as many builtins as possible, often in the name of (nominal) efficiency. Yes, this avoids a couple of extra lines to initialize and increment an otherwise unused counter, but I don't think that really makes it clearer.
  • In the pursuit of this idiom we're creating a list or at least an iterator and walking it, throwing away the result. In many languages this would be wince-inducingly inefficient (or at least much worse than basic integer arithmetic with a variable). It's a (probable) win in CPython because of the whole builtins vs non-builtins issue.

    (Not only is range() a builtin, but for with iterators has direct bytecode support.)

  • You pretty much need to know this idiom in order to understand this code without a bunch of thought (which is not the case for the C version). A special tricky point is the use of `_' as a special variable name used to indicate 'I don't care about this variable, I just have to have something here'; this is entirely a convention in (some) Python programming circles, with no special meaning in the language itself.

    (As a corollary, I doubt that this is an idiom that would naturally occur to people who are not already immersed in Python.)

  • When using this idiom you'd better remember the exact effects of range()/xrange(), since eg 'range(1, times)' is very much not what you want.

    (Again the C equivalent has this clearly visible.)

The overall summary of this is that the Python idiom really is close to being an idiom, in the literal definition of the word: it is an expression whose meaning is not clearly and immediately understandable from a quick read of its component parts. By contrast the C idiom is much clearer (at least for me).

(I don't think that all of this makes the Python idiom bad; it remains the most compact and probably the most efficient way of expressing this. And even without knowing this idiom off the top of your head I think it's reasonably clear roughly what it does (and it's reasonably easy to work out all of the details).)
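If you want to test the 'probably the most efficient' part on your own interpreter rather than take it on faith, something like the following sketch would do (Python 3 assumed; `loop_for` and `loop_while` are hypothetical names, and the loop sizes are arbitrary). I'm deliberately not quoting any timings, since they depend on the Python version and the machine.

```python
import timeit


def loop_for(times):
    """Count with the range() idiom."""
    done = 0
    for _ in range(times):
        done += 1
    return done


def loop_while(times):
    """Count with an explicit counter and a while loop."""
    done = 0
    i = 0
    while i < times:
        done += 1
        i += 1
    return done


n = 10_000
for fn in (loop_for, loop_while):
    # Time 100 calls of each version on the same workload.
    seconds = timeit.timeit(lambda: fn(n), number=100)
    print(f"{fn.__name__}: {seconds:.4f}s")
```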


Comments on this page:

Discussion of the most efficient way to loop N times: http://rhodesmill.org/brandon/2012/counting-without-counting/

Although that is embracing idioms rather than clarity.

David B.

I believe _ is a conventional (or even official) "unused" variable in other languages; I believe I've seen it used in ML (or at least OCaml) for unused elements in pattern matches. (Sometimes you want to discard multiple slots of a matched tuple, which means you use _ multiple times in the same statement; since normally this would mean binding a variable multiple times, I suspect it's actually a language feature, but I'm not positive.) And I suspect I've seen it in other languages.

I'm not sure how you can call the Python convention not idiomatic but consider the C version idiomatic. I don't think the idea of "repeat something 10 times" naturally involving a counter going from 0 to 9, or 1 to 10, is natural or idiomatic at all. (To some extent this is semantics; I certainly do consider the C approach idiomatic, but I think that just means I'm using the word "idiom" in a different way.)

Certainly it is a property of C that idioms are easier to decipher, because the mapping from C operators and statements to underlying behaviors is always so trivial. But I think that's more just something inherent to C vs higher-level languages, as opposed to something specific about Python counting vs C counting.

(Also note that DWiki's inconsistent behavior with underscores versus asterisks and other style markers means that a bare underscore surrounded by whitespace is still treated as starting typewriter text, and despite the explanation in the docs I can't really see any practical reason for the behavior. Also, I could not find any documented way to escape them. Inconsistently, double underscore appears to output as double underscore, not single or empty. I used the ".pn no" system to disable them entirely, but this seems crazy to have to enable/disable rather than having a one-off mechanism; also way too much effort for someone commenting on a blog entry to have to go to. Why not just allow \-escaping (possibly with some other character, given the existing weird use of \).)

I have to admit, I don't find the C version that natural either. To get natural you need to go for the Logo version:

 repeat 4 [forward 50 right 90]

Also, I think that the idiomatic Python version is:

for _ in range(times):
  ....

(Or xrange)

-- DanielMartin

"_" is idiomatic for a variable you don't care about in Haskell as well.

"You pretty much need to know this idiom in order to understand this code without a bunch of thought (which is not the case for the C version)."

I don't find this: "for (i = 0; i < times; i++) { .... }" idiomatic at all ... unless you are familiar with a language that writes its loops that way. Theoretically you can reason through it, but to do that you have to know that the first statement is an initialization executed only once, the 2nd a conditional tested before each pass through the loop, and the 3rd a normal statement executed at the end of each pass. Not really any more intuitive than Python's version. I'm with the posters above, unless the code says "repeat" it's always going to be a bit strange. Not that that's a problem.

My reply about my view on understanding idioms got long enough that I turned it into an entry, IdiomUnderstandability.

This is part of CSpace, and is written by ChrisSiebenmann.
Written on 29 July 2012.
Last modified: Sun Jul 29 01:21:56 2012