## The periodic strangeness of idiomatic Python

July 29, 2012

Suppose that you want to do something N times, for whatever reason. In C, the straightforward and idiomatic way to do this is a `for` loop; '`for (i = 0; i < times; i++) { .... }`'. Since Python doesn't have this form of a `for` loop, the Python equivalent is a `while` loop. However, many people would probably say that this isn't idiomatic Python. What I think of as the idiomatic Python way to do 'do something N times' is:

```
for _ in range(0, times):
    ....
```

(Some people will use `xrange()` instead of `range()` here.)
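Here is a minimal runnable sketch of the idiom, assuming Python 3 (where `range()` is lazy and `xrange()` no longer exists). The `greet` function, the `log` list, and the value of `times` are made up purely for illustration:

```python
def greet(log):
    # Hypothetical action; it stands in for the '....' body above.
    log.append("hello")

times = 3
log = []
for _ in range(0, times):
    greet(log)

print(log)  # ['hello', 'hello', 'hello']
```

The body runs exactly `times` times, and the loop variable is never looked at.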

This is certainly what instantly popped into my head when I ran into this situation recently, and at first I didn't think any more of it. But once I actually began looking at it, it started getting stranger and stranger, less like a clear language idiom and much more like a convention. Let me run down a number of the ways that this is strange:

• It's a rather indirect way of expressing 'do something N times'. The C `for` loop is pretty direct by contrast.

(With that said, I'm not sure a `while` loop would be that much more direct. The directness advantage that C has is that all parts of the `for` loop's control are there in one chunk; a `while` loop spreads them out in three different lines.)

• We're doing things in this odd way partly to use as many builtins as possible, often in the name of (nominal) efficiency. Yes, this avoids a couple of extra lines to initialize and increment an otherwise unused counter, but I don't think that really makes it clearer.
• In the pursuit of this idiom we're creating a list or at least an iterator and walking it, throwing away the result. In many languages this would be wince-inducingly inefficient (or at least much worse than basic integer arithmetic with a variable). It's a (probable) win in CPython because of the whole builtins vs non-builtins issue.

(Not only is `range()` a builtin, but `for` with iterators has direct bytecode support.)

• You pretty much need to know this idiom in order to understand this code without a bunch of thought (which is not the case for the C version). A special tricky point is the use of '`_`' as a special variable name used to indicate 'I don't care about this variable, I just have to have something here'; this is entirely a convention in (some) Python programming circles, with no special meaning in the language itself.

(As a corollary, I doubt that this is an idiom that would naturally occur to people who are not already immersed in Python.)

• When using this idiom you'd better remember the exact effects of `range()`/`xrange()`, since e.g. '`range(1, times)`' is very much not what you want.

(Again the C equivalent has this clearly visible.)

The overall summary of this is that the Python idiom really is close to being an idiom, in the literal definition of the word: it is an expression whose meaning is not clearly and immediately understandable from a quick read of its component parts. By contrast the C idiom is much clearer (at least for me).

(I don't think that all of this makes the Python idiom bad; it remains the most compact and probably the most efficient way of expressing this. And even without knowing this idiom off the top of your head I think it's reasonably clear roughly what it does (and it's reasonably easy to work out all of the details).)

Discussion of the most efficient way to loop N times: http://rhodesmill.org/brandon/2012/counting-without-counting/

Although that is embracing idioms rather than clarity.
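For reference, one alternative that tends to come up in discussions of looping efficiency is `itertools.repeat()`, which hands back the same object on every iteration instead of producing a fresh integer each time. This is only a small sketch; it makes no claims about timings, and `run_n_times` is a made-up helper name:

```python
import itertools

def run_n_times(action, times):
    # repeat(None, times) yields None exactly `times` times,
    # so no counter values are ever created.
    for _ in itertools.repeat(None, times):
        action()

calls = []
run_n_times(lambda: calls.append(1), 4)
print(len(calls))  # 4
```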

David B.

I believe _ is a conventional (or even official) "unused" variable in other languages; I believe I've seen it used in ML (or at least OCaml) for unused elements in pattern matches. (Sometimes you want to discard multiple slots of a matched tuple, which means you use _ multiple times in the same statement; since normally this would mean binding a variable multiple times, I suspect it's actually a language feature, but I'm not positive.) And I suspect I've seen it in other languages.

I'm not sure how you can call the Python convention not idiomatic but consider the C version idiomatic. I don't think that expressing "repeat something 10 times" with a counter going from 0 to 9, or 1 to 10, is natural or idiomatic at all. (To some extent this is semantics; I certainly do consider the C approach idiomatic, but I think that just means I'm using the word "idiom" in a different way.)

Certainly it is a property of C that idioms are easier to decipher, because the mapping from C operators and statements to underlying behaviors is always so trivial. But I think that's more just something inherent to C vs higher-level languages, as opposed to something specific about Python counting vs C counting.

(Also note that DWiki's inconsistent behavior with underscores versus asterisks and other style markers means that a bare underscore surrounded by whitespace is still treated as starting typewriter text, and despite the explanation in the docs I can't really see any practical reason for the behavior. Also, I could not find any documented way to escape them. Inconsistently, double underscore appears to output as double underscore, not single or empty. I used the ".pn no" system to disable them entirely, but this seems crazy to have to enable/disable rather than having a one-off mechanism; also way too much effort for someone commenting on a blog entry to have to go to. Why not just allow \-escaping (possibly with some other character, given the existing weird use of \).)

I have to admit, I don't find the C version that natural either. To get natural you need to go for the logo version:

```
repeat 4 [forward 50 right 90]
```

Also, I think that the idiomatic python version is:

```
for _ in range(times):
    ....
```

(Or `xrange`)

-- DanielMartin

Place the `_` within a `(())` pair.

"_" is idiomatic for a variable you don't care about in Haskell as well.

"You pretty much need to know this idiom in order to understand this code without a bunch of thought (which is not the case for the C version)."

I don't find this: "for (i = 0; i < times; i++) { .... }" idiomatic at all ... unless you are familiar with a language that writes its loops that way. Theoretically you can reason through it, but to do that you have to know that the first statement is an initialization executed only once, the 2nd a condition tested once per iteration, and the 3rd a normal statement executed once per iteration. Not really any more intuitive than Python's version. I'm with the posters above: unless the code says "repeat", it's always going to be a bit strange. Not that that's a problem.

My reply about my view on understanding idioms got long enough that I turned it into an entry, IdiomUnderstandability.

These are my WanderingThoughts

This is part of CSpace, and is written by ChrisSiebenmann.

Written on 29 July 2012.