Wandering Thoughts archives

2015-03-06

The simple way CPython does constant folding

When I looked into CPython's constant folding as part of writing yesterday's entry, I expected to find something clever and perhaps intricate, involving examining the types of constants and knowing about special combinations and so on. It turns out that CPython doesn't bother with all of this and it has a much simpler approach.

To do constant folding, CPython basically just calls the same general code that it would call to do, say, '+' on built in types during execution. There is no attempt to recognize certain type combinations or operation combinations and handle them specially; any binary operation on any suitable constant will get tried. This includes combinations that will fail (such as '1 + "a"') and combinations that you might not expect to succeed, such as '"foo %d" % 10'.

(Note that there are some limits on what will get stored as a new folded constant, including that it's not too long. '"a" * 1000' won't be constant folded, for example, although '"a" * 5' will.)

What is a suitable constant is both easy and hard to define. The easy definition is that a suitable constant is anything that winds up in func_code.co_consts and so is accessed with a LOAD_CONST bytecode instruction. Roughly speaking this is any immutable basic type, which I believe currently is essentially integers, strings, and floating point numbers. In Python 3, tuples of these types will also be candidates for taking part in constant folding.

At first this approach to constant folding seemed alarming to me, since CPython is calling general purpose evaluation code, code that's normally used much later during bytecode execution. But then I realized that CPython is only doing this in very restricted circumstances; since it's only doing this with a very few types of immutable objects, it knows a lot about what C code is actually getting called here (and this code is part of the core interpreter). The code for these basic types has an implicit requirement that it can also be called as bytecode is being optimized, and the Python authors can make sure this is respected when things change. This would be unreasonable for arbitrary types, even arbitrary C level types, but is perfectly rational here and is beautifully simple.

In short, CPython has avoided the need to write special constant folding evaluation code by making sure that its regular evaluation code for certain basic types can also be used in this situation and then just doing so. In the process it opened up some surprising constant folding opportunities.

(And it can automatically open up more in the future, since anything that winds up in co_consts is immediately a candidate for constant folding.)

Sidebar: What happens with tuples in Python 2

In the compiled bytecode, tuples of constant values do not actually start out as constants; instead they start out as a series of 'load constant' bytecodes followed by a BUILD_TUPLE instruction. Part of CPython's peephole optimizations is to transform this sequence into a new prebuilt constant tuple (and a LOAD_CONST instruction to access it).

In Python 2, the whole peephole optimizer apparently effectively doesn't reconsider the instruction stream after doing this optimization. So if you have '(1, 2) + (3, 4)' you get a first transformation to turn the two tuples into constants, but CPython never goes on to do constant folding for the + operation itself; by the time + actually has two constant operands, it's apparently too late. In Python 3, this limitation is gone and so the + will be constant folded as well.

(Examining __code__.co_consts in Python 3 shows that the intermediate step still happens; the co_consts for a function that just has a 'return' of this is '(None, 1, 2, 3, 4, (1, 2), (3, 4), (1, 2, 3, 4))', where we see the intermediate tuples being built before we wind up with the final version. In general constant folding appears to leave intermediate results around, eg for '10 + 20 + 30 + 40' you get several intermediate constants as well as 100.)

CPythonConstantFolding written at 02:20:30; Add Comment

2015-03-05

An interesting excursion with Python strings and is

Let's start with the following surprising interactive example from @xlerb's tweet:

>>> "foo" is ("fo" + "o")
True
>>> "foo" is ("fo".__add__("o"))
False
>>> "foo" == ("fo".__add__("o"))
True

The last two case aren't surprising at all; they demonstrate that equality is bigger than mere object identity, which is what is tests (as I described in my entry on Python's two versions of equality). The surprising case is the first one; why do the two sides of that result in exactly the same object? There turn out to be two things going on here, both of them quite interesting.

The first thing going on is that CPython does constant folding on string concatenation as part of creating bytecode. This means that the '"fo" + "o"' turns into a literal "foo" in the actual bytecodes that are executed. On the surface, this is enough to explain the check succeeding in some contexts. To make life simpler while simultaneously going further down the rabbit hole, consider a function like the following:

def f():
  return "foo" is ("fo"+"o")

Compiled functions have (among other things) a table of strings and other constants used in the function. Given constant folding and an obvious optimization, you would expect "foo" to appear in this table exactly once. Well, actually, that's wrong; here's what func_code.co_consts is for this function in Python 2:

(None, 'foo', 'fo', 'o', 'foo')

(It's the same in Python 3, but now it's in __code__.co_consts.)

Given this we can sort of see what happened. Probably the bytecode was originally compiled without constant folding and then a later pass optimized the string concatenation away and added the folded version to co_consts, operating on the entirely rational assumption that it didn't duplicate anything already there. This would be a natural fit for a simple peephole optimizer, which is in fact exactly what we find in Python/peephole.c in the CPython 2 source code.

But how does this give us object identity? The answer has to be that CPython interns at least some of the literal strings used in CPython code. In fact, if we check func_code.co_consts for our function up above, we can see that both "foo" strings are in fact already the same object even though there's two entries in co_consts. The effect is actually fairly strong; for example, the same literal string as in two different modules can be interned to be the same object. I haven't been able to find the CPython code that actually does this, so I can't tell you what the exact conditions are.

(Whether or not a literal string is interned appears to depend partly on whether or not it has spaces in it. This rabbit hole goes a long way down.)

PS: I believe that this means I was wrong about some things I said in my entry on instance dictionaries and attribute names, in that more things get interned than I thought back then. Or maybe CPython grew more string interning optimizations since then.

StringConstantsAndFolding written at 00:22:04; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.