The simple way CPython does constant folding
When I looked into CPython's constant folding as part of writing yesterday's entry, I expected to find something clever and perhaps intricate, involving examining the types of constants and knowing about special combinations and so on. It turns out that CPython doesn't bother with all of this and it has a much simpler approach.
To do constant folding, CPython basically just calls the same general
code that it would call to do, say, '+
' on built in types during
execution. There is no attempt to recognize certain type combinations or
operation combinations and handle them specially; any binary operation
on any suitable constant will get tried. This includes combinations that
will fail (such as '1 + "a"
') and combinations that you might not
expect to succeed, such as '"foo %d" % 10
'.
(Note that there are some limits on what will get stored as a new
folded constant, including that it's not too long. '"a" * 1000
'
won't be constant folded, for example, although '"a" * 5
' will.)
What is a suitable constant is both easy and hard to define. The easy
definition is that a suitable constant is anything that winds up in
func_code.co_consts
and so is accessed with a LOAD_CONST
bytecode instruction. Roughly speaking this is any immutable basic
type, which I believe currently is essentially integers, strings,
and floating point numbers. In Python 3, tuples of these types will
also be candidates for taking part in constant folding.
At first this approach to constant folding seemed alarming to me, since CPython is calling general purpose evaluation code, code that's normally used much later during bytecode execution. But then I realized that CPython is only doing this in very restricted circumstances; since it's only doing this with a very few types of immutable objects, it knows a lot about what C code is actually getting called here (and this code is part of the core interpreter). The code for these basic types has an implicit requirement that it can also be called as bytecode is being optimized, and the Python authors can make sure this is respected when things change. This would be unreasonable for arbitrary types, even arbitrary C level types, but is perfectly rational here and is beautifully simple.
In short, CPython has avoided the need to write special constant folding evaluation code by making sure that its regular evaluation code for certain basic types can also be used in this situation and then just doing so. In the process it opened up some surprising constant folding opportunities.
(And it can automatically open up more in the future, since anything
that winds up in co_consts
is immediately a candidate for constant
folding.)
Sidebar: What happens with tuples in Python 2
In the compiled bytecode, tuples of constant values do not actually
start out as constants; instead they start out as a series of 'load
constant' bytecodes followed by a BUILD_TUPLE
instruction. Part of
CPython's peephole optimizations is to transform this sequence into a
new prebuilt constant tuple (and a LOAD_CONST
instruction to access
it).
In Python 2, the whole peephole optimizer apparently effectively doesn't
reconsider the instruction stream after doing this optimization. So if
you have '(1, 2) + (3, 4)
' you get a first transformation to turn the
two tuples into constants, but CPython never goes on to do constant
folding for the +
operation itself; by the time +
actually has
two constant operands, it's apparently too late. In Python 3, this
limitation is gone and so the +
will be constant folded as well.
(Examining __code__.co_consts
in Python 3 shows that the
intermediate step still happens; the co_consts
for a function that
just has a 'return
' of this is '(None, 1, 2, 3, 4, (1, 2), (3, 4),
(1, 2, 3, 4))
', where we see the intermediate tuples being built before
we wind up with the final version. In general constant folding appears
to leave intermediate results around, eg for '10 + 20 + 30 + 40
' you
get several intermediate constants as well as 100.)
|
|