2009-07-26
The anatomy of a hack to get around try:/finally: and generators
Suppose that you have the Python try:/finally: problem and need to get around it; specifically,
you need to both be a generator and have your finally: run
immediately. Taken from here,
one answer is:
def foo(bar):
    try:
        def output():
            ... compute with bar ...
            yield res
            ...
        return output()
    finally:
        print "finalizing"
(Now, there are a lot of things that this doesn't help with; the actual
computation hasn't finished, so you can't do anything in the finally:
that would destroy its ability to work.)
I find this confusing enough that it's worthwhile walking through why it works. It goes like this:
The output() function is two things: it is both a generator and a
closure. As a closure, it captures the arguments that foo() was
called with, so that it can do computation with them. As a generator,
calling it doesn't run any of its code but instead
causes Python to immediately return an iterator object that is in turn
effectively capturing the output() closure. The foo() function then
returns this iterator object and as part of returning, executes its own
finally: clause.
(The shorter way of putting this is that returning the result of calling
a generator function does not make you a generator function yourself,
and thus your finally: will run normally and immediately.)
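To make the ordering concrete, here is a small runnable sketch of the same pattern. The events list, the doubling computation and the sample input are all made up for illustration; print() is written in the function-call form so that it runs on Python 3 as well.

```python
events = []  # hypothetical log, only here so the ordering can be inspected

def foo(bar):
    try:
        def output():
            # None of this runs until the caller starts iterating.
            for item in bar:
                events.append("yield %r" % (item,))
                yield item * 2
        # Calling output() only creates the iterator object.
        return output()
    finally:
        # Runs as foo() returns, before any of output()'s body has run.
        events.append("finalizing")

gen = foo([1, 2])
events.append("foo returned")
results = list(gen)

print(events)
print(results)
```

The "finalizing" entry is logged before "foo returned", and the two "yield" entries only appear once list() starts pulling values out of the iterator.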
Obviously, you don't have to move all of the code into the output()
closure. In some situations you might have a lot of things that you do
before you can return the first result, and the more you do in the main
function the more resources you can release when it returns.
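As a sketch of that split, assume a hypothetical numbered_lines() function that is handed an open stream. It does the reading eagerly in the main function, so the stream can be closed in the finally: while the closure only needs the in-memory list. The function name and the use of io.StringIO as a stand-in resource are assumptions for illustration.

```python
import io

def numbered_lines(stream):
    try:
        # Done eagerly in the main function, while the stream is still open.
        lines = stream.read().splitlines()

        def output():
            # Only touches the captured list, never the stream itself.
            for number, line in enumerate(lines, 1):
                yield "%d: %s" % (number, line)
        return output()
    finally:
        # Safe to release here: the closure no longer depends on the stream.
        stream.close()

stream = io.StringIO(u"alpha\nbeta\n")
gen = numbered_lines(stream)
closed_before_iteration = stream.closed
result = list(gen)

print(closed_before_iteration)
print(result)
```

Had the read been left inside output(), closing the stream in the finally: would have broken the generator, which is the caveat mentioned earlier.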
2009-07-25
When code in generators runs
Due to another entry I'm in the process of writing, I was suddenly struck with a question: in Python, when does the code in a generator function start running?
First, the brief version of generators and iterators. To handwave
somewhat, a generator is a function that uses yield to create and
return its results one at a time. Behind the scenes, such a function
actually returns an iterator object, which the Python interpreter uses
to freeze and unfreeze the actual code as the function's code calls
yield and outside code asks for the next value.
There are two plausible answers to the question. First, the generator
function could run the actual function code up until the first time
it called yield and then freeze it, create the iterator object, and
return. Second, the generator function could immediately create the
iterator object and not run any of your function code until someone
asked for the first value.
(You might think that the first answer is crazy, but its advantage is that it makes calling generator functions act normally for as long as possible; their code runs until they do something special.)
The answer is that no code in generator functions runs until someone
asks for the first value. In the extreme case, where the result of
calling the function is just discarded, that means that no code in your
generator is run at all. Note that this includes code in finally:
clauses, which is what you'd expect since the flow of control never
reached the try:/finally: block to start with.
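A quick way to see this for yourself is a sketch like the following, where log is a made-up list that the generator's body appends to as it runs.

```python
log = []  # hypothetical record of which parts of the generator body have run

def gen():
    log.append("started")
    try:
        yield 1
        yield 2
    finally:
        log.append("finally")

g = gen()
log_after_call = list(log)     # snapshot right after calling gen()

first = next(g)
log_after_next = list(log)     # snapshot after asking for the first value

gen()                          # result discarded, never iterated
log_after_discard = list(log)  # snapshot after the discarded call

print(log_after_call)
print(log_after_next)
print(log_after_discard)
```

The first snapshot is empty, the second has only "started", and the discarded call adds nothing at all: neither its "started" nor its "finally" ever shows up.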
(This doesn't seem to be explicitly stated in the Python language reference, but it is explicit in PEP 255, which the language reference points to. I suspect that PEPs are considered more or less officially part of the language reference, and so this is guaranteed behavior.)
If you actually need generators (okay, iterators) that are always
finalized, I believe that you can't use yield and will instead have to
build iterator objects by hand that have __del__ methods.
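A minimal sketch of what such a hand-built iterator might look like follows. The FinalizedCounter class and its on_finalize callback are invented for illustration, and the sketch assumes CPython, where reference counting normally runs __del__ as soon as the last reference goes away; other implementations may delay it.

```python
class FinalizedCounter(object):
    """Hypothetical iterator that counts 1..limit and is always finalized."""

    def __init__(self, limit, on_finalize):
        self.limit = limit
        self.current = 0
        self.on_finalize = on_finalize

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

    next = __next__  # Python 2 spelling of the same method

    def __del__(self):
        # Runs when the object is collected, whether or not it was iterated.
        self.on_finalize()

finalized = []
it = FinalizedCounter(3, lambda: finalized.append("finalized"))
del it  # never iterated, but still finalized (promptly so on CPython)

counted = list(FinalizedCounter(3, lambda: None))
print(counted)
```

Unlike a generator that was never started, the object's cleanup does not depend on any of its iteration code having run.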