Wandering Thoughts archives

2009-06-15

try:/finally: and generators

Suppose that you have code that generates an abstract 'list' of some sort and returns it from inside a try:/finally: block. There are two common ways to code this; you can return a real list, or you can use yield to make the function a generator. You might even code it one way and change it to the other later, which is generally a transparent change.

Not this time, though. If you are using finally:, the two options can have quite different behavior. Constructing an example is tedious, but explaining the issue is simple:

Using yield postpones the execution of your finally: blocks from when your results are generated and returned to when your results are fully consumed (or the generator is closed or garbage collected), which may be some time later.

In simple situations you won't notice this, because the results are used immediately after they're returned. In more complex situations you'll probably get mysterious ordering issues about when your finally: blocks run, as they'll appear to run well after they 'should' (and well after they did run back when your function wasn't using yield).

(For example, consider a finally: that closes down a database connection. If the result of your database lookup function is just put in a data structure and only looked at later, you could build up a lot more database connections than you expect.)
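Here is a minimal sketch of the difference. The function names and the 'open'/'close' log entries are made up for illustration; they stand in for acquiring and releasing something like a database connection.

```python
def lookup_as_list(log):
    # Hypothetical lookup that returns a real list.
    log.append("open")
    try:
        return [1, 2, 3]
    finally:
        # Runs before the caller gets the list back.
        log.append("close")

def lookup_as_generator(log):
    # The same lookup, written with yield.
    log.append("open")
    try:
        for row in [1, 2, 3]:
            yield row
    finally:
        # Runs only once the generator is exhausted, closed, or collected.
        log.append("close")

list_log = []
rows = lookup_as_list(list_log)
list_log.append("caller has result")

gen_log = []
pending = lookup_as_generator(gen_log)
gen_log.append("caller has result")
consumed = list(pending)   # the generator body only runs here
gen_log.append("caller used result")

print(list_log)
print(gen_log)
```

In the list version, 'close' is logged before the caller ever sees the result. In the generator version, nothing in the function body runs until the caller starts iterating, so both 'open' and 'close' land after 'caller has result'.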

In fact this is one part of a general issue: when you use yield, all of the objects and resources held alive by your function are only released when your results are used. Effectively yield turns your ordinary function into a closure, with the resulting consequences for potential resource leaks.
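One way to see this is with a generator that has only been partly consumed. In this sketch, Resource and items() are hypothetical stand-ins rather than any real library's API; the point is only that the finally: block (and so the release) stays pending while the generator is suspended, and that calling the generator's close() method is one way to force it to run.

```python
class Resource:
    """Hypothetical stand-in for a connection or file handle."""
    def __init__(self):
        self.released = False

    def release(self):
        self.released = True

def items(resource):
    try:
        for i in range(3):
            yield i
    finally:
        # Not reached while the generator sits suspended at a yield.
        resource.release()

res = Resource()
gen = items(res)
first = next(gen)                          # run up to the first yield
held_while_suspended = not res.released    # finally: has not run yet
gen.close()                                # raises GeneratorExit at the yield
released_after_close = res.released        # finally: has now run
```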

(Credit where credit is due department: I was exposed to this issue by Using yield Statements in WSGI Middleware can be Very Harmful.)

TryFinallyAndGenerators written at 01:30:29

2009-06-08

Another way that generators are not lists: modifying them

A long time ago, I wrote some stuff on how generators are not lists (okay, technically it was about iterators), and one of the things that I mentioned is that generators do not have list methods. Well, there's a consequence of that that only struck me recently: you need completely different code to modify a returned generator than to modify a returned list.

Suppose you have a function that returns something that is conceptually a list of items. Further suppose that you have another function that modifies what the first function returns; perhaps you want to add something on the end. If you know you're dealing with a list, you write:

def append(func, extra):
    # Assumes func() returns a real list; extend() modifies it in place.
    r = func()
    r.extend(extra)
    return r

If func() is a generator, this code blows up with an AttributeError, because generator objects have no extend() method. You have two choices; first, you can forcefully turn the result of func() into a list, and second, you can rewrite append() as a generator (which will work regardless of what func() returns, but may have consequences that make it undesirable):

def append(func, extra):
    for it in (func(), extra):
        for e in it:
            yield e

(Yes, yes, one can write this using itertools.chain(). Then people would have to look it up.)
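For comparison, here is a rough sketch of the first choice (forcing the result into a list) next to the itertools.chain() version. The names append_as_list, append_chained, gen_items and list_items are invented for this example.

```python
import itertools

def append_as_list(func, extra):
    # list() accepts a list or any iterator, so this always returns a list.
    # It also copies, so a list returned by func() is not modified in place.
    r = list(func())
    r.extend(extra)
    return r

def append_chained(func, extra):
    # Always returns an iterator, whatever func() returns.
    return itertools.chain(func(), extra)

def gen_items():
    # Hypothetical func() that is a generator.
    yield 1
    yield 2

def list_items():
    # Hypothetical func() that returns a real list.
    return [1, 2]

print(append_as_list(gen_items, [3]))
print(list(append_chained(list_items, [3])))
```

Either way the caller gets one consistent kind of thing back: always a list from append_as_list(), always an iterator from append_chained().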

In either case, you have to actively make a decision about what your function will do. You cannot passively modify whatever you get handed and pass it up to your caller without changing its nature; you must decide that no matter what func() returns, you're either returning a list or an iterator.

(Technically you can, since you can see if you got handed something that follows the iterator protocol or whether it looks sequence-like. But that way lies madness.)

GeneratorListModification written at 23:15:13


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.