2017-03-18
Part of why Python 3.5's await and async have some odd usage restrictions
Python 3.5 added a new system for coroutines and asynchronous programming, based around new async and await keywords (the technical details are written up at length in PEP 492). Roughly speaking, in terms of coroutines implemented with yield from, await replaces 'yield from' (and is more powerful). So what's async for? Well, it marks a function that can use await. If you use await outside an async function, you'll get a syntax error. Functions marked async have some odd restrictions too, such as that you can't use yield or yield from in them.
When I described doing coroutines with yield from here, I noted that it was potentially error prone because in order to make everything work you had to have an unbroken chain of yield froms from top to bottom. Break the chain or use yield instead of yield from, and things wouldn't work. And because both yield from and yield are used for regular generators as well as coroutines, it's possible to slip up in various ways. Well, when you introduce new syntax you can fix issues like that, and that's part of why async and await have their odd rules.
A function marked async is a (native) coroutine. await can only be applied to coroutines, which means that you can't accidentally treat a generator like a coroutine the way you can with yield from. Simplifying slightly, coroutines can only be invoked through await; you can't call one or use it as a generator, for example as 'for something in coroutine(...):'. As part of not being generators, coroutines can't use 'yield' or 'yield from'. (And there's only await, so you avoid the whole 'yield' versus 'yield from' confusion.)
In other words, coroutines can only be invoked from coroutines, and they must be invoked using the exact mechanism that makes coroutines work (and that mechanism isn't and can't be used for or by anything else). The entire system is designed so that you're more or less forced to create that unbroken chain of awaits that makes it all go. Although Python itself won't error out at import time if you try to call an async function without await (it just won't work at runtime), there are probably Python static checkers that look for this. And in general it's an easy rule to keep track of: if it's async, you have to await it, and this status is marked right there in the function definition.
(Unfortunately it's not in the type of the function, which means that you can't tell just by importing the module interactively and then doing 'type(mod.func)'.)
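These rules are easy to check interactively. Here's a small sketch (on a modern Python; the helper names ping and compiles are mine, not anything standard):

```python
async def ping():
    return "pong"

def compiles(src):
    # Report whether a piece of source code even parses.
    try:
        compile(src, "<check>", "exec")
        return True
    except SyntaxError:
        return False

# 'await' outside an 'async def' body is rejected at compile time.
print(compiles("await ping()"))   # False

# Calling a coroutine function doesn't run it; it just builds a
# coroutine object that must be awaited (or driven by an event loop).
c = ping()
print(type(c).__name__)           # coroutine
c.close()   # discard it cleanly instead of triggering 'never awaited'
```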
Sidebar: The other reason you can only use await in async functions
Before Python 3.5, the following was completely valid code:
    def somefunc(a1, b2):
        ...
        await = interval(a1, 10)
        otherfunc(b2, await)
        ...
In other words, await was not a reserved keyword and so could legally be used as the name of a local variable, or for that matter a function argument or a global. Had Python 3.5 made await a keyword in all contexts, all such code would immediately have broken. That's not acceptable for a minor release, so Python needed some sort of workaround. So it's not that you can't use await outside of functions marked async; it's that await isn't a keyword outside of async functions. Since it's not a keyword, writing something like 'await func(arg)' is a syntax error, just as 'abcdef func(arg)' would be.
The same is true of async, by the way:

    def somefunc(a, b, async = False):
        if b == 10:
            async = True
        ....
Thus it's a syntax error to use 'async for' or 'async with' outside of an async function; outside of such functions async isn't even a keyword, so 'async for' is treated the same as 'abcdef for'.
(I'm sure this makes Python's parser that much more fun.)
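As a historical footnote, this transitional scheme only lasted through Python 3.6; from 3.7 on, async and await are reserved words everywhere. A quick sketch (the compiles helper is mine) that checks which behavior the running interpreter has:

```python
import sys

def compiles(src):
    # Report whether a piece of source code even parses.
    try:
        compile(src, "<check>", "exec")
        return True
    except SyntaxError:
        return False

# On 3.5/3.6 'await' was an ordinary name outside async functions, so
# these assignments were legal; from 3.7 on they are hard syntax errors.
legacy = sys.version_info < (3, 7)
print(compiles("await = 1") == legacy)     # True on any version
print(compiles("async = True") == legacy)  # True on any version
```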
2017-03-16
How we can use yield from to implement coroutines
Given my new understanding of generator functions and yield from, we can now see how to use yield from to implement coroutines and an event loop. Consider a three level stack of functions, where on the top layer you have an event loop, in the middle you have the processing code you write, and on the bottom are event functions like wait_read() or sleep().

Let's start with an example processing function or two:
    def countdown(n):
        while n:
            print("T-minus", n)
            n -= 1
            yield from sleep(1)

    def launch(what, nsecs):
        print("counting down for", what)
        yield from countdown(nsecs)
        print("launching", what)
To start a launch, we call something like 'coro.start(launch("fred", 10))', which looks a bit peculiar since it sort of seems like coro.start() should get control only after the launch. However, we already know that calling a generator function doesn't do exactly what it looks like. What coro.start() gets when we do this is an unstarted generator object (which handily encapsulates those arguments to launch(), so we don't have to do it by hand).
When the coroutine scheduler starts the launch() generator object, we wind up with a chain of yield froms that bottoms out at sleep(). What sleep() yields is passed back up to the coroutine scheduler and the entire call chain is suspended; this is no different than what I did by calling .send() by hand yesterday. What sleep() returns to the scheduler is an object (call it an event object) that tells the coroutine scheduler under what conditions this coroutine should be resumed. When the scheduler reaches the point that the coroutine should be run again, the scheduler will once again call .send(), which will resume execution in sleep(), which will then return back to countdown(), and so on. The scheduler may use this .send() to pass information back to sleep(), such as how long it took before the coroutine was restarted.
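The whole arrangement can be sketched as a toy single-coroutine scheduler. Everything here is an illustration of my own devising, not a real event loop (a real one juggles many coroutines and real timers): sleep() yields a made-up event tuple, the scheduler honors it and resumes the chain with .send(), and the countdown uses zero-second sleeps so the sketch runs instantly.

```python
import time

def sleep(secs):
    # Bottom-level event function: yield an "event object" describing when
    # to resume us; whatever the scheduler passes to .send() becomes the
    # value of this yield expression.
    lateness = yield ("sleep", secs)
    return lateness

def countdown(n):
    while n:
        print("T-minus", n)
        n -= 1
        yield from sleep(0)     # zero-second sleeps keep the sketch fast

def launch(what, nsecs):
    print("counting down for", what)
    yield from countdown(nsecs)
    print("launching", what)

def run(coro):
    # A drastically simplified scheduler that only handles one coroutine:
    # start it, then keep honoring its sleep requests until it finishes.
    try:
        event = coro.send(None)        # start the generator
        while True:
            kind, secs = event
            time.sleep(secs)
            event = coro.send(0.0)     # resume; 0.0 is our 'lateness' info
    except StopIteration:
        pass

run(launch("fred", 3))
# prints:
#   counting down for fred
#   T-minus 3
#   T-minus 2
#   T-minus 1
#   launching fred
```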
Here yield and yield from are being used for two things. First, they create a communication channel between the coroutine scheduler and the low-level event functions like sleep(). Our launch() and countdown() functions are oblivious to this since they don't touch either the value sleep() yields up to the scheduler or the value that the scheduler injects to sleep() with .send(). Second, the chain of yield from and the final yield neatly suspend the entire call stack.
In order for this to work reliably, there are two rules that our user-written processing functions have to follow. First, they must never accidentally attempt to do anything with the sleep() generator function. It is okay (if a little unclear) for a non-generator function to call sleep() and return the result:
    def sleep_minutes(n):
        return sleep(n * 60)

    def long_countdown(n):
        while n:
            print("T-minus", n, "minutes")
            yield from sleep_minutes(1)
            n -= 1
This is ultimately because 'yield from func()' is equivalent to 't = func(); yield from t'. We don't care just how the generator object got to us so we can yield from it, we just care that it did.
However, at no stage in our processing functions can we attempt to look at the results of iterating sleep()'s generator object, either directly or indirectly by writing, say, 'for i in countdown(10):'. This rules out certain patterns for writing processing functions, for instance this one:
    def label_each_sec(label, n):
        for _ in tick_once_per_sec(n):
            print(label)
This leads to the second rule, which is that we must have an unbroken chain of yield froms from the top to the bottom of our processing functions, right down to where we use an event function such as sleep(). Each function must 'call' the next using the 'yield from func()' idiom. In effect we don't have calls from one processing function to another; instead we're passing control from one function to the next. In my example, launch() passes control to countdown() until the countdown expires (and countdown() passes control to sleep()). If we actually call a processing function normally or accidentally use 'yield' instead of 'yield from', the entire collection explodes into various sorts of errors without getting off the launch pad, and you will not go to space today.
As you might imagine, this is a little bit open to errors. Under normal circumstances you'll catch the errors fairly fast (when your main code doesn't work). However, since errors can only be caught at runtime when a non-yield from code path is reached, you may have mistakes that lurk in rarely executed code paths. Perhaps you have a rarely invoked last moment launch abort:
    def launch(what, nsecs):
        print("counting down for", what)
        yield from countdown(nsecs)
        if launch_abort:
            print("Aborting launch! Clear the launch pad for", what)
            yield sleep(1)
            print("Flooding fire suppression ...")
        else:
            print("launching", what)
It might be a while before you discovered that mistake (I'm doing a certain amount of hand-waving about early aborts in countdown()).

(See also my somewhat related attempt at understanding this sort of thing in a JavaScript context in Understanding how generators help asynchronous programming. Note that you can't use my particular approach from that entry in Python with 'yield from' for reasons beyond the scope of this entry.)
2017-03-15
Sorting out Python generator functions and yield from in my head
Through a chain of reading, I wound up at How the heck does async/await work in Python 3.5? (via the Trio tutorial). As has happened before when I started reading about Python 3's new async and await stuff, my head started hurting when I hit the somewhat breezy discussion of yield from, and I felt the need to slow down and try to solidly understand this, which I haven't really done before.
Generator functions are functions that contain a yield statement:
    def fred(a):
        r = yield 10
        print("fred got:", r)
        yield a
A straightforward generator function is there to produce a whole series of values without having to ever materialize all of them at once in a list or the like.
Calling a generator function does not return its result. Instead, it returns a generator object, which is a form of iterator:
    >>> fred(10)
    <generator object fred at 0x7f52a75ea1a8>
This generator object is in part a closure that captures the argument fred() was called with (and in general it will preserve fred()'s state while it is repeatedly iterated). Note that fred()'s code doesn't start executing until you try to get the first value from the iterator.
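We can verify that deferred start directly; nothing in fred()'s body runs until the first next():

```python
def fred(a):
    r = yield 10
    print("fred got:", r)
    yield a

g = fred(99)
print(type(g).__name__)   # generator -- fred's body hasn't run at all yet
print(next(g))            # 10 -- now it runs up to the first yield
```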
One common pattern with a stack of generator functions was that one generator function wanted to call another one for a while (including to modify or filter part of another generator's results). In the beginning this was done with explicit for loops and the like, but then Python added yield from. yield from takes a generator or iterator and exhausts it for you, repeatedly yield'ing the results:
    def barney(a):
        yield from fred(a)
(You can intermix yield and yield from and use both of them more than once in a function.)
Because generator functions actually return generators, not any sort of result, 'yield from func()' is essentially syntactic sugar for calling the function, getting a generator object back, and then calling yield from on the generator object. There is no special magic involved in that:
    def barney(a):
        gen = fred(a)
        yield from gen
Because generator objects are ordinary objects, they can be returned through functions that are not generators themselves, provided that intermediate functions don't really attempt to manipulate them and simply return them as-is:
    def jim(a):
        return fred(a)

    def bob(a):
        yield from jim(a)
(If jim() actually iterated through fred()'s results, things would quietly go sideways in ways that might or might not be visible.)
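To see one way things go quietly sideways, imagine a hypothetical greedy_jim() (my name, for illustration) that consumes part of fred()'s output before handing the generator back. Nothing errors out; a value simply vanishes:

```python
def fred(a):
    r = yield 10
    print("fred got:", r)
    yield a

def greedy_jim(a):
    g = fred(a)
    next(g)          # consume fred's first value ourselves
    return g

def bob(a):
    yield from greedy_jim(a)

print(list(bob(5)))   # [5] -- the 10 has silently disappeared
```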
When yield started out, it was a statement; however, that got revised so that it was an expression and could thus have a value, as we see in fred(), where the value of one yield is assigned to r and then used later. You (the holder of the generator object) inject that value by calling .send() on the generator:
    >>> g = fred(2)
    >>> _ = g.send(None); _ = g.send("a")
    fred got: a
(The first .send() starts the generator running and must be made with a None argument.)
As part of adding yield from, Python arranged it so that if you had a stack of yield from invocations and you called .send() on the outer generator object, the value you sent did not go to the outer generator object; instead it went all the way down to the eventual generator object that is doing a yield instead of a yield from.
    def level3(a):
        # three levels of yield from,
        # and we pass through a normal
        # function too
        yield from bob(a)

    >>> g = level3(10)
    >>> _ = g.send(None); _ = g.send("down there")
    fred got: down there
This means that if you have a stack of functions that all relay things back up using 'yield from', you have a direct path from your top level code (here that's our interactive code where we called level3()) all the way down to the core generator function at the bottom of the call stack (here, the fred() function). You and it can communicate with each other through the values it yields and the values you send() to it, without any function in the middle having to understand anything about this; it's entirely transparent to them.
(Don't accidentally write 'yield' instead of 'yield from', though. The good news about that mistake is that you'll catch it fast.)
Hopefully writing this has anchored yield from's full behavior and the logic behind it sufficiently solidly in my head that it will actually stick this time around.
Sidebar: yield from versus yield of a generator
Suppose that we have a little mistake:
    def barney2(a):
        yield fred(a)
What happens? Basically what you'd expect:
    >>> list(barney(20))
    fred got: None
    [10, 20]
    >>> list(barney2(20))
    [<generator object fred at 0x7f0fdf4b9258>]
When we used yield instead of yield from, we returned a value instead of iterating through the generator. The value here is what we get as the result of calling fred(), which is a generator object.
By the way, a corollary to strings being iterable is that accidentally writing 'yield from' instead of 'yield' on a string won't fail the way that, e.g., 'yield from 10' does, but will instead give you a sequence of single characters. You'll probably notice that error fairly fast, though.
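Both halves of that are easy to demonstrate (the function names here are mine):

```python
def with_yield_from(s):
    yield from s       # iterates the string character by character

def with_yield(s):
    yield s            # yields the string as one value

print(list(with_yield_from("ab")))   # ['a', 'b']
print(list(with_yield("ab")))        # ['ab']

def broken():
    yield from 10      # 10 is not iterable

try:
    list(broken())
except TypeError as e:
    print("TypeError:", e)
```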
This behavior of yield from is pretty much a feature, because it means that you can yield from another function without having to care about whether it's an actual generator function or it merely returns an iterable object of some sort; either will work.