WSGI versus asynchronous servers
Asynchronous servers and frameworks are a popular way to create highly scalable systems. Although WSGI isn't explicitly designed to support them, putting a WSGI application in an asynchronous server isn't totally foolish: many WSGI applications won't be doing anything that can block.
(Technically disk IO can block, but Python on Unix doesn't have any way to do asynchronous disk IO without using threads.)
However, there is one serious fly in the ointment: the WSGI spec requires a synchronous interface for reading the HTTP request body. You get it from wsgi.input, which is specified to be a file-like object.
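For concreteness, here is a minimal sketch of what this looks like from the application side (the application and its response are illustrative, not taken from any particular codebase):

    def application(environ, start_response):
        # environ['wsgi.input'] is a file-like object; .read() is an
        # ordinary synchronous call, so it can stall this thread (or an
        # asynchronous server's whole event loop) until the network
        # delivers the request body.
        size = int(environ.get('CONTENT_LENGTH') or 0)
        body = environ['wsgi.input'].read(size)
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [('read %d bytes\n' % len(body)).encode('ascii')]

There is nowhere in that .read() call for an asynchronous server to regain control.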
The spec suggests one way around this: the WSGI server can read the request body from the network (doing so asynchronously) and buffer it all up before invoking the WSGI application. I'm not very fond of this because it makes defending against certain sorts of denial of service attacks much more difficult, as the WSGI server has no idea what the size and time limits of the WSGI application are.
(For example, DWiki rejects all POSTs over 64K without even trying to read them.)
This may seem nit-picky, but building resilient servers is already hard enough that I'm nervous about adding more obstacles.
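A sketch of that sort of early rejection, with an illustrative limit (this is not DWiki's actual code):

    MAX_POST = 64 * 1024   # illustrative limit, not DWiki's real code

    def application(environ, start_response):
        try:
            clen = int(environ.get('CONTENT_LENGTH') or 0)
        except ValueError:
            clen = 0
        if environ.get('REQUEST_METHOD') == 'POST' and clen > MAX_POST:
            # Refuse without ever touching wsgi.input.  A server that
            # pre-buffers the request body has already paid the memory
            # and time cost by the time we get a chance to say no.
            start_response('413 Request Entity Too Large',
                           [('Content-Type', 'text/plain')])
            return [b'request body too large\n']
        body = environ['wsgi.input'].read(clen)
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'accepted\n']

The decision here is made from the headers alone; a server that buffers the body before invoking the application cannot know to make it.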
This is one of those situations when continuations or coroutines would be pretty handy; the wsgi.input object could use one or the other to put the entire WSGI application to sleep until more network input showed up. (Python's yield-based coroutines aren't good enough because they only work with direct function calls; the wsgi.input.read() method can't use yield to pop all the way back to the WSGI server.)
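To illustrate the limitation (with hypothetical function names standing in for wsgi.input.read() and the application): a yield suspends only the generator frame it is written in, so a function called from a generator has no way to suspend its caller back out to the server.

    def fake_read():
        # We would like to suspend the whole application here until more
        # network input shows up, but a 'yield' in this function would
        # merely make fake_read its own generator; it cannot suspend
        # app(), its caller.
        return b'...network data...'

    def app():
        # Only a 'yield' written directly in this frame suspends it.
        # The blocking happens one call deeper, in fake_read(), where
        # no yield can reach back out to the WSGI server.
        data = fake_read()
        yield data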
(I don't fault WSGI for not working easily in asynchronous servers; it's hard to design general interfaces that do, and they're not very natural for synchronous servers. WSGI is sensibly designed for the relatively common case.)