2011-07-10
Exploiting polymorphic WSGI again to create cat
I've written before about how I (ab)use what I call the polymorphic nature of WSGI, where a WSGI application doesn't actually care what environment you run it in; anything that will provide a WSGI environment will do. I've used this before for various things and was recently reminded of yet another trick I've played with WSGI.
Suppose that you have a web application and you want to inspect or
capture a rendered web page (especially if you want to read the raw
HTML). Obviously you can do this from a browser or you can use something
simpler like wcat, but one day it
struck me: why was I going through the bother of making a web request
(and possibly firing up a test web server) just to get some output from
my WSGI application?
The result is something that you could call 'wsgi-cat' (although for me
it is specific to DWiki). Given a URL on the command line, it connects
up all of the WSGI infrastructure necessary, passes the URL to the WSGI
application (as a GET request), and then dumps out whatever 'web page'
the application returned (which isn't necessarily HTML; it might be an
HTTP redirection, for example). I haven't yet given it any support for
handling POSTs, but it wouldn't really be hard.
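To make the idea concrete, here is a minimal sketch of what such a tool could look like in Python 3. This is not the actual DWiki code: `demo_app` is a hypothetical stand-in for the real application, `wsgi_cat` is a name invented for illustration, and the environment is filled in with the standard library's `wsgiref.util.setup_testing_defaults` rather than whatever the real program does.

```python
import sys
from urllib.parse import urlsplit, unquote_to_bytes
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    """Hypothetical stand-in for the real WSGI application."""
    if environ["PATH_INFO"] == "/old":
        start_response("302 Found", [("Location", "/new")])
        return [b""]
    body = ("<p>You asked for %s</p>\n" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]


def wsgi_cat(app, url):
    """Run one GET for url through app; return (status, headers, body)."""
    parts = urlsplit(url)
    environ = {
        "REQUEST_METHOD": "GET",
        "SCRIPT_NAME": "",
        # PEP 3333 wants the unquoted path as a latin-1 decoded string.
        "PATH_INFO": unquote_to_bytes(parts.path or "/").decode("latin-1"),
        "QUERY_STRING": parts.query,
    }
    if parts.scheme:
        environ["wsgi.url_scheme"] = parts.scheme
    if parts.hostname:
        environ["SERVER_NAME"] = parts.hostname
        environ["HTTP_HOST"] = parts.netloc
    # Fills in wsgi.input, wsgi.errors and the other required keys
    # without overwriting anything set above.
    setup_testing_defaults(environ)

    captured = {}
    chunks = []

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = list(headers)
        return chunks.append  # the legacy write() callable

    result = app(environ, start_response)
    try:
        for chunk in result:
            chunks.append(chunk)
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], b"".join(chunks)


def main(url):
    status, headers, body = wsgi_cat(demo_app, url)
    print(status)
    for name, value in headers:
        print("%s: %s" % (name, value))
    print()
    sys.stdout.flush()
    sys.stdout.buffer.write(body)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Because nothing follows redirects on your behalf, asking this sketch for `/old` shows the raw `302 Found` status and the `Location` header instead of the page it points to.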
I originally wrote this because I was tired of firing up various things just to look at how I was rendering HTML, but it's turned out to be quietly convenient for any number of things. Looking at HTTP redirects that my application is supposed to generate is one example; many web programs will transparently handle them for you, which is a great convenience right up to the point where you want to inspect them.
(Another potential use is artificially generating error pages or POST
response pages and capturing their HTML in order to validate it. Online
HTML validators are generally GET-only, which can let validation
errors lurk in the POST side of things. (Assuming that you care about
validating your HTML at all.))
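As a rough illustration of how little a POST needs beyond a GET, the sketch below hands a form body to a WSGI application through `wsgi.input` and captures the response. It is only a sketch under assumptions: `form_app` and `wsgi_post` are hypothetical names, the body is assumed to be an ordinary URL-encoded form, and the captured HTML would still have to be fed to a validator by some other means.

```python
import io
from urllib.parse import urlencode
from wsgiref.util import setup_testing_defaults


def form_app(environ, start_response):
    """Hypothetical application that only answers POST requests."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    posted = environ["wsgi.input"].read(length)
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [("<p>Received %d bytes</p>\n" % len(posted)).encode("ascii")]


def wsgi_post(app, path, fields):
    """POST fields (a dict) to path; return (status, headers, body)."""
    payload = urlencode(fields).encode("ascii")
    environ = {
        "REQUEST_METHOD": "POST",
        "SCRIPT_NAME": "",
        "PATH_INFO": path,
        "QUERY_STRING": "",
        "CONTENT_TYPE": "application/x-www-form-urlencoded",
        "CONTENT_LENGTH": str(len(payload)),
        # The request body is just a file-like object of bytes.
        "wsgi.input": io.BytesIO(payload),
    }
    setup_testing_defaults(environ)  # fills in the remaining required keys

    captured = {}
    chunks = []

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = list(headers)
        return chunks.append  # the legacy write() callable

    result = app(environ, start_response)
    try:
        for chunk in result:
            chunks.append(chunk)
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], b"".join(chunks)


if __name__ == "__main__":
    status, headers, body = wsgi_post(form_app, "/comment", {"comment": "hello world"})
    print(status)
    print(body.decode("ascii"), end="")
```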