2011-07-11
mxiostat: second generation better Linux disk IO statistics
A while back I wrote and put up xiostat, a program
to give us a faithful recounting of the Linux kernel's
disk IO stats after we discovered the problems with
iostat's numbers. Mxiostat is the second generation of
xiostat, in part because it will report on multiple disks at once (hence
the 'm', for 'multiple').
Because it is now 2011 instead of 2006, I have put the mxiostat source code into git and published it on github as siebenmann/mxiostat. It's a bit minimal and needs some work (like a manual page), which I may or may not get around to someday. Usage information is in the comments at the top of mxiostat.py; information on what the stats are is in comments at the bottom of mxiostat.py and in DiskIOStats.
(Someday I may have a page for it on here too, but we'll see. It's pretty tempting to just use github for all of that.)
Some things to think about when doing polymorphic WSGI
Yesterday I wrote about a WSGI version of cat, but
I left out some practical considerations (sometimes I write entries a
little bit too fast). In fact these issues are common to all uses of
what I've called 'polymorphic WSGI' (and I've alluded to one of them
before in passing).
The first complication in a WSGI cat implementation is that you need
to figure out how to create and load your WSGI application. As I've
become aware, there's actually sort of a standard for this (courtesy of
Apache's mod_wsgi if nothing else); you have a chunk of Python code
in a file that, when loaded, defines an 'application' object in its
namespace. Creating this file is your job as the application writer, but
once you have it you can just feed it to wsgi-cat as an argument.
(DWiki vastly predates me becoming aware of this, so it has its own application specific system for configuring its WSGI application interface.)
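As a rough sketch of what loading such a file could look like (this is an illustration, not how mod_wsgi or any real wsgi-cat does it): the function name load_wsgi_app is made up for this example, and it assumes the file is plain Python source that you trust enough to execute.

```python
def load_wsgi_app(path, name="application"):
    """Execute the Python file at `path` and return its WSGI callable.

    `load_wsgi_app` is a hypothetical helper for illustration; `name`
    defaults to 'application', the mod_wsgi-style convention.
    """
    # Give the file its own namespace, much as if it were a module.
    namespace = {"__name__": "__wsgi_app__", "__file__": path}
    with open(path, "rb") as f:
        code = compile(f.read(), path, "exec")
    exec(code, namespace)

    try:
        app = namespace[name]
    except KeyError:
        raise LookupError("%s does not define %r" % (path, name))
    if not callable(app):
        raise TypeError("%r in %s is not callable" % (name, path))
    return app
```

The file is executed with whatever sys.path and working directory the calling program has, so an application file that relies on imports relative to its own location may need those set up first.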
The second complication is that a WSGI application may care about a
lot more of the HTTP request than just the URL; for example, what the
Host: header is may matter (and in sophisticated environments, you
may need to set cookies and other things). In theory you can supply
all of these with command line arguments to your wsgi-cat program;
in practice your program really wants to have sensible defaults for
your own environment just so that you don't have to invoke it with
a pile of arguments all of the time. Which additional bits of
information you need is going to depend on your specific application or
WSGI framework, but in general the more sophisticated the framework the
more random bits of the HTTP request it's probably going to care about.
The overkill solution for this is to capture a full WSGI environment
from a real browser request and then use it as the default environment
(with suitable things modified).
(On the other hand, you actively want to turn off some things by omitting them from the claimed HTTP headers; for example, you probably don't want your wsgi-cat to give you gzip'd output by default.)
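To make the idea of sensible defaults concrete, here is a minimal sketch of building a WSGI environment for a GET request and running an application against it. The names make_environ, run_app and DEFAULT_HOST are invented for this example, and the defaults are placeholders rather than anything a particular framework requires. Note that it deliberately never sets HTTP_ACCEPT_ENCODING unless you ask for it, so a well-behaved application has no reason to gzip the output.

```python
import io
import sys

DEFAULT_HOST = "localhost"  # placeholder; use your own site's usual Host:

def make_environ(path, host=DEFAULT_HOST, query="", extra_headers=None):
    """Build a minimal WSGI environ for a plain HTTP GET of `path`."""
    environ = {
        "REQUEST_METHOD": "GET",
        "SCRIPT_NAME": "",
        "PATH_INFO": path,
        "QUERY_STRING": query,
        "SERVER_NAME": host,
        "SERVER_PORT": "80",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "HTTP_HOST": host,
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(b""),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": True,
    }
    # Extra request headers become HTTP_* keys, as in CGI.
    for name, value in (extra_headers or {}).items():
        environ["HTTP_" + name.upper().replace("-", "_")] = value
    return environ

def run_app(app, environ):
    """Call a WSGI app once; return (status, headers, body bytes)."""
    captured = {}
    chunks = []

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return chunks.append  # the legacy write() callable

    result = app(environ, start_response)
    try:
        for chunk in result:
            chunks.append(chunk)
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured.get("status"), captured.get("headers"), b"".join(chunks)
```

A cat-style program would then write the returned body to standard output, and perhaps the status and headers to standard error.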
You may also want options to fake things like https-based requests in addition to plain HTTP ones. (Locally I'd also want to support HTTP Basic Authentication, or at least the Apache environment variables for it, but that's a peculiarity of our setup that's probably not applicable for most people.)
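Faking these mostly comes down to adjusting a few environment keys. The following is only a sketch: fake_https and fake_basic_auth are hypothetical helpers, each taking an existing WSGI environ dictionary and returning a modified copy. Exactly which keys your application looks at is an assumption you would need to check; HTTPS=on is what Apache's mod_ssl sets, and AUTH_TYPE and REMOTE_USER are the standard CGI variables for authentication.

```python
def fake_https(environ, port="443"):
    """Return a copy of `environ` that claims to be an https request."""
    env = dict(environ)
    env["wsgi.url_scheme"] = "https"
    env["SERVER_PORT"] = port
    # Not part of WSGI itself, but set by Apache's mod_ssl and
    # checked by some applications.
    env["HTTPS"] = "on"
    return env

def fake_basic_auth(environ, user):
    """Return a copy of `environ` that looks like the web server has
    already authenticated `user` with HTTP Basic Authentication."""
    env = dict(environ)
    env["AUTH_TYPE"] = "Basic"
    env["REMOTE_USER"] = user
    return env
```

Because both return copies, they can be stacked on top of whatever default environment you start from without disturbing it.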