2006-04-10
xiostat: better Linux disk IO statistics
Xiostat is the program I wrote to give us a faithful recounting of the Linux kernel's disk IO stats after we discovered the problems with iostat's numbers. I've now finally gotten around to putting it up on the web and making a page that explains how to run it and what its output means and so on.
The current version is a bit slapdash, but I have verified that it (still) works on bog standard 2.6 Linux kernels (and thus should work on Debian Sarge, Fedora Core 2+, etc) and Red Hat Enterprise Linux 4. It should also work on RHEL 3, but I don't have any handy RHEL 3 systems to test on right now.
Current information on xiostat's status will always be on the xiostat page, so check there for status updates from after this WanderingThoughts entry has been published.
(I am so habituated to using xiostat that when I wrote the original iostat entry I kept automatically typing 'xiostat' instead of 'iostat'.)
The fun and charm of quoting URLs properly
The fun and charm of URL quoting is that you need to do it twice. Differently. That's because there are two different entities involved: browsers and web servers.
Strictly speaking, about the only thing that you have to quote for the browser is quote characters, because otherwise your nice <a href="..."> comes out very confusing. If you are being a good web standards monkey you need to quote at least ampersands (&'s) as well, because otherwise the browser may take them as entity references. The HTML 4.01 spec in section 5.3.2 recommends also quoting '>', just in case.
(In practice, no browser pays any attention to anything except a truly valid entity reference, because practically everyone except the obsessively standards compliant has unescaped &'s flying around.)
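As a rough illustration of the browser half, here is a minimal Python sketch that entity-encodes a URL before dropping it into an href attribute. The URL is made up for the example, and `html.escape` is just one convenient way to do this; it also escapes '<', '>' and single quotes, which is more than the strict minimum described above.

```python
from html import escape

# Hypothetical URL with embedded double quotes and a bare ampersand.
url = 'http://example.com/find?q="disk io"&sort=1'

# quote=True makes escape() handle " and ' as well as &, < and >.
attr = escape(url, quote=True)

# The attribute value no longer contains a raw double quote, so it
# cannot end the href="..." early.
link = f'<a href="{attr}">search</a>'
print(link)
```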
Web servers are startlingly liberal, so the only things you really have to quote are space characters (as either %20 or '+', depending on context) and the percent character itself. RFC 2396 has an additional list or two of stuff that should also be quoted (in sections 2.4.3 and 2.2), like quotes, and some web servers are picky.
(And if you are unlucky enough to deal with a joker who embedded URL component separator characters like '?' or '&' into his paths, you'll have to quote them too.)
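For the web server half, a small Python sketch using the standard `urllib.parse` functions shows the two space encodings and what happens to separator characters stuck in a path. The paths and query text are invented for the example, and `quote()` escapes more characters than the bare minimum a liberal server would insist on.

```python
from urllib.parse import quote, quote_plus

# In a path, spaces become %20 and a literal percent sign becomes %25.
# By default quote() leaves '/' alone.
path = quote("/docs/50% off sale.html")

# In a query value, quote_plus() uses '+' for spaces instead.
query = quote_plus("disk io stats")

# Separator characters embedded in a path get escaped too, so the
# server does not read them as the start of a query string.
odd_path = quote("/files/what?&why.txt")

print(path, query, odd_path)
```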
You quote things for the browser with entity encoding, so & turns into &amp;. You quote things for the web server with percent encoded hex character values, so a quote turns into %22 and the browser ignores it too. In theory a neurotic application like DWiki that gets handed a URL with a quote should encode it as &quot; so it survives the browser and gets passed as is to the web server for the web server to puke on if desired; in practice, DWiki just encodes quotes in URLs straight to %22s.
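Putting the two layers together looks roughly like the following Python sketch. This is not DWiki's actual code; `href_value` is a hypothetical helper, the example URL is made up, and the `safe` set is an assumption that only works if the input has not already been percent-encoded.

```python
from html import escape
from urllib.parse import quote

def href_value(raw_url: str) -> str:
    """Hypothetical helper: quote for the web server, then for the browser."""
    # Layer 1: percent-encode spaces, quotes and so on. The safe set
    # (an assumption for this sketch) keeps URL structure characters intact.
    server_safe = quote(raw_url, safe="/:?&=")
    # Layer 2: entity-encode what is left, mainly the ampersands.
    return escape(server_safe, quote=True)

raw = 'http://example.com/a "b".html?x=1&y=2'

# The practical approach: quotes go straight to %22.
practical = href_value(raw)

# The theoretical approach: entity-encode only, so the browser hands
# the raw quote on to the web server.
theoretical = escape(raw, quote=True)

print(practical)
print(theoretical)
```

The order matters: percent-encoding after entity encoding would mangle the '&' and ';' of the entities themselves if they were not in the safe set.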
Also in practice, many browsers will perform all of the necessary percent encoding for the web server themselves, turning spaces into %20 and so on, and you only need to worry about getting it to the browser. The one gotcha is that browsers often trim trailing spaces, which might be a necessary part of the URL. Doing more quoting is friendlier to simplistic HTML parsing applications.
(This entry is brought to you by me getting curious about the technical requirements of this all during an online discussion with friends.)