Some problems with iostat on Linux

March 14, 2006

I was recently reminded that Linux's iostat command is what I call 'overly helpful'. It's not that it lies to you, exactly; it's that iostat is a little bit too eager to please people reading its output, particularly iostat -x output. There are a number of issues.

(The rest of this assumes that you're familiar with the iostat -x fields.)

The fatally flawed field is svctm, the 'average service time' for IOs. It would be really nice to have this number, but unfortunately the kernel does not provide it; instead, iostat makes it up from other numbers using inaccurate assumptions, including that your disk only handles one request at a time.
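
To make the one-request-at-a-time problem concrete, here is a small Python sketch. It assumes the estimate boils down to 'time the device was busy divided by the number of IOs completed'. That is my simplification of the approach, not a copy of iostat's code, and the function name and the disk numbers are invented for illustration.

```python
def estimated_svctm_ms(busy_ms_delta, ios_completed_delta):
    """Hypothetical estimate: busy time divided by completed IOs (ms per IO)."""
    if ios_completed_delta == 0:
        return 0.0
    return busy_ms_delta / ios_completed_delta


# Made-up disk: every request really takes 10 ms, but the disk works on
# 4 requests at once and stays busy for a full 1000 ms interval.
real_service_ms = 10
concurrency = 4
interval_ms = 1000

# 100 back-to-back 10 ms slots, each completing 4 requests.
ios_completed = (interval_ms // real_service_ms) * concurrency

svctm = estimated_svctm_ms(interval_ms, ios_completed)
print(svctm)  # 2.5, although each request actually took 10 ms
```

Under these assumptions the estimate is off by exactly the amount of concurrency, so the more requests the device handles in parallel, the more flattering the number looks.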

The kernel accumulates statistics on a running basis; iostat derives per-second numbers by taking snapshots and computing the delta between them. Sometimes (usually under high load) some of the kernel statistics will effectively run backwards, with the new reading having smaller values than the old one. When this happens, iostat doesn't detect it and just computes its numbers from the bogus delta. If you're lucky, the displayed stats are obviously wrong; if you're not, they merely look plausible.
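
As a rough illustration of the snapshot-and-delta approach, and of what noticing a backwards counter could look like, here is a Python sketch. The counter names, the values, and the choice to report `None` for a bad delta are all assumptions of mine; this is not how iostat itself is written.

```python
def per_second_rates(old, new, interval_s):
    """Turn two snapshots of running counters into per-second rates.

    A counter that went backwards gets None instead of a made-up rate.
    """
    rates = {}
    for name, new_value in new.items():
        delta = new_value - old[name]
        rates[name] = None if delta < 0 else delta / interval_s
    return rates


# Hypothetical snapshots taken 5 seconds apart; 'sectors_read' runs backwards.
old = {"reads": 1000, "sectors_read": 80000}
new = {"reads": 1500, "sectors_read": 79000}

rates = per_second_rates(old, new, 5.0)
print(rates)
```

Flagging the bad counter at least tells the reader that this interval's number can't be trusted, which is more than a silently wrong value does.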

The rkB/s and wkB/s fields are redundant; they are literally just rsec/s and wsec/s divided by two. You might ask 'what if the device doesn't have 512 byte sectors?', and the answer is it doesn't matter; the general kernel IO system assumes 512-byte sectors, and in fact the kernel only reports sector information.
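
The arithmetic is simple enough to write down. This Python sketch (the names are mine) shows why the conversion is just a division by two when sectors are always counted as 512 bytes:

```python
KERNEL_SECTOR_BYTES = 512  # the unit the kernel reports in, per the text above


def sectors_to_kb(sectors_per_second):
    """Convert a sectors/s figure to kB/s; 512 / 1024 is a division by two."""
    return sectors_per_second * KERNEL_SECTOR_BYTES / 1024


rsec_per_s = 2048  # made-up rsec/s reading
rkb_per_s = sectors_to_kb(rsec_per_s)
print(rkb_per_s)  # 1024.0
```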

Iostat accurately documents %iowait as:

Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

However, note that this is not the same thing as 'the percentage of time one or more processes were waiting on IO', since there are a number of background kernel activities that can queue IO while processes are idle waiting for unrelated things.

Update, April 25th: it turns out that the iostat manpage is wrong about what %iowait measures. See LinuxIowait for details.

PS: In Debian, Fedora Core, and I believe Red Hat Enterprise Linux, the iostat command, its manpage, and so on are part of the sysstat package (RPM, .deb, etc).

Written on 14 March 2006.

Last modified: Tue Mar 14 02:16:52 2006