2020-03-05
The problem of Unix iowait and multi-CPU machines
Various Unixes have had an 'iowait' statistic for a long time now
(although I can't find a source for where it originated; it's not in 4.x
BSD, so it may have come through System V and sar). The traditional
and standard definition of iowait is that it's the amount of time the
system was idle but had at least one process waiting on disk IO. Rather
than count this time as 'idle' (as you would if you had a three-way
division of CPU time between user, system, and idle), some Unixes
evolved to count this as a new category, 'iowait'.
(To my surprise, iowait doesn't appear to be in the *BSDs at all; they stick to the old user, system, idle, and nice divisions of CPU time. Iowait is in Linux and Solaris/Illumos, and appears to be in HP-UX and AIX as well, based on some quick manpage checks.)
This traditional definition makes easy and straightforward sense on a uniprocessor machine, where the system cannot be simultaneously idle waiting for a process to finish IO and running a process. But these days basically all systems are multi-CPU 'SMP' ones, and in a multi-CPU world it's not obvious how you should define iowait, because there's no longer a strict binary division between 'running things' and 'stopped waiting for IO'. In a multi-CPU system, some but not all CPUs can be busy running code, while some processes are blocked on IO. If those processes had IO that completed immediately, they could run on the currently idle CPUs, but at the same time the system is doing some work instead of being entirely stalled waiting for IO to complete (which is the way iowait works on a uniprocessor system).
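To make the ambiguity concrete, here is a minimal sketch of two definitions a Unix could plausibly pick. Both are hypothetical and invented purely for illustration (as are the `Tick` type and the function names); I'm not claiming either one is what any particular Unix actually does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tick:
    """A snapshot of the machine at one clock tick (hypothetical model)."""
    running: List[bool]   # one entry per CPU: True if it is running a process
    blocked_on_io: int    # number of processes currently waiting on disk IO


def iowait_all_idle(tick: Tick) -> float:
    """Hypothetical definition A: the uniprocessor rule applied to the
    whole machine. The tick is iowait only if every CPU is idle and at
    least one process is waiting on IO."""
    if tick.blocked_on_io > 0 and not any(tick.running):
        return 1.0
    return 0.0


def iowait_per_idle_cpu(tick: Tick) -> float:
    """Hypothetical definition B: every idle CPU counts as iowait if any
    process is waiting on IO. Returns the fraction of CPUs in iowait."""
    if tick.blocked_on_io == 0:
        return 0.0
    idle_cpus = sum(1 for busy in tick.running if not busy)
    return idle_cpus / len(tick.running)
```

On a one-CPU machine the two always agree. On a four-CPU machine with one CPU busy and one process blocked on IO, `Tick(running=[True, False, False, False], blocked_on_io=1)`, definition A reports 0.0 and definition B reports 0.75, which is exactly the sort of disagreement that makes multi-CPU iowait hard to interpret.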
There are all sorts of plausible answers a Unix could adopt for the meaning of iowait on a multi-CPU system, ranging from the simple to the complex to the ad-hoc. But no matter what a Unix does, it needs to come up with some answer (and ideally document it), and there are no guarantees that two different Unixes have picked the same answer. If you're going to use iowait for much, you might want to try to figure out how your Unix defines it on multi-CPU machines.
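As a starting point for at least seeing what your system reports, here is a rough sketch for Linux. It assumes the `/proc/stat` format described in proc(5), where each `cpu` line starts with the user, nice, system, idle, and iowait counters in that order; the function names are my own. It only shows you the numbers, not how the kernel decides what counts as iowait.

```python
def parse_cpu_times(stat_text):
    """Parse the 'cpu' and 'cpuN' lines of /proc/stat-style text into
    {name: {"user": ..., "nice": ..., "system": ..., "idle": ..., "iowait": ...}}.
    Later fields (irq, softirq, steal, ...) are ignored in this sketch."""
    fields = ("user", "nice", "system", "idle", "iowait")
    result = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue
        values = [int(v) for v in parts[1:1 + len(fields)]]
        result[parts[0]] = dict(zip(fields, values))
    return result


def iowait_fraction(before, after):
    """Fraction of the five parsed categories spent in iowait between
    two samples of the same CPU line."""
    delta = {key: after[key] - before[key] for key in after}
    total = sum(delta.values())
    return delta["iowait"] / total if total else 0.0


if __name__ == "__main__":
    import time

    with open("/proc/stat") as f:
        first = parse_cpu_times(f.read())
    time.sleep(1)
    with open("/proc/stat") as f:
        second = parse_cpu_times(f.read())
    for name in sorted(second):
        print(name, round(iowait_fraction(first[name], second[name]), 3))
```

Comparing the aggregate `cpu` line against the per-CPU `cpuN` lines while the machine is under a known IO load is one way to start guessing at the definition in use.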
(Picking the answer gets more complicated if your Unix wants iowait
to be a per-CPU thing, like user, system, and idle time often are,
because normally waiting for IO is not naturally associated with
any particular CPU. Illumos appears to not consider iowait a per-CPU
thing, per a little mention in the mpstat manpage; it does have the idea
of iowait in general, per sar(1).)