The problem of Unix iowait and multi-CPU machines

March 5, 2020

Various Unixes have had an 'iowait' statistic for a long time now (although I can't find a source for where it originated; it's not in 4.x BSD, so it may have come through System V and sar). The traditional and standard definition of iowait is that it's the amount of time the system was idle but had at least one process waiting on disk IO. Rather than count this time as 'idle' (as you would if you had a three-way division of CPU time between user, system, and idle), some Unixes evolved to count this as a new category, 'iowait'.
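As a minimal sketch of that traditional definition, here is how a single-CPU system might classify each clock tick. The function name and its inputs are hypothetical and made up for illustration; no real kernel is being quoted here.

```python
def classify_tick(running_mode, blocked_on_io):
    """Classify one clock tick on a single-CPU machine (illustrative only).

    running_mode: "user" or "system" if a process was running, None if idle.
    blocked_on_io: how many processes are waiting on disk IO (assumed input).
    """
    if running_mode is not None:
        # The CPU was doing work, so the tick goes to that category.
        return running_mode
    # The CPU was idle: it is 'iowait' only if something is waiting on disk IO.
    return "iowait" if blocked_on_io > 0 else "idle"


# Four sample ticks: (what was running, how many processes were blocked on IO)
ticks = [("user", 0), (None, 2), (None, 0), ("system", 1)]
counts = {}
for mode, blocked in ticks:
    category = classify_tick(mode, blocked)
    counts[category] = counts.get(category, 0) + 1
```

Note that the last sample tick counts as 'system' even though a process is blocked on IO; under this definition iowait is only ever carved out of what would otherwise be idle time.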

(To my surprise, iowait doesn't appear to be in the *BSDs at all; they stick to the old user, system, idle, and nice divisions of CPU time. Iowait is in Linux and Solaris/Illumos, and appears to be in HP-UX and AIX as well based on some quick manpage checks.)

This traditional definition makes easy and straightforward sense on a uniprocessor machine, where the system cannot be simultaneously idle waiting for a process to finish IO and running a process. But these days basically all systems are multi-CPU 'SMP' ones, and in a multi-CPU world it's not obvious how you should define iowait, because there's no longer a strict binary division between 'running things' and 'stopped waiting for IO'. In a multi-CPU system, some but not all CPUs can be busy running code while, at the same time, some processes are blocked on IO. If the IO for those processes completed immediately, they could run on the currently idle CPUs, yet the system is still doing some work rather than being entirely stalled waiting for IO to complete (which is what iowait means on a uniprocessor system).

There are all sorts of plausible answers a Unix could adopt for the meaning of iowait on a multi-CPU system, ranging from the simple to the complex to the ad-hoc. But no matter what a Unix does, it needs to come up with some answer (and ideally document it), and there are no guarantees that two different Unixes have picked the same answer. If you're going to use iowait for much, you might want to try to figure out how your Unix defines it on multi-CPU machines.
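To make the range of possible answers concrete, here is a rough sketch of three hypothetical policies applied to the same snapshot of a four-CPU machine. The policy names, the `waiters_by_cpu` bookkeeping (how many IO-blocked processes last ran on each CPU), and the snapshot itself are all assumptions invented for illustration; this is not a description of what any particular Unix actually does.

```python
def any_waiter(cpu_busy, waiters_by_cpu):
    """Every idle CPU counts as iowait if anything, anywhere, waits on IO."""
    waiting = sum(waiters_by_cpu) > 0
    return ["busy" if busy else ("iowait" if waiting else "idle")
            for busy in cpu_busy]


def per_cpu_waiter(cpu_busy, waiters_by_cpu):
    """An idle CPU counts as iowait only if a process that last ran on
    that CPU is blocked on IO (assumes the system tracks this)."""
    return ["busy" if busy else ("iowait" if waiters > 0 else "idle")
            for busy, waiters in zip(cpu_busy, waiters_by_cpu)]


def all_stalled(cpu_busy, waiters_by_cpu):
    """iowait only when no CPU is running anything and something waits on IO."""
    stalled = not any(cpu_busy) and sum(waiters_by_cpu) > 0
    return ["busy" if busy else ("iowait" if stalled else "idle")
            for busy in cpu_busy]


def iowait_fraction(states):
    """Fraction of CPUs reported as iowait in one snapshot."""
    return states.count("iowait") / len(states)


# Snapshot: CPU 0 is busy, CPUs 1-3 are idle, and one process that
# last ran on CPU 1 is blocked on disk IO.
cpu_busy = [True, False, False, False]
waiters_by_cpu = [0, 1, 0, 0]

results = {
    policy.__name__: iowait_fraction(policy(cpu_busy, waiters_by_cpu))
    for policy in (any_waiter, per_cpu_waiter, all_stalled)
}
```

For this one snapshot the three policies report 75%, 25%, and 0% iowait respectively, which is why the same number can mean quite different things on different systems.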

(Picking the answer gets more complicated if your Unix wants iowait to be a per-CPU thing, like user, system, and idle time often are, because normally waiting for IO is not naturally associated with any particular CPU. Illumos appears to not consider iowait a per-CPU thing, per a little mention in the mpstat manpage; it does have the idea of iowait in general, per sar(1).)

Written on 05 March 2020.
Last modified: Thu Mar 5 22:39:57 2020