Two views of CPU utilization (a realization)
The traditional way to present CPU utilization in metrics dashboards
and the like is as a percentage from 0 to 100. This is so common
and ordinary that I wrote an entry on generating this from
Prometheus CPU metrics without ever questioning
things, and the Linux version of top is sometimes mocked for
showing process CPU utilizations of over 100% because it considers
'100%' to be 'all of one CPU' on multi-CPU machines (which is to
say pretty much all of them these days). But recently it struck
me that this view of CPU utilization is only one of at least two
ways to look at it.
The customary 0% to 100% measure is really a measure of how much
of the machine you're using and how much you have left. If you're
at 75% CPU utilization, you're using three quarters of the machine and
have a quarter of it left (more or less). This is a perfectly fine
measure and often what you care about, but it's not the only measure.
Another measure is what the Linux 'top' command tells you, which
is how much CPU you're using, or to put it another way, how many
CPUs you're using. How much CPU you're using is generally going to
be a better view into how much work is being done by various things,
without having to mentally re-scale a 0% to 100% number to account
for things like how 10% of a 4-CPU machine is a lot less work being
done than 10% of a 112-CPU machine.
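To make the re-scaling concrete, here is a minimal sketch in Python of converting between the two views. The function names are my own invention for illustration, and it assumes you already know the machine's CPU count and that its CPUs are interchangeable.

```python
def cpus_used(machine_percent: float, cpu_count: int) -> float:
    """Convert a 0-100% whole-machine utilization into 'how many CPUs' are busy."""
    return machine_percent / 100.0 * cpu_count


def percent_of_machine(cpus_in_use: float, cpu_count: int) -> float:
    """Convert 'how many CPUs' are busy back into a 0-100% whole-machine figure."""
    return cpus_in_use / cpu_count * 100.0


if __name__ == "__main__":
    # The same 10% reading means very different amounts of work
    # on a small machine and on a large one.
    for count in (4, 112):
        print(f"10% of a {count}-CPU machine is {cpus_used(10, count):.1f} CPUs")
```

On the 4-CPU machine 10% works out to 0.4 of a CPU, while on the 112-CPU machine it is 11.2 CPUs, which is the difference the percentage view hides.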
Of course 'how many CPUs are we using here' isn't a perfect measure
either, unless your CPUs are uniform (ours are far from it, so 100%
of a CPU on machine A may be much less actual performance than 100%
of a CPU on machine B). But it's a starting point, just as the
0-100% of the machine view is the customary starting point for how
loaded down the machine is. Which starting point you want depends
on what questions you're interested in asking or seeing answers for.
As a pragmatic matter, people are often more worried about their
machines falling over from being overloaded than they are curious
about how much computation they're doing (and we're certainly no
exception). This makes the 0-100% CPU utilization measure a good
one to look at on a dashboard or the like, and indeed even Linux
'top' displays overall system utilization this way (even as it
displays per-process 'utilization' as how much CPU it's using).
But now that I've thought of it, I'm going to keep my mind open
about the 'how much CPU are we using' view too, and think about whether
I want to look at that at some point (and how best to visualize
it).