Process states from /proc/[pid]/stat versus /proc/stat's running and blocked numbers

We recently updated to a version of the Prometheus host agent that can
report on how many processes are in various process states. The host
agent has also long reported node_procs_running and node_procs_blocked
metrics, which ultimately come from /proc/stat's procs_running and
procs_blocked fields.

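As a rough illustration of where those two numbers come from, here is a minimal Python sketch that pulls procs_running and procs_blocked out of the text of /proc/stat. The function names are my own for illustration; this is not how the host agent itself is written.

```python
def parse_proc_stat(text):
    """Return procs_running and procs_blocked from the text of /proc/stat."""
    wanted = ("procs_running", "procs_blocked")
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Each line of interest looks like "procs_running 3".
        if len(fields) >= 2 and fields[0] in wanted:
            values[fields[0]] = int(fields[1])
    return values


def read_proc_stat(path="/proc/stat"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_proc_stat(f.read())
```

On a Linux machine, calling read_proc_stat() should return a dictionary with those two keys. Since the reading process is itself running at that moment, procs_running will normally be at least 1.
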
Naturally, I cross-compared the two different sets of numbers. To my
surprise, in our environment they could be significantly different from
each other. There turn out to be two reasons for this, one for each
/proc/stat field.

As far as procs_running goes, it was always higher than the number of
processes that Prometheus reported as being in state 'R'. This turns out
to be because Prometheus counts only processes, since it looks at what
appears in /proc, while procs_running counts all threads. When you have
a multi-threaded program, only the main process (or thread) shows up
directly in /proc and so has its /proc/[pid]/stat inspected. Depending
on how the threading in your program is constructed, this can give you
all sorts of running threads but an idle main process.

(This seems to be what happens with Go programs, including the
Prometheus host agent itself. On otherwise idle machines, the host agent
will routinely report no processes in state R but anywhere from 5 to 10
threads in procs_running. On the same machine, directly 'cat'ing
/proc/stat consistently reports one process running, presumably the cat
itself.)

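To see the process versus thread difference for yourself, one option is to count states both ways. The following is a sketch in Python, not the host agent's code: it counts states either from /proc/[pid]/stat (one entry per process) or from /proc/[pid]/task/[tid]/stat (one entry per thread). The function names and the proc_root parameter are hypothetical, and the scan is not atomic, so the counts are only approximate.

```python
import os
from collections import Counter


def parse_state(stat_line):
    """Extract the one-letter state from a /proc/[pid]/stat line.

    The second field (the command name) is wrapped in parentheses and may
    itself contain spaces or parentheses, so look after the last ')'.
    """
    return stat_line[stat_line.rindex(")") + 1:].split()[0]


def count_states(proc_root="/proc", per_thread=False):
    """Count states per process, or per thread if per_thread is true."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "stat"
        pid_dir = os.path.join(proc_root, entry)
        if per_thread:
            task_dir = os.path.join(pid_dir, "task")
            try:
                stat_paths = [os.path.join(task_dir, tid, "stat")
                              for tid in os.listdir(task_dir)]
            except OSError:
                continue  # the process exited while we were scanning
        else:
            stat_paths = [os.path.join(pid_dir, "stat")]
        for path in stat_paths:
            try:
                with open(path) as f:
                    counts[parse_state(f.read())] += 1
            except (OSError, ValueError, IndexError):
                continue  # task vanished or the line was malformed
    return counts
```

Comparing count_states()["R"] with count_states(per_thread=True)["R"] on a machine running multi-threaded programs should show the per-thread figure landing closer to procs_running. The thread doing the scan is itself running, so it will generally be counted too.
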
The difference between procs_blocked and processes in state 'D' is
partly this difference between processes and threads, but they are also
measuring slightly different things. procs_blocked counts threads that
are blocked on real disk IO (technically block IO), while the 'D'
process state is really counting processes that are in an
uninterruptible sleep (in the state TASK_UNINTERRUPTIBLE, with a caveat
about 'I' processes from my earlier entry). Most processes in state 'D'
are waiting on IO in some form, but there are other reasons processes
can wind up in this state.

In particular, processes waiting on NFS IO will be in state 'D' but not
be counted in procs_blocked. Processes waiting for NFS IO are part of
%iowait but since they are not performing actual block IO, they are not
counted in procs_blocked. You can use this to tell why processes (or
threads) are in an IO wait state; if procs_blocked is high, they are
waiting on block IO, and if they are just in state 'D', they are waiting
for something else.

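That rule of thumb can be written down as a small calculation. This is only a sketch under assumptions of my own: the 'D' count is taken per thread (so that it is comparable to procs_blocked), both numbers are sampled at about the same time, and the function name is hypothetical. Since the two samples are never truly simultaneous, treat the result as a rough hint.

```python
def split_uninterruptible(d_state_threads, procs_blocked):
    """Estimate how many 'D' threads are waiting on something other than block IO.

    d_state_threads: threads seen in state 'D' (per-thread count, an assumption)
    procs_blocked:   the procs_blocked value from /proc/stat
    """
    # Threads in 'D' beyond those blocked on block IO are presumably
    # waiting on something else, such as NFS IO. Clamp at zero because
    # the two numbers are sampled at slightly different moments.
    other = max(0, d_state_threads - procs_blocked)
    return {"block_io": procs_blocked, "other_uninterruptible": other}
```

For instance, 5 threads in state 'D' with a procs_blocked of 2 would suggest roughly 3 threads waiting on something other than block IO.
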
(I believe that anything that operates at the block IO layer will show
up in procs_blocked. I suspect that this includes iSCSI, among other
things.)

Since we make a lot of use of NFS and some machines can be waiting on either NFS or local IO (or sometimes both), I suspect that we're going to have uses for this knowledge. It definitely means that we want to show both metrics in our Grafana dashboards.