The Linux load average does mean something (although maybe not much)

July 9, 2022

One of the things you'll hear about monitoring your systems is that the load average is not really a metric that you should pay attention to, and so perhaps things like an IMAP server with an elevated load average or a login server with periodic load spikes are not worth caring about. There is something to be said for this, and I've come to think that load average is a secondary indicator, but I also think that the Linux load average can still tell you things that matter.

The first thing the Linux load average may tell you is how many tasks are waiting for IO. How long something has been waiting is tracked by the kernel's pressure stall information, but the pressure indicators don't tell you how many things are waiting; the load average will, and that might matter. However, the load average only samples this information every five seconds, so it's a relatively coarse indicator.
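(As an illustration of the difference, the pressure files under /proc/pressure report time spent stalled, not a count of waiters. A minimal sketch of reading one, in Python, assuming a kernel with PSI enabled and the documented 'some/full avg10=... avg60=... avg300=... total=...' line format:)

```python
def parse_psi_line(line):
    """Parse one line of a /proc/pressure/* file, e.g.
    'some avg10=1.53 avg60=0.82 avg300=0.40 total=123456'
    into (kind, {field: value}). 'total' is microseconds stalled;
    the avg fields are percentages of wall-clock time."""
    kind, *fields = line.split()
    values = {}
    for field in fields:
        name, _, raw = field.partition("=")
        values[name] = int(raw) if name == "total" else float(raw)
    return kind, values

# On a real system you would read the live file, for example:
# with open("/proc/pressure/io") as f:
#     for line in f:
#         print(parse_psi_line(line.strip()))
```

Note that nothing in this output is a count of waiting tasks; for that you have to fall back on the load average (or on counting task states yourself).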

The second thing the Linux load average may give you is some indication that you had a burst of transient tasks (or transiently active tasks). If you see a spike in the load average but no sign of it in other indicators, then you know that something happened and it can't have lasted very long; for a brief period, you had a lot of tasks that were either runnable or in IO wait. You're probably more likely to see something like this on a big machine with a lot of CPUs, for the simple reason that if you had fewer CPUs, tasks would have started having to wait and you'd see signs of this in other indicators (CPU utilization, CPU and IO pressure, and so on).

(As far as IO goes, remember that Linux's iowait statistic is only a lower bound on multi-CPU machines, which today is almost everything except very small virtual machines.)

Unfortunately, as I discovered, the only way to get high-resolution versions of all of the information that goes into the load average is through special interaction with cgroups (and possibly only cgroup v1). Reading /proc/loadavg will give you the instantaneous number of runnable tasks, as will /proc/stat (in 'procs_running'), but the number of tasks in uninterruptible sleep is not directly exposed anywhere. The 'procs_blocked' field of /proc/stat counts the number of tasks in IO wait, not the number in uninterruptible sleep, although perhaps the two numbers are often the same.
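(The pieces that are exposed can be pulled out like this; a minimal Python sketch, assuming the standard formats of /proc/loadavg and /proc/stat as documented in proc(5):)

```python
def parse_loadavg(text):
    """Parse /proc/loadavg, e.g. '0.52 0.58 0.59 2/1234 56789'.
    The fourth field is instantaneous runnable tasks / total tasks."""
    fields = text.split()
    running, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable_now": int(running),
        "total_tasks": int(total),
    }

def parse_proc_counts(stat_text):
    """Pull procs_running and procs_blocked out of /proc/stat text.
    procs_blocked counts tasks currently blocked waiting on IO,
    which is not quite the same as uninterruptible sleep."""
    counts = {}
    for line in stat_text.splitlines():
        if line.startswith(("procs_running", "procs_blocked")):
            name, value = line.split()
            counts[name] = int(value)
    return counts

# On a live system:
# print(parse_loadavg(open("/proc/loadavg").read()))
# print(parse_proc_counts(open("/proc/stat").read()))
```

Even with both of these, you still can't reconstruct the load average's inputs exactly, since the count of uninterruptible tasks isn't there.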

(The Linux kernel scheduler is sufficiently tangled that it's possible for the two to be basically synonymous, but there may be other commonly encountered ways to get uninterruptible but not running tasks.)


Comments on this page:

By Ben Hutchings at 2022-07-10 08:47:32:

Since Linux 4.2, it's possible to have tasks in uninterruptible sleep but not contributing to the load average (TASK_NOLOAD flag). Since 4.14, that shows up as 'I' state rather than 'D' in procfs. I think this is only supposed to be used for kernel threads.

By jonys at 2022-07-14 08:02:22:

Ben Hutchings: I've seen userspace tasks get the TASK_NOLOAD state when doing IO to a slow LustreFS mount. So it is not exclusive to kernel threads.
