2024-05-22
The Prometheus host agent's 'perf' collector can be kind of expensive
Today I looked at some system statistics for the first time in quite a while and discovered that both my office and home desktops were handling about 12,000 to 15,000 interrupts a second. On my office desktop this used about 1.5% of the overall (multi-)CPU time for IRQ handling; on my home desktop it was just over 4%. Eventually I traced this down to me having enabled the Prometheus host agent's 'perf' collector. This collector uses the Linux kernel's perf system to collect CPU information like the number of instructions and cycles, hardware cache information on various sorts of hits and misses, and some kernel information like page faults, context switches, and CPU migrations (some of which is available from other sources).
The extra interrupts were specifically coming from what /proc/interrupts calls 'LOC' and labels as 'Local timer interrupts', and were distributed basically evenly across all CPUs. The underlying cause is a mystery to me; after a certain amount of delving into the relevant host agent code and doing some hackery, the best I can tell is that enabling enough specific perf CPU information sources at once appears to trigger this. It's possible that it happens in general when enough perf information sources of any sort are enabled at once.
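If you want to check the 'LOC' rate on your own machine, a minimal Python sketch is to read /proc/interrupts twice and divide the difference by the interval. This is not how I found the problem, just one way to look at the same numbers; the function names and the five second interval are my own choices for illustration, and it assumes the usual Linux layout where the 'LOC:' line has one counter per CPU followed by a text label.

```python
import time


def parse_loc_counts(text):
    """Return the per-CPU 'LOC' counters from /proc/interrupts content."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                # The per-CPU counters come first; the text label follows.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    raise ValueError("no LOC line found")


def loc_rates(before, after, interval):
    """Per-CPU interrupts per second between two samples."""
    return [(b - a) / interval for a, b in zip(before, after)]


def read_loc(path="/proc/interrupts"):
    with open(path) as f:
        return parse_loc_counts(f.read())


def main(interval=5.0):
    # The interval is an arbitrary choice; longer intervals smooth things out.
    first = read_loc()
    time.sleep(interval)
    second = read_loc()
    rates = loc_rates(first, second, interval)
    for cpu, rate in enumerate(rates):
        print(f"CPU{cpu}: {rate:.0f} LOC interrupts/sec")
    print(f"total: {sum(rates):.0f} LOC interrupts/sec")
```

Calling main() on a Linux machine prints a per-CPU rate and a total. Comparing the total with the perf collector enabled and disabled should show whether you're seeing the same effect.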
(If you're trying to test this yourself today with the host agent, note that in its current state you can't disable any 'hardware' (CPU) or 'software' (kernel) profilers. This is probably a bug in the code. I had to build a modified version that let me do this. However, this appears to reproduce in 'perf stat' if you enable enough things at once.)
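As a rough sketch of what a 'perf stat' reproduction attempt might look like, here is a small Python wrapper that runs a system-wide 'perf stat' with a bunch of events enabled for a fixed time. The event list is purely my assumption, picked from standard perf event names to roughly match what the collector covers; I don't know which events, or how many, are needed to trigger the extra interrupts, so you may need to add or remove some. System-wide collection with '-a' typically requires root or a relaxed kernel.perf_event_paranoid setting.

```python
import subprocess

# Assumed event list for illustration only; which events (and how many)
# actually trigger the extra timer interrupts is not known.
EVENTS = [
    "cycles",
    "instructions",
    "cache-references",
    "cache-misses",
    "branch-instructions",
    "branch-misses",
    "page-faults",
    "context-switches",
    "cpu-migrations",
]


def perf_stat_command(events, seconds):
    """Build a system-wide 'perf stat' command that runs for 'seconds'."""
    return [
        "perf", "stat",
        "-a",                    # count on all CPUs
        "-e", ",".join(events),  # comma-separated event list
        "--", "sleep", str(seconds),
    ]


def run_perf_stat(events=EVENTS, seconds=10):
    """Run perf stat and return its exit status; output goes to the terminal."""
    return subprocess.run(perf_stat_command(events, seconds), check=False).returncode
```

While run_perf_stat() is running, you can watch the 'LOC' line in /proc/interrupts from another terminal to see if the interrupt rate goes up.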
The host agent's documentation does warn generically that the collectors that are disabled by default may have 'significant resource demands', so probably this isn't entirely surprising. The perf collector also isn't critical for my usage, because all I was using it for was to get the information for some 'total (CPU) instructions' and 'cycles per instruction' graphs on a personal Grafana dashboard about my desktops (I was previously collecting this information in another way).
PS: This is probably specific to some ranges of kernel versions, since I couldn't reproduce it with 'perf stat' on one of our Ubuntu machines. My desktops are both running the Fedora 40 6.8.9 kernel. That this is kernel and perhaps hardware dependent is a little bit irritating, since it means we'll have to keep an eye out for this on an ongoing basis if we ever enable the perf collector on our Ubuntu machines.
(This elaborates on some Fediverse posts.)