What's lost when running the Prometheus host agent as a non-root user on Linux

September 12, 2022

If you start up the Prometheus host agent as root, it will nag at you about this:

caller=node_exporter.go:185 level=warn msg="Node Exporter is running as root user. This exporter is designed to run as unprivileged user, root is not required."

This is not quite true, although how true it is has varied over time, across kernel versions, and with which host agent collectors you have enabled. Today, for my own reasons, I decided to get current information on what metrics you lose when you run the current version of the host agent as a non-root user, primarily on an Ubuntu 22.04 server.

(As I write this, the current version of the host agent is 1.3.1, released December 1st 2021. The host agent doesn't see much change.)

With the default collectors, it turns out that all you lose access to is the CPU frequency (which is only available for AMD processors, not x86 Intel ones) and RAPL (Running Average Power Limit) information. In non-default collectors, you also lose access to the 'perf' collector, which by default would give you metrics on various low level CPU performance statistics, such as the number of CPU branch misses and the number of instructions executed.

(On my home desktop, these perf stats reveal that apparently two of my twelve CPUs execute a vastly disproportionate number of instructions and have unbalanced numbers for various other things.)

The latest development version of the host agent also has a slabinfo collector. Since /proc/slabinfo itself is only readable by root, this collector also only works if you run the host agent as root.

In general the host agent collects most of its information through reading things from /proc and sysfs. If some source of information in them is only accessible by root, normally the host agent won't be able to get that information as a non-root user.
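One rough way to get a feel for this ahead of time is to look for files that their owner can read but everyone else can't. The following is only a sketch under some assumptions: the function name is my own invention, it only looks at permission bits (the kernel can also refuse access when a file is opened or read, which this won't notice), and it deliberately doesn't follow symlinks, which sysfs is full of.

```python
import os
import stat
import sys


def owner_only_readable(top):
    """Yield regular files under 'top' that the owner can read but 'other' can't.

    This only inspects mode bits, so it is a hint, not a guarantee.
    """
    # os.walk() does not follow symlinked directories by default;
    # unreadable directories are silently skipped.
    for dirpath, _dirnames, filenames in os.walk(top, onerror=lambda err: None):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # entries in /proc and sysfs can vanish under us
            if not stat.S_ISREG(st.st_mode):
                continue
            readable_by_owner = bool(st.st_mode & stat.S_IRUSR)
            readable_by_other = bool(st.st_mode & stat.S_IROTH)
            if readable_by_owner and not readable_by_other:
                yield path


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: pass one directory, such as some subtree of /sys.
    for found in owner_only_readable(sys.argv[1]):
        print(found)
```

Pointing this at all of /sys will take a while and produce a lot of output, most of which the host agent never looks at, so it's best aimed at a specific subtree you're curious about.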

This is fewer missing metrics than I expected. We currently run the host agent as root and we'll probably continue to do so, but if we wanted to switch we wouldn't lose very much. However, results in different environments may vary (especially with different kernels), so you probably should check it yourself.

(It's easy enough to start a copy of the host agent on an alternate port, then query localhost:<port>/metrics manually with curl to see what differs. I generally grep for the '# HELP' lines and diff the root and non-root versions.)
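If you'd rather not do the grep and diff by hand, the same comparison can be sketched in Python. This is an illustration and not what I actually ran; it assumes you have one copy of the host agent running as root and another as a non-root user, each on its own port, and the function names are made up for the example.

```python
import sys
import urllib.request


def help_names(metrics_text):
    """Return the set of metric names that have a '# HELP' line."""
    names = set()
    for line in metrics_text.splitlines():
        # A HELP line looks like: '# HELP <metric_name> <description>'
        parts = line.split(None, 3)
        if len(parts) >= 3 and parts[0] == "#" and parts[1] == "HELP":
            names.add(parts[2])
    return names


def missing_as_nonroot(root_text, nonroot_text):
    """Metric names present in the root output but absent as non-root."""
    return sorted(help_names(root_text) - help_names(nonroot_text))


def fetch_metrics(port, host="localhost"):
    """Fetch the text of /metrics from a host agent on the given port."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py <root-port> <non-root-port>
    root_text = fetch_metrics(int(sys.argv[1]))
    nonroot_text = fetch_metrics(int(sys.argv[2]))
    for name in missing_as_nonroot(root_text, nonroot_text):
        print(name)
```

Comparing only the '# HELP' names tells you which metrics disappear entirely. It won't notice a metric that is still present but has fewer labelled series as non-root, so a full diff of the output is still worth a look.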

Written on 12 September 2022.