Wandering Thoughts

2022-05-23

Systems should expose a (simple) overall health metric as well as specifics

I was asked today if our monitoring of our Prometheus setup would detect database problems. The direct answer is that it wouldn't (unless Prometheus itself went down); much like we do with Alertmanager, all we check for is that Prometheus is up. The more complicated answer is that it seems hard to do this, and one reason is that while Prometheus exposes a number of specific metrics about aspects of its time series database, it doesn't have an overall health metric for the TSDB (or for Prometheus as a whole). I think it should have one; in fact I now think that everything with specific health metrics should also have an overall health metric.

(Probably this means I need to take a second look at some of our own metric generation.)

A system having an overall health metric insulates people consuming its metrics (like us) from two issues. First, we don't have to try to hunt down all of the specific health metrics that the system exposes and sort them out from all of its other metrics. The current version of Prometheus has 129 different metrics (not counting the general metrics from Go), of which I think perhaps eight indicate some sort of TSDB problem. Except I'm not sure that all of the metrics I picked out really indicate failures, and I'm also absolutely not sure I found all of the relevant metrics. If there were an overall health metric, people like me would be insulated from mistakes here in both directions; we wouldn't miss metrics that actually indicate problems, and we wouldn't generate spurious alerts from metrics with alarming names that don't.

(In fact, doing a brute force grep check for TSDB metrics with 'error', 'fail', or 'corrupt' in their name turned up several additions to my initial eight.)
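
If you want to reproduce this sort of check, one quick way (an assumption on my part about how you'd do it) is to fish through Prometheus's own /metrics output, which is on port 9090 by default:

curl -s http://localhost:9090/metrics | grep '^prometheus_tsdb' |
  grep -E 'error|fail|corrupt'

(This only turns up metrics with alarming names, which as covered above is not the same thing as metrics that actually indicate problems.)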

Second, we don't have to worry about an update to the system adding more specific metrics for (new) errors that we should add checks on. A properly done overall health metric should include those new metrics (and also deal with any restructuring of the system and shuffling of specific metrics). Without this, restructuring error metrics and perhaps adding new ones are quietly breaking changes, because they make previously theoretically comprehensive monitoring not so comprehensive any more. At the same time, you don't want to force systems to never add or restructure specific, detailed error metrics, because those specific error metrics need to reflect the actual structure of the system and the disparate things that can go wrong.

This points to the general issue with specific metrics, which is that specific metrics reflect the internal structure of the system. To understand them, you really need to understand this internal structure, and when the internal structure changes the metrics need to change as well. It's not a great idea to ask people who just want to use your system to understand its internal structure in order to monitor its health. It's better for everyone to also give people a simple overall metric that you (implicitly) promise will always reflect everything important about the health of the system.

To be explicit, you could have overall health metrics for general subsystems, such as Prometheus's time series database. You don't have to just have an overall 'is Prometheus healthy' metric, and in some environments an honest overall health metric might alarm too often. I do think it's nice to have a global health metric, assuming you can do a sensible one.

Sidebar: The low-rent implementation of general health metrics

If your system is reasonably decent sized, it probably has some sort of logging framework that categorizes log messages by both subsystem and broad level of alarmingness. Add a hook into your logging system so that you track the last time a message was emitted for a given subsystem at a given priority level, and expose these times (with level and subsystem) as metrics. Then people like me can put together monitoring for things like 'the Prometheus TSDB has logged warnings or above within the last five minutes'.
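
As a sketch of what I mean, assuming you're in Python with the prometheus_client library (the metric name and port here are invented), the hook can be as simple as a logging handler that updates a labeled gauge:

import logging
import time
from prometheus_client import Gauge, start_http_server

# Unix timestamp of the last log message, per subsystem and level.
last_log_time = Gauge('app_last_log_message_time_seconds',
                      'When a message was last logged',
                      ['subsystem', 'level'])

class LastLogTimeHandler(logging.Handler):
    def emit(self, record):
        # record.name is the logger (ie subsystem) name and
        # record.levelname is eg 'WARNING' or 'ERROR'.
        last_log_time.labels(subsystem=record.name,
                             level=record.levelname).set(time.time())

logging.getLogger().addHandler(LastLogTimeHandler())
start_http_server(8000)    # expose the metrics for scraping

With that exposed, 'the TSDB has logged warnings or above within the last five minutes' is just a check that time() minus the relevant gauge is under 300 for the WARNING and ERROR labels.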

This is unaesthetic and probably will win you no friends among the developers if you propose it as a change, but it's simple and it works.

(If the log message format is regular, you can also implement this with an outside tool, such as mtail, that parses the program's log messages to extract the level and subsystem (and the timestamp).)

HaveGeneralHealthMetric written at 22:51:37

2022-05-17

Why I'm not all that positive on working through serial consoles

Over on Twitter I said some things about BMCs with Java-only KVM over IP. One of the suggestions was to use serial consoles with Serial over LAN. As it happens, I have very mixed feelings about this and I don't think it's attractive to us (although we do have serial consoles and a serial console server). The problem is that almost all of what we want to use KVM over IP for is very low level, and I don't think this low level stuff will generally work very well with a serial console.

There are two ways for a server serial console to work in the x86 world. The first way is for all of the various consoles of Linux and the programs that use them to be explicitly pointed to a serial port, and for them to be prepared to work through the limited features of a serial port console instead of the video console. This means that the server BIOS, the bootloader, the Linux kernel, the init system, the OS installer, the OS recovery mode, and so on and so forth all have to support configuring a serial connection and being used over serial.

(If you're lucky, many of them inherit their configuration from something else and you can just change that.)
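
To illustrate the amount of explicit pointing involved, here is roughly what just the bootloader and kernel parts look like on a GRUB based Linux system, in /etc/default/grub (the serial port and speed here are assumptions, and have to match what everything else is configured for):

GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200"
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"

The BIOS, the installer, and any recovery environment each need their own equivalent of this.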

The other way is for some low level thing in the server BIOS or the BMC to intercept video output and send a copy over the serial port, and accept serial port input as if it were console keyboard input. This is moderately straightforward for the actual BIOS (or EFI firmware) but becomes much more complicated once the BIOS hands off control to the next step (a bootloader or worse a kernel that will try to put the video console into some sort of resized framebuffer mode and then draw on it directly). Even in the BIOS, serial keyboard input is often more obscure and less well supported than the real keyboard (and you'd better hope that all parties involved agree on the serial version of, say, F11).

In theory, things like the Ubuntu server installer support working over an explicitly configured serial console (but not necessarily a BIOS redirected one). In practice I don't trust this to be well tested, especially in a wide variety of (serial) environments, simply because it's not very common. It's very likely that almost everything we'll interact with has mostly been written for and tested with the video console; the serial support is likely to be at best a second class thing and at the worst, buggy.

(As a general issue, there's also the problem that you mostly can't have both a video console and a serial console, and if we have to pick one we want a video console.)

PS: It appears that the solution to my specific problem is Supermicro's IPMIView program, which even appears to work properly on my HiDPI display (this is better than I used to be able to get).

SerialConsolesUnappealing written at 21:58:50

2022-05-08

Checking if a machine is 'up' for scripts, well, for rsync

One of the ways that we propagate our central administrative filesystem to machines is through rsync; rather than NFS-mounting the filesystems, some machines instead use rsync to pull a subset of the filesystem to their local disk. This gives these machines resilience in case our NFS fileservers aren't available. This rsync is done (in a script) from cron, and so we'll get emailed any problems or odd output. This all works great until the rsync master machine is down, at which point we can count on a blizzard of emails from all of the rsync machines complaining that they can't talk to the master.

On the one hand, we do want to hear about specific rsync problems, so we can't just throw the rsync output and its exit status away. On the other hand, it's not useful to be told at great length that the rsync master is down; we probably already know that because our monitoring and alerting system will have told us. It would be nice to get cron rsync job email only about novel problems, and for things to be silent if the script determines that the rsync master is down.

This opens two cans of worms, both of which are tractable in our specific case. The first can of worms is what it means for a machine to be up. Does it ping? Does it respond to SSH? Are its services healthy? And so on. In our specific case, the rsync is done over SSH (of course) and our monitoring and alerting system is already monitoring the SSH port. So we can say with confidence that if the SSH port on the rsync master isn't responding, our rsync isn't going to work and we're also going to get an alert about it. This means we could use a check like what I use in my 'sshup' script, using nc to see if a TCP connection to the SSH port on the rsync master succeeds.

(This approach can be adapted to check if any particular port is responding, although it has to be a port that's harmless to just connect to and then drop the connection.)
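
A minimal sketch of the check, assuming the OpenBSD-style nc that Linux distributions commonly ship (the host name and timeout are placeholders):

if ! nc -z -w 5 rsyncmaster.example.com 22; then
    # The master's SSH port isn't answering; our monitoring is already
    # alerting on that, so exit quietly instead of having rsync complain.
    exit 0
fi

If the connection check passes and the rsync still fails, we do want to hear about it, so the rest of the script carries on as before.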

The second can of worms is that for a client machine, the rsync master being down is indistinguishable from a network problem in reaching the rsync master (or the master's SSH port). What helps us here is that pretty much all of our servers are on one subnet; it would be a pretty interesting network problem that left our client server unable to talk to the rsync master but able to talk to our mail sending machine and our metrics machine (so that it looks healthy in metrics). If this was a concern, one approach is to publish a metric for whether or not the rsync was successful and then alert if any machine was unsuccessful for too long. This does require us to be collecting metrics from the machine, but we probably are.

(You're probably already alerting if you can't collect metrics from the machine.)
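
A sketch of the metric publishing approach, using node_exporter's textfile collector (the rsync arguments, the collector directory, and the metric name are all made up for illustration):

if rsync -a rsyncmaster.example.com:/admin/. /local/admin/; then
    # A real version would write to a temporary file and rename it.
    echo "rsync_last_success_time_seconds $(date +%s)" \
        > /var/lib/node_exporter/textfile/rsync.prom
fi

Then an alert on something like 'time() - rsync_last_success_time_seconds > 86400' covers not just the master being down but anything else that stops the rsync from working.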

PS: Even if we can make a TCP connection to the rsync master's SSH port, there are a bunch of general failure modes that could stop all machines from being able to pull stuff via rsync and thus cause a blizzard of complaining emails. However, for us they're vanishingly infrequent failure modes compared to the rsync master just being down, and so we could eliminate almost all of these noise emails with a simple TCP connection check.

CheckRsyncMasterIsUp written at 23:24:18

2022-05-06

Filtering Prometheus metrics with deliberately repeated labels

We have a SLURM "cluster", by which we mean a pool of servers which people can use to reserve some cores and RAM for themselves. Each compute server in the cluster (a node in SLURM terms) needs to be running the slurmd daemon in order for people to be able to use its resources. This daemon can die under some circumstances, so we added an alert to check for slurmd not being active to our Prometheus setup. However, we don't want to alert on slurmd not being active on all of our machines; on machines outside the SLURM cluster, it might be installed but not active for various reasons. Fortunately, all of our SLURM nodes follow a simple naming scheme; they're all called 'cpunodeNN', eg 'cpunode1' or 'cpunode23'. This leads to a straightforward alert rule expression, more or less (using a label for what host the metric comes from):

node_systemd_unit_state { state="active",
  name="slurmd.service",
  host=~"cpunode.*" } != 1

Recently we took a couple of our SLURM nodes out of the cluster so they could become test nodes in a new Ubuntu 22.04 based version of the cluster. As test nodes, these may not be running slurmd all of the time, so we don't want to alert about slurmd not being active on them. So we need to exclude them from the alert.

At first I started thinking about clever things to do with the regular expression for which hosts matched, because you certainly can write a regexp that will match all one and two digit numbers except for, say, 9 and 23 (ie, cpunode9 and cpunode23). Then I realized there was a simpler way. I could add a requirement that the host label not be one of those two hosts, through a new label match on host. Like this:

node_systemd_unit_state { ...,
  host=~"cpunode.*", host!~"cpunode9|cpunode23" } != 1

When you repeat a label like this, you require the label to pass both match conditions. Here, our host label must be both a 'cpunodeNN' name and not cpunode9 or cpunode23. This is exactly what we want and puts the excluded hosts right into the alert rule alongside the matched hosts, rather than (say) in our Alertmanager configuration.
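
For context, in a full Prometheus alerting rule this expression just sits in the expr field; here is roughly what that could look like (the alert name, 'for' duration, and annotation are invented for illustration):

groups:
  - name: slurm
    rules:
      - alert: SlurmdNotActive
        expr: |
          node_systemd_unit_state { state="active", name="slurmd.service",
            host=~"cpunode.*", host!~"cpunode9|cpunode23" } != 1
        for: 10m
        annotations:
          summary: "slurmd is not active on {{ $labels.host }}"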

Using the same label name in multiple match conditions in a time series selector feels odd and it's certainly unusual. But there's no rule against it in PromQL and it fits into the general Prometheus data model, where your label matchers are just filtering the time series (starting with the name). In fact repeating labels this way is specifically allowed:

Label matchers that match empty label values also select all time series that do not have the specific label set at all. It is possible to have multiple matchers for the same label name.

(Emphasis mine.)

However, this technique of repeated matches of the same label has a limitation; it only works if you can exclude based on a single label. If you need to exclude based on the combination of labels (say 'network interface B on host A', where host A has several network interfaces and a network interface with that name is on several hosts), you have a more difficult challenge. See this entry's sidebar for some notes on this.

These days, PromQL is supported in projects other than Prometheus, often because they want to interoperate with Prometheus users or Prometheus related tools (see an overview of PromQL compliance test results). I don't know if all of these projects support multiple matchers for the same label name (it doesn't appear to be in the current compliance test suite), so if this is relevant to you, you might want to test it yourself.

(I consider this an issue worth thinking about for other PromQL implementations because having multiple matchers for the same label potentially affects your internal data structures and matching code.)

PrometheusRepeatLabelFiltering written at 22:01:05

2022-05-05

When you install systems semi-manually, when updates get done matters

One of our little irritations with the modern Ubuntu server installer is that after it has installed your base system from the packages on the ISO image, it always installs at least security updates (unless it has no network connection to the outside world, which it probably does have). A way of turning this off has been a long standing request for the new installer, and when the issue is raised one reaction I've seen is to ask why you wouldn't want to install all the available (security) updates. Well, I have an answer for that.

We absolutely do want to install updates. But we don't want to install them from the Ubuntu installer, because our use of the Ubuntu installer is only partially automated and thus requires us to sit at the server console (or visit it periodically) until it's done. Our goal is to get this forced console presence portion over with as fast as possible. The faster the system gets booted and on the network, the sooner we can go back to our desks, log in remotely, and get on with the rest of the install as well as our other work. The more work the installer forces to happen during this process, the more irritating it is. We want to postpone everything possible to the system's first boot so we get there as fast as possible.

(This would be different if we were rich enough to buy servers with dedicated BMC/IPMI network ports and KVM over IP licenses, or if the Ubuntu installer let you mirror the installation over a SSH connection so you could start it on the server console then finish it from your desk.)

The reality is that "pre-boot" system installers are a very special environment that suffers from unusual limitations. They are limited and awkward, and often constrain how (and where) you can interact with them. Generally they operate strictly sequentially, even if some of what you want to do could be done in parallel. As a result of this, these installers should offer a way to let them get their work done and get out of the way as fast as possible (ie, booting into the installed system), and this means at least providing the option to do a minimum of work.

InstallTimeUpgradesTiming written at 22:51:56

2022-05-02

Monitoring is too hard, as illustrated by TLS certificates expiring

I tweeted:

Grumpy thesis: monitoring TLS certificate expiry is too hard (evidence: good people keep having certs expire on them). Why don't web servers ship with routine cron jobs that email you when any actively used TLS certificate is N days or less from expiring, for example?
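
As a sketch of the kind of cron job I have in mind (the certificate directory and the 14 day threshold are stand-ins), it doesn't have to be much more than:

#!/bin/sh
# Cron mails any output to root, so only print for certificates that
# will expire within the next 14 days.
for cert in /etc/ssl/oursite/*.pem; do
    if ! openssl x509 -in "$cert" -noout -checkend $((14*86400)) >/dev/null; then
        echo "WARNING: $cert expires within 14 days"
    fi
done

The hard part of doing this well is knowing which certificates are actually in active use, which is exactly the information the web server has and a generic script doesn't.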

Having a TLS certificate for a public web server unexpectedly expire on you is practically a rite of passage for a system administration team. And I'm not here to throw stones, because while we have a reasonably good system for monitoring our TLS certificates, it's critically reliant on us remembering to add monitoring for the actual TLS website. When the TLS website is a standalone web server, that's fairly easy (because we know we want to check if the site is actually up), but when it's yet another virtual host on our central web server, it's also easy for it to drop through the cracks because we know we're already monitoring the web server as a whole.

As a general rule, when people keep doing something wrong, they're actually right and your system is wrong. Put another way, "if your system depends on humans never making errors, you have a systems problem". If it takes extra steps and extra attention to add monitoring, people will keep forgetting to do so and then they will get burned by it. TLS certificates are an obvious case, but there are lots of other ones. How many systems ship with default monitoring that tries to let you know if the local disk space is getting alarmingly low, for example?

Today, you have to spend a great deal of time and effort to build out a monitoring system for your systems. Once you have built that system (as we have with our Prometheus setup), the incremental monitoring for a new system is easy and it's alarmingly easy to feel smug about your successes and other people's failures. But we're standing on a mountain, and it's a mountain that not everyone has either the time or the expertise to climb.

Of course building systems to monitor themselves by default is not an easy job. However, we've already done some of it (and come to accept it as essentially required for a good quality implementation); for example, Linux systems these days often default to sending email if issues show up in your software RAID arrays or disk SMART attributes. We could do more, especially since there's a lot of obvious low hanging fruit.

It would be nice if in the future 'default monitored' was like 'default secure' is becoming today. You could change it or replace it, but at least things would start out in a good place.

MonitoringTooHard written at 21:18:17

2022-04-09

Understanding the effects of PAM module results ('controls' in PAM jargon)

To simplify, PAM is the standard system on Linux (and often other Unixes) for configuring how applications such as SSH handle various aspects of authenticating and logging people in. PAM splits this work up into a collection of PAM modules (sometimes 'PAMs', since PAM is theoretically short for 'Pluggable Authentication Modules'), which are then configured in per application (or 'service') stacks. Each individual PAM module can succeed, fail, or signal other conditions, and the PAM stack for a given application tells you not only what order PAM modules should be checked in, but what the effects of their results are. In the jargon of pam.conf, this is the control of a PAM module.

There are two forms for these controls; the 'historical' and still very common single-word form that uses things like 'required' and 'sufficient', and the newer, more detailed and complicated syntax. There are two descriptions of the meanings of the historical single-word forms in pam.conf; the somewhat informal main description, and then the potentially more explicit version that restates them in terms of the newer and more detailed syntax. All of this makes for a system that can be hard to understand and follow.

(In the following I'm going to assume that you've read through the pam.conf manual page to somewhat understand the various controls, or that you know PAM well enough to not need to.)
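
For reference, pam.conf describes the historical single-word controls as roughly equivalent to the following in the newer detailed syntax (check your system's manual page for the authoritative version):

required    [success=ok new_authtok_reqd=ok ignore=ignore default=bad]
requisite   [success=ok new_authtok_reqd=ok ignore=ignore default=die]
sufficient  [success=done new_authtok_reqd=done default=ignore]
optional    [success=ok new_authtok_reqd=ok default=ignore]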

Taken from CentOS 7, here is one example of the standard 'auth' stack:

auth required    pam_env.so
auth sufficient  pam_unix.so nullok try_first_pass
auth requisite   pam_succeed_if.so uid >= 1000 quiet_success
auth required    pam_deny.so

Let's walk through this and try to understand its aggregate effects.

  1. Pam_env sets environment variables; since it's required, if it fails for some reason the authentication process will appear to continue, but at the end it will be considered to fail.

    Pam_env normally succeeds; as far as I know it fails only in exceptional circumstances.

  2. Pam_unix verifies Unix passwords. It's 'sufficient', so if it succeeds the authentication is done. However, if it fails, further modules in the auth stack continue to be checked, and I believe that later modules could accept the authentication.

  3. Pam_succeed_if will succeed if the UID is 1,000 or higher and fail otherwise. Since it's set as 'requisite', if it fails (the UID is under 1,000), authentication immediately fails, but if the UID is 1,000 or higher, the stack continues to check further modules.

  4. Pam_deny does what you'd think; it always says no. Since it's 'required', this failure is not immediately fatal, but since it's the last module listed, its failure will cause the entire stack to fail and authentication to not be successful.

Under normal circumstances, if you enter a correct password the stack will stop at pam_unix and consider you authenticated. If you enter an incorrect password, you will eventually fail in the rest of the stack, but possibly with different error messages depending on your UID. If your UID is under 1,000, the failure will be caused by pam_succeed_if; if your UID is 1,000 or over, it will be caused by pam_deny. This two-sided failure is somewhat confusing; it's not clear what CentOS 7 is up to.

The Ubuntu 20.04 stack is as confusing in its own way, although it has comments. Here it is:

auth [success=1 default=ignore]  pam_unix.so nullok_secure
auth requisite   pam_deny.so
auth required    pam_permit.so
auth optional    pam_cap.so

  1. Pam_unix with the 'success=1' syntax will skip the next module on success and otherwise do nothing.

  2. Pam_deny with 'requisite' will immediately fail the authentication if it's processed, but it will only be processed if pam_unix did not succeed, ie if you got the password wrong.

  3. Pam_permit with 'required' apparently is present simply to, to quote the comments in the file:

    prime the stack with a positive return value if there isn't one already; this avoids us returning an error just because nothing sets a success code since the modules above will each just jump around

    If it fails for some inexplicable reason, as a 'required' module it will make the authentication fail. But we only got to it if we skipped pam_deny because you entered the right Unix password.

  4. Pam_cap "sets the current process' inheritable capabilities", to quote its manual page. Since it's 'optional', its result appears not to matter, but the stack will have been set to succeed by pam_permit if we got here.

The first two PAM modules here and their setup seem to have the effect of failing the authentication if you enter a bad Unix password and otherwise letting the rest of the stack go on, but I'm not sure why Ubuntu needs two modules where making pam_unix a 'requisite' module seems like it should work.

(The one difference that jumps out to me is that this way, if the pam_unix module returns 'new_authtok_reqd', meaning 'the user needs to change their password', the Ubuntu stack will fail and a 'requisite' would make things succeed. See eg here. In the same situation, I believe the CentOS 7 authentication stack will succeed.)

There's an entire additional level of complexity in real world PAM usage (and in understanding the effects of changes to your system's PAM stuff), but that's a topic for another entry.

PAMModuleResultsEffects written at 22:16:55

2022-04-08

On the ordering of password and MFA challenges during login

Suppose that you have a system where to log in or authenticate, people must both have a password and pass a MFA challenge. In some such systems, you can choose whether the MFA challenge or the password comes first (for example, this is generally the case with SSH because both passwords and MFA are usually done in PAM). When this came up for us in our move from Yubikeys to MFA, I thought about it a bit and decided that passwords should come first.

SSH logins are a somewhat unusual environment because the server starts out knowing the login name. In environments such as web single sign on, you often don't start out knowing the login name, so it's natural to gather the login and the password together, then go to MFA. If your MFA is selective, it's possible that you need to authenticate the user before you have information available on whether MFA is even on; your MFA environment may not handle being sent a non-MFA user to check. So you may be forced to verify passwords first by the requirements of the software or system.

But if you do get to choose, my view is that you should put passwords first (and then stop if password authentication fails). The problem with putting MFA first is that you're providing attackers with the ability to spam your users with MFA challenges delivered to their devices, and possibly setting off rate limits and 'this login should be temporarily disabled' things in your MFA system (or your MFA provider, if you've outsourced this). In some environments, this is not even an actual determined attacker but instead some automated script that's probing accessible systems on the Internet with a canned list of login names. Needless to say, you don't want to give random people on the Internet the ability to spam your people with MFA challenges.

The argument for doing it the other way around is to block password guessing attacks. If you check passwords first, an attacker can determine if they have the right password by whether or not they get an MFA challenge; once they have the password right, they can fire up their MFA attack for only a single authentication run. If you force people to first pass the MFA challenge, life is more difficult for the attacker who wants to attack or verify passwords.

For systems that are generally visible and generally open to probing, I come down strongly on the side of passwords first because of the issue of MFA spam. For internal systems that already have very limited access, I guess you can make an argument for MFA first. But if they have limited access and low usage, perhaps you should leave passwords first and have alarms trigger if there's any probing of them worth mentioning.

MFAAndPasswordOrdering written at 22:33:37

2022-04-03

Some notes on using snmpwalk to poke at devices with SNMP

Suppose, not hypothetically, that you're exploring what some device exposes over SNMP. You're probably using snmpwalk (or perhaps snmpbulkwalk). Snmpwalk is probably completely clear to people who use it all of the time, but it's not clear to people like me who touch it only once in a while. So here are some notes from my recent experiences (for future me, among others), assuming general background in reading SNMP.

To start with, many of the interesting snmpwalk options are covered in the snmpcmd manual page because they're common across all of the 'snmp*' commands. Generally you'll want to specify some SNMP version (often '-v2c', but some old devices may require '-v1') and the public community ('-c public'). If you have collected vendor MIBs for your device and put them in a directory, you'll want to add that directory to the search list and then load some or all MIB modules:

snmpwalk -M +/tmp/mibs -m +VENDOR-MIB1:VMIB2 -v2c -c public ....

Often '-m ALL' is what you want, which tells snmpwalk to load all of the MIBs it can find. Generally you don't want to use -M without a + or a -, because vendor MIBs will try to import things from standard MIBs and fail if the standard MIB search path has been scrubbed away (ask me how I know).

As the snmpwalk manual page says, if you don't give it an OID to start from it starts from SNMPv2-SMI::mib-2, 1.3.6.1.2.1, and only looks at things underneath that. If you're exploring a device's SNMP information, this is probably not what you want. If you want to see everything, start from OID 1:

snmpwalk [...] example.com .1

(The leading '.' forces snmpwalk to consider this a fully qualified OID and start from the root.)

If you're pretty certain that what you're looking for is in the proper vendor-specific OID space, you can start from there:

snmpwalk [...] example.com .1.3.6.1.4.1

You can further restrict this to a specific vendor OID, eg 1.3.6.1.4.1.21317, and this can be useful in various situations (there are some standard things under this 'vendor' OID prefix, such as a standard set of Net-SNMP MIBs). If you know the root identifier name of a MIB you're interested in, you can also specify where to start from as '<MIB>::<root>', eg 'ATEN-IPMI-MIB::aten'; however, this may require reading the MIB to find the identifier name.

(If you're a keen person you can remember that 1.3.6.1.4.1 is also known as 'SNMPv2-SMI::enterprises', and actually it can be shortened to 'enterprises'.)
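
Putting the pieces together, walking one specific vendor MIB by its root identifier looks something like this (reusing the example MIB directory, MIB name, and host from above):

snmpwalk -M +/tmp/mibs -m +ATEN-IPMI-MIB -v2c -c public \
  example.com ATEN-IPMI-MIB::aten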

If you're lucky, you'll have a vendor MIB and it will be accurate. If you're not, you're trying to explore what information is there and you'll probably start wanting some snmpwalk -O output options. One useful one is '-Oa', which gets snmpwalk to display more things as strings (there's also '-OT', but really why bother). If you want to see what OIDs are being found under some level, you may want to de-clutter the output by forcing all numeric OIDs with '-On'.

As far as I know, there's no option to snmpwalk to tell it to exclude output for known MIBs and only give you unknown OIDs, which would make it convenient to fish for non-standard areas of the device's SNMP OID tree. The best you can do is snmpwalk things and fish for appropriate patterns, like:

snmpwalk -v2c -c public example.com enterprises |
  fgrep SNMPv2-SMI::enterprises.

It turns out that another way to find the specific vendor OID root is to query for the special OID SNMPv2-MIB::sysObjectID:

; snmpwalk -v2c -c public example.com sysObjectID
SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.21317

(This special OID is 1.3.6.1.2.1.1.2 if you need it in OID form, which you may do on some systems where installing SNMP tools doesn't also pull in standard MIBs for you. Thanks, Ubuntu.)

Note that this may not give you all of what the vendor stashes away in SNMP. On one of our Dells that I checked, this reports only about enterprises.674.10892.5, when there is other stuff at enterprises.674.10892.2 as well.

SnmpwalkNotes written at 22:15:07

2022-04-01

Some notes on finding and reading information over SNMP

Suppose, not hypothetically, that you have some servers of interest which provide interesting information through their BMC's web server that they don't expose through IPMI in any non-proprietary way. But their BMC does support SNMP, the Simple Network Management Protocol, and perhaps they expose the information in SNMP. You would like to peer through the information the BMC exposes over SNMP, which requires actually understanding a bit of SNMP. Since I went through this today, here are some notes (for a future me if no one else).

SNMP is of course not simple. The information you get over SNMP is organized into a tree of OIDs, which are dot-separated decimal identifiers that look like 1.3.6.1.4.1 (except they get much, much longer). What a given area of the OID tree actually means is described by MIBs, and generally you want to have the relevant MIBs in order to understand what you're seeing. Some MIBs are standard and probably come with your SNMP tools, but a lot of interesting information exposed through SNMP is vendor specific and so you need to get the vendor MIBs. In theory the vendor websites should be the best place to get these. In practice, vendors make this a giant pain, so everyone goes to various sites that collect the MIBs, index them, archive them when vendors remove them from their website, and let you download them in an easy and uniform way.

(For example, 1, 2, 3, 4, 5, 6, 7. The way you find these sites is to do a web search on some interesting OID.)

A very important OID tree that you will see a lot of starts at "1.3.6.1.4.1.". This is the normal SNMP 'private enterprise' top OID, and vendors are assigned unique numbers under this. If you're looking for vendor specific SNMP information, it's very often found somewhere under here under the numeric identifier for the vendor, such as "1.3.6.1.4.1.21317". A hypothetically complete list of vendor numbers is maintained by IANA in enterprise-numbers. Searching this list will show you, for example, that 21317 is ATEN International. When you're specifically dealing with BMCs, the BMC is often made by someone other than the main system maker. This can also be used to determine who makes your BMC if you're not sure. If you use SNMP tools to scan under 1.3.6.1.4.1 and entries appear under .21317., you can be pretty sure you have an ATEN BMC regardless of whose name appears on the server case.

Sometimes you may wind up reading MIBs to see what's in them, and in these circumstances you may want to have a simple way to map between declarations in the MIB and OIDs, so you can go poke your device to see if it has useful OIDs. In the MIB format I've seen so far, OIDs are declared relative to other things; for example:

aten MODULE-IDENTITY
  [....]
  ::= { enterprises 21317 }

ipmi      OBJECT IDENTIFIER ::= { aten 1 }
[...]
powerinfo OBJECT IDENTIFIER ::= { ipmi 14 }

The 'enterprises' OID is imported earlier in the file, so we can see that the root of the OID tree here is 1.3.6.1.4.1.21317, the ATEN vendor OID. Within that, .1 is information from the IPMI, and .1.14 is power information. If we query our BMC for 1.3.6.1.4.1.21317.1.14 and get nothing, we can sadly conclude that while the ATEN MIB holds out the possibility of this information being available through SNMP, it does not actually appear in our BMC's SNMP information, for whatever reason.
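
The query itself is just an snmpwalk of that OID against the BMC, with the same sort of options as in my earlier snmpwalk notes (the host name here is a placeholder):

snmpwalk -v2c -c public bmc.example.com .1.3.6.1.4.1.21317.1.14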

On top of this sadness, even having a MIB doesn't tell us how to interpret the values that we read out via SNMP. For instance, the Supermicro ATEN IPMI MIB will tell us that there is a PSU "Input Voltage", but the only thing it tells us about how the voltage is reported is:

inputVoltage OBJECT-TYPE
  SYNTAX   OCTET STRING (SIZE(16))

What those 16 bytes mean is opaque from this information alone. Similarly, there are PSU temperatures in that MIB, but all they say about what units is:

temperature1 OBJECT-TYPE
  SYNTAX   Integer32 (1..256)

This could be degrees Celsius or tenths of degrees Celsius or something else entirely. You can read fun stories of people trying to reverse engineer SNMP conversion factors for motherboard voltages, for example.

(While MIBs can have textual descriptions of what each OID may mean, don't count on them to be useful. A significant portion of the sensor OID descriptions in this ATEN MIB I'm looking at appear to have been copied from OID descriptions for network interfaces. That no one noticed or fixed it tells you some things about how MIBs are used and how much I expect vendors care about them.)

PS: In this specific case, we have another machine that does appear to expose this information, so I can say that the input voltage appears to be the voltage in ASCII and the temperature seems to be degrees C. Exposing readings as ASCII seems to be common in SNMP; we have other devices that do it.

SNMPReadingNotes written at 22:28:45
