Vendors put varied and peculiar things in system DMI information

April 11, 2021

Recently I read 'A Ceph war story', an alarming story where the day is saved in part by having a detailed hardware inventory that included things like firmware information. This inspired me to think about how we could collect similar information for our modest fleet of Ubuntu servers. That led me to dmidecode, which reports information about your system's hardware as your BIOS describes it according to the SMBIOS/DMI standard(s). When I actually started looking at DMI information for our systems, a number of interesting things started showing up.

The DMI table that dmidecode displays is made up of many records of a number of different types. For hardware inventories, the most useful types to look at seem to be 'BIOS', 'System information', 'Baseboard' (ie, motherboard), 'Chassis', 'Processor', and 'Memory Module'. The information in each record varies, but many of them have serial numbers, product names, and manufacturers. Processors and memory modules are generally the most verbose, with lots of additional data that you (we) probably want to note down.
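As a rough illustration of that record structure, here is a minimal Python sketch that splits dmidecode's text output into records and keeps the inventory-relevant types. It assumes the usual output layout (a 'Handle ...' line, a record name line, then tab-indented 'Key: Value' lines). The function names are made up for this example, and the type numbers are the standard SMBIOS ones, with type 17 ('Memory Device') being where current systems describe memory modules.

```python
import re

HANDLE_RE = re.compile(r"^Handle (0x[0-9A-Fa-f]+), DMI type (\d+), \d+ bytes")

# SMBIOS type numbers for the record types useful in an inventory.
INVENTORY_TYPES = {
    0: "BIOS",
    1: "System information",
    2: "Baseboard",
    3: "Chassis",
    4: "Processor",
    17: "Memory Device",
}


def parse_dmidecode(text):
    """Split dmidecode text output into a list of record dicts."""
    records = []
    for block in re.split(r"\n\s*\n", text):
        lines = block.strip("\n").splitlines()
        if len(lines) < 2:
            continue
        match = HANDLE_RE.match(lines[0])
        if not match:
            continue  # preamble such as the version banner
        fields = {}
        for line in lines[2:]:
            # Keep only top-level "Key: Value" lines (one leading tab).
            # Deeper-indented lines are list items under a key.
            if line.startswith("\t") and not line.startswith("\t\t") and ": " in line:
                key, value = line.strip().split(": ", 1)
                fields[key] = value
        records.append({
            "handle": match.group(1),
            "type": int(match.group(2)),
            "name": lines[1].strip(),
            "fields": fields,
        })
    return records


def inventory_records(records):
    """Keep only the record types that matter for a hardware inventory."""
    return [rec for rec in records if rec["type"] in INVENTORY_TYPES]
```

On a real machine the text would come from running dmidecode as root, for instance with `subprocess.run(["dmidecode"], capture_output=True, text=True).stdout`.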

In theory DMI is available on most any x86 system. In practice it works best on standard servers from major manufacturers, because DMI is in the BIOS but theoretically contains information about the chassis the motherboard is in and the overall system. If you build a system yourself or a reseller builds it for you from vendor parts (such as our Linux fileservers), information on the chassis and the overall system is unknown to the BIOS and so it will either leave things blank or stick random things there. This and other factors lead to a wide variety of interesting things showing up in DMI data.

It will likely not surprise you that vendors have come up with a huge variety of ways of not having serial numbers, even in the same BIOS (one uses 'System Serial Number' and 'Default string' in different DMI records, for example). Merely being a number is no guarantee that it's a real serial number; there are serial numbers in our fleet (in various records) of 0123456789, 1234567890, 123456789, and 00000000, in addition to things like 'XXXXXXXX'. Some of these also show up in the 'version' field that many records have.

(Nor are serial numbers confined to numbers and hex digits. We have some that have /s, like paths, and quite a few with dots at the start and the end.)
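One way to flag such non-serial-numbers when building an inventory is a small heuristic check. The following Python sketch is only an assumption about what a reasonable filter might look like, built from the examples above; the function names and the minimum run length of four digits are made up, and a real fleet will almost certainly need more cases.

```python
# Placeholder strings seen in serial number fields (compared case-insensitively).
KNOWN_PLACEHOLDERS = {"system serial number", "default string"}


def is_sequential_digits(text):
    """True for runs like 0123456789 or 1234567890 (each digit is previous + 1, wrapping 9 to 0)."""
    if len(text) < 4 or not (text.isascii() and text.isdigit()):
        return False
    return all((int(a) + 1) % 10 == int(b) for a, b in zip(text, text[1:]))


def looks_bogus_serial(value):
    """Heuristically decide whether a DMI 'serial number' is just filler."""
    text = value.strip()
    if not text:
        return True
    if text.lower() in KNOWN_PLACEHOLDERS:
        return True
    if len(set(text)) == 1:
        return True  # 00000000, XXXXXXXX and similar
    return is_sequential_digits(text)
```

Serial numbers with slashes or leading and trailing dots pass this check untouched, which is what you want, since those turn out to be real.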

In other places, some BIOS vendors have left helpful instructions for their OEM partners in field values such as 'To be filled by O.E.M.'. If you guessed that the OEM did not fill in this information, you would be correct. There are also plenty of fields with generic contents like 'Chassis Manufacture' (sic) and 'System Product Name', plus obvious things like 'Not Specified' and 'NONE'.
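For inventory purposes it can be convenient to turn this sort of filler into an explicit 'no data' marker. Here is a minimal Python sketch, assuming each record's fields have already been collected into a dict of strings; the filler list comes only from the examples above and the function names are hypothetical.

```python
# Filler values from the examples above, stored lowercased for comparison.
FILLER_VALUES = {
    "to be filled by o.e.m.",
    "chassis manufacture",
    "system product name",
    "not specified",
    "none",
}


def clean_value(value):
    """Return the stripped value, or None if it is empty or known filler."""
    text = value.strip()
    if not text or text.casefold() in FILLER_VALUES:
        return None
    return text


def clean_record(fields):
    """Apply clean_value to every field of one DMI record."""
    return {key: clean_value(val) for key, val in fields.items()}
```

Mapping filler to None rather than dropping the field keeps it visible that the BIOS did report something, just nothing useful.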

The DMI processor information has turned out to be interesting to handle because some vendors throw in extra processors that aren't there. This caused me to start capturing the 'Status' of each processor, which turned up other interesting cases. One machine reports that most of its processors are 'Populated, Idle' instead of 'Populated, Enabled' (despite them being in use at the time), and one machine claims that all of its processors are 'Populated, Disabled By BIOS'.
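A sketch of how the 'Status' field might be used, in Python. It assumes processor records are dicts of field names to strings, and that sockets with no processor report a status of 'Unpopulated' (which is what dmidecode prints for empty sockets); the function names are invented for illustration. Given the machines above, it deliberately does not insist on 'Populated, Enabled'.

```python
from collections import Counter


def populated_processors(processors):
    """Keep processor records whose Status begins with 'Populated'.

    'Idle' and 'Disabled By BIOS' processors are kept, since some
    machines report those states for processors that are in use.
    """
    return [p for p in processors if p.get("Status", "").startswith("Populated")]


def status_summary(processors):
    """Count how many processor records report each Status value."""
    return Counter(p.get("Status", "(missing)") for p in processors)
```

Recording the summary alongside the filtered list makes the odd machines easy to spot later.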

Memory modules are the things that seem to most often have real serial numbers, but even there it's not universal. Some modules don't have serial numbers, some have clearly bogus ones, and some have real-looking ones that are duplicated between modules on the same system. This last case could be a BIOS issue, because obviously the BIOS is building part of the DMI table dynamically.
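Duplicated serial numbers within one system are straightforward to detect once the memory records are in hand. A minimal Python sketch, assuming each module is a dict with the 'Serial Number' and 'Locator' fields that dmidecode prints for memory devices; the function name is hypothetical.

```python
from collections import defaultdict


def duplicate_serials(modules):
    """Map each serial number seen more than once to the slots reporting it."""
    by_serial = defaultdict(list)
    for mod in modules:
        serial = mod.get("Serial Number", "").strip()
        if serial:  # modules with no serial at all are a separate problem
            by_serial[serial].append(mod.get("Locator", "?"))
    return {serial: slots for serial, slots in by_serial.items() if len(slots) > 1}
```

This only finds duplicates, so clearly bogus serial numbers that happen to be unique in the system still need a separate check.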

I wish that vendors would leave fields blank when they have no information, but I suspect that vendors have tried that and found it causes problems. For example, Supermicro is generally well regarded and they're one of the vendors that use obviously bad, all-numeric serial numbers in some records. I suspect that they have a reason for that.

(This sort of elaborates on some tweets.)

Written on 11 April 2021.

Last modified: Sun Apr 11 00:54:34 2021