Wandering Thoughts archives

2021-10-31

Linux puts a bunch of DMI information into sysfs for you

The traditional way I've looked at DMI and SMBIOS information is through dmidecode (eg to check RAM DIMM information). I've done this interactively, and recently when I was inspired to automatically collect this information and dump it into our metrics system, I also used dmidecode as the core of the script (and in the process, I discovered that vendors put peculiar things into there). However, it turns out that for core DMI information you don't actually need dmidecode these days, because the Linux kernel puts it into sysfs.

In sysfs, DMI information is found at /sys/class/dmi/id, which is a symlink to /sys/devices/virtual/dmi/id. This will commonly expose the DMI 'BIOS', 'Base Board', 'Chassis', and 'System Information' sections as bios_*, board_*, chassis_*, and product_*. The sys_vendor file is the Vendor field from the DMI 'System Information' section. There's also a modalias file that summarizes much of this if you want to have it all in one spot.
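
As a minimal sketch of pulling this into a script, the following Python reads every regular file under the directory into a dictionary. The function name read_dmi_id and the choice to silently skip unreadable files are assumptions made for illustration, not anything the kernel or dmidecode provides.

```python
from pathlib import Path

def read_dmi_id(base="/sys/class/dmi/id"):
    """Return {filename: contents} for the readable DMI files under base."""
    info = {}
    root = Path(base)
    if not root.is_dir():
        # Nothing exposed here (not Linux, or no DMI support).
        return info
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue  # skip subdirectories such as 'power'
        try:
            info[entry.name] = entry.read_text(errors="replace").strip()
        except OSError:
            # Some files are readable only by root; skip those.
            continue
    return info
```

Calling read_dmi_id() with no arguments uses the standard sysfs location; passing another directory is mostly useful for testing.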

(The various bits of modalias are prefixed with magic identifiers that you can find listed in drivers/firmware/dmi-id.c. The actual mapping of DMI information to Linux's internal kernel names for them is handled in drivers/firmware/dmi_scan.c, in dmi_decode().)
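
To illustrate how those prefixed bits could be taken apart, here is a hedged Python sketch. The prefix table is an assumption based on a reading of drivers/firmware/dmi-id.c and may be incomplete for your kernel version (newer kernels can add more prefixes); the function name parse_dmi_modalias and the output key names are hypothetical. Anything with an unrecognized prefix is returned separately rather than guessed at.

```python
# Assumed prefix table; check it against your kernel's dmi-id.c.
MODALIAS_PREFIXES = {
    "bvn": "bios_vendor",
    "bvr": "bios_version",
    "bd": "bios_date",
    "svn": "sys_vendor",
    "pn": "product_name",
    "pvr": "product_version",
    "rvn": "board_vendor",
    "rn": "board_name",
    "rvr": "board_version",
    "cvn": "chassis_vendor",
    "ct": "chassis_type",
    "cvr": "chassis_version",
}

def parse_dmi_modalias(modalias):
    """Split a 'dmi:bvn...:bvr...:' string into ({name: value}, unparsed)."""
    fields = {}
    unparsed = []
    body = modalias.strip()
    if body.startswith("dmi:"):
        body = body[len("dmi:"):]
    # Try longer prefixes before shorter ones, to be safe.
    prefixes = sorted(MODALIAS_PREFIXES, key=len, reverse=True)
    for part in body.split(":"):
        if not part:
            continue  # the string normally ends with a trailing ':'
        for prefix in prefixes:
            if part.startswith(prefix):
                fields[MODALIAS_PREFIXES[prefix]] = part[len(prefix):]
                break
        else:
            unparsed.append(part)  # prefix not in the assumed table
    return fields, unparsed
```

In practice, reading the individual files is simpler than parsing modalias; this is mostly useful if all you have is the modalias string.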

As you can see, this sysfs information doesn't capture everything that DMI can tell you about your system; for instance, it won't capture information on RAM DIMMs. If you want that sort of data, you're probably going to need to keep using dmidecode.

(Sysfs also has raw DMI information in /sys/firmware/dmi, but I haven't looked into what's there and how you would extract things from it. My view is that you might as well use dmidecode unless you have a good reason not to.)

One potential advantage or drawback of /sys/class/dmi/id is that much of the information in it is accessible without special permissions, while dmidecode requires you to be root. However, serial number and product UUID information is only accessible by root, so even if you're using /sys/class/dmi/id for reporting, you need to run as root if you want to capture that information. Interestingly, the board_asset_tag and chassis_asset_tag files are world readable, but in practice neither has useful content on most of our systems (although a couple do have specific identifying information in the chassis asset tag).
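
One way to see which files fall into which group without actually being root is to look at the permission bits. This is a sketch under the assumption that the restricted files simply lack the world-read bit; the function name split_by_readability is hypothetical.

```python
import os
import stat

def split_by_readability(base="/sys/class/dmi/id"):
    """Return (world_readable, restricted) lists of file names under base."""
    world, restricted = [], []
    try:
        names = sorted(os.listdir(base))
    except OSError:
        return world, restricted
    for name in names:
        try:
            st = os.stat(os.path.join(base, name))
        except OSError:
            continue
        if not stat.S_ISREG(st.st_mode):
            continue  # ignore directories such as 'power'
        if st.st_mode & stat.S_IROTH:
            world.append(name)
        else:
            restricted.append(name)
    return world, restricted
```

Because this checks mode bits rather than trying to open each file, it gives the same answer whether or not it is run as root.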

PS: In the future the Prometheus host agent will expose this information as metrics, thanks to a host agent pull request and its associated dependencies. This closes a long-standing host agent issue, which goes to show that sometimes very old issues do get resolved.

linux/DMIDataInSysfs written at 22:46:16

Python 3 forced its own hand so that standard input had to be Unicode

In a comment on my entry on dealing with bad characters in stdin, Peter Donis said:

I've always thought it was a bad idea for Python 3 to make the standard streams default to Unicode text instead of bytes. [...]

I'm sure this was a deliberate design choice on the part of the Python developers, but they tied their own hands so that it would have been infeasible in early versions of Python 3 to make sys.stdin bytes instead of a Unicode stream. The problem is that the initial version of the bytes type was fairly minimal, and in particular the bytes type did not have any formatting operator until Python 3.5 added the traditional % formatting through PEP 461.

(Even today bytes doesn't have a .format() method. I'm honestly surprised that the Python 3.0 bytes type had as many string methods as it did.)
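
A small Python illustration of the current state of affairs, assuming Python 3.5 or later; the values being formatted are made up for the example.

```python
# %-formatting on bytes works in Python 3.5 and later (PEP 461).
line = b"%s: %d bytes" % (b"stdin", 42)
print(line)

# There is still no bytes.format(), unlike str.format().
print(hasattr(bytes, "format"))
print(hasattr(str, "format"))
```

On Python 3.4 and earlier, the first statement raises a TypeError instead.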

Since Python 3.0 bytes are basically the Python 2 str type, the missing formatting pretty much has to have been code that was deliberately removed, not code that the Python developers didn't write. As part of the philosophy of Python 3, they decided that you should only be able to do what the PEP calls 'interpolation' on Unicode strings, not on un-decoded bytes.

Without a formatting operation on bytes, you can't really do too much with them in Python 3.0 other than turn them into Unicode strings. You certainly can't do the kind of stream processing of standard input (and writing to standard output) that's normal for a lot of filter style Unix programs. In this environment, making sys.stdin return bytes instead of Unicode strings is only going to annoy people. It's also asymmetric with sys.stdout and sys.stderr, again partly because of formatting. Since you can only format Unicode strings and people are going to want to format quite a lot of what they print out, those pretty much have to accept Unicode strings. Unless you want to make Unicode strings automatically convert to bytes, this pushes you to sys.stdout and sys.stderr being text, not bytes.
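
For contrast, here is a sketch of what a bytes-only filter can look like in a current Python 3 (3.5 or later, since it relies on % formatting of bytes). The function name number_lines is hypothetical; it works on any pair of binary streams and never decodes anything.

```python
def number_lines(inp, out):
    """Copy a binary stream to another, prefixing each line with its number."""
    for lineno, line in enumerate(inp, start=1):
        # Both the prefix and the line are bytes; nothing is ever decoded,
        # so input that is not valid UTF-8 passes through untouched.
        out.write(b"%d: %s" % (lineno, line))
```

To use it as an actual filter you would call number_lines(sys.stdin.buffer, sys.stdout.buffer), since the underlying binary streams are available as the .buffer attribute of the text streams.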

(Python 3 has to do this automatic conversion from Unicode to bytes on output somewhere, but doing it when the actual IO happens and making sys.stdout be (Unicode) text is more readily understood. Then the magic conversion is in the OS specific magic layer.)
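
The conversion at IO time can be seen in miniature with io.TextIOWrapper, which is the type that sys.stdout normally is. This is only an illustrative sketch: the BytesIO object stands in for the real underlying output, and the UTF-8 encoding is chosen explicitly here rather than taken from the locale.

```python
import io

raw = io.BytesIO()  # stands in for the underlying binary output
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

text.write("caf\u00e9\n")   # the program only ever deals in str
text.flush()                # push the encoded bytes down to 'raw'

encoded = raw.getvalue()
print(encoded)
```

The program writes a str, and the bytes that reach the underlying stream are the UTF-8 encoding of it.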

All of this fits with Python 3's general philosophy, of course. Python 3 really wants the world of text to be Unicode, and that includes input and output. Providing standard input as bytes and making it easy to process those bytes without ever turning them into Unicode would invite a return to the Python 2 world where people processed text in non-Unicode ways. Arguably, Unicode text processing is the reason for Python 3 to exist, so it's not surprising that the Python developers were so strongly against anything that smelled like a return to that world.

python/Python3StdinUnicodeForced written at 00:09:33

