DKMS built one of my kernel modules for the wrong kernel

May 9, 2021

I recently tweeted:

Today I discovered that DKMS has spent some time silently (re)building one of my modules for the wrong kernel because DKMS. Naturally it didn't work. Since it was my sensor monitoring, I didn't notice for a while.

There is a complicated story here. This happened on my office workstation, which needs a very out of tree version of the it87 module in order to read the motherboard sensors. Because of ongoing problems in the 5.11 kernel series with my Radeon RX 550 card, I've been repeatedly upgrading my kernel to the latest Fedora 5.11.x, finding out that the kernel is no good, and falling back to the last-good kernel, which is Fedora's 5.10.23.

Recently (while in the default state of running 5.10.23), I noticed that I didn't have my usual motherboard sensor readings. Examining kernel messages for it87 problems, I immediately found the smoking gun:

[Fri Apr 23 15:17:51 2021] it87: version magic '5.11.15-200.fc33.x86_64 SMP mod_unload ' should be '5.10.23-200.fc33.x86_64 SMP mod_unload '

At that point I had 5.11.15 installed, making it the highest-version kernel, but I was running 5.10.23. This should be a supported system configuration but apparently DKMS somehow installed the 5.11.15 version of it87 (built when I installed that version) into 5.10.23's module area as well as 5.11.15's. So I told DKMS to remove the module and rebuild it. Surprise:

[Sat May  8 17:43:45 2021] it87: version magic '5.11.15-200.fc33.x86_64 SMP mod_unload ' should be '5.10.23-200.fc33.x86_64 SMP mod_unload '
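
(As an aside, this sort of message is easy to pick apart mechanically. What follows is only a minimal Python sketch that assumes the message format shown above. The regular expression and the function name are illustrative inventions, not anything provided by the kernel or DKMS.)

```python
import re

# Matches kernel log lines of the form shown above:
#   <module>: version magic '<built-for> ...' should be '<expected> ...'
VERMAGIC_RE = re.compile(
    r"(?P<module>\S+): version magic '(?P<built>\S+)[^']*' "
    r"should be '(?P<expected>\S+)[^']*'"
)

def parse_vermagic_mismatch(line):
    """Return (module, built_for, expected), or None if the line doesn't match."""
    m = VERMAGIC_RE.search(line)
    if m is None:
        return None
    return m.group("module"), m.group("built"), m.group("expected")

line = ("[Sat May  8 17:43:45 2021] it87: version magic "
        "'5.11.15-200.fc33.x86_64 SMP mod_unload ' "
        "should be '5.10.23-200.fc33.x86_64 SMP mod_unload '")
print(parse_vermagic_mismatch(line))
```

For the message above, this reports that it87 was built for 5.11.15-200.fc33.x86_64 while the running kernel expected 5.10.23-200.fc33.x86_64.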

'dkms status' showed that the it87 module was removed before I had DKMS rebuild it, and DKMS clearly knew what the correct kernel version was, because it installed the new it87 module into that kernel's module area. Despite this, it built the module for the wrong kernel version. What I wound up doing was removing the 5.11.15 RPMs entirely, which finally made DKMS build the module right. Possibly I could also have made DKMS work right by explicitly specifying the kernel version in 'dkms build' and 'dkms install'.
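
(Explicitly specifying the kernel version might look something like the following Python sketch, which only builds the command lines and doesn't run them. It assumes DKMS's '-m', '-v' and '-k' options for the module name, module version and target kernel. The module version "1.0" is a made-up placeholder, and whether this would actually have avoided the problem is untested.)

```python
import platform

def dkms_commands(module, module_version, kernel=None):
    """Build 'dkms build' and 'dkms install' argument lists pinned to one kernel."""
    # Default to the running kernel (the same value as 'uname -r').
    kernel = kernel or platform.release()
    common = ["-m", module, "-v", module_version, "-k", kernel]
    return [
        ["dkms", "build"] + common,
        ["dkms", "install"] + common,
    ]

# "1.0" is a hypothetical module version, used only for illustration.
for cmd in dkms_commands("it87", "1.0", "5.10.23-200.fc33.x86_64"):
    print(" ".join(cmd))
```

Passing the kernel explicitly even when it is the running one means the target never depends on whatever DKMS would otherwise infer.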

In the future I'm probably going to explicitly specify the kernel version for every DKMS build and install, even if it's for the currently running kernel. I'm also going to have to check that DKMS-installed modules are for the right kernel, especially if I'm building them in some unusual situation. And obviously I now have a mental note to check that all my sensors still work after every reboot.
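
(A check that an installed module is for the right kernel could be sketched roughly as follows. This assumes that 'modinfo -k KERNEL -F vermagic MODULE' prints the vermagic string of the module found in that kernel's module area, and that the kernel release is the first whitespace-separated field of that string, as in the kernel messages above. The function names are hypothetical.)

```python
import subprocess

def vermagic_kernel(vermagic):
    """Return the kernel release from a vermagic string (its first field)."""
    return vermagic.split()[0]

def module_matches_kernel(module, kernel):
    """True if the module installed for `kernel` was actually built for it."""
    # Ask modinfo about the module as found under that kernel's module tree.
    out = subprocess.run(
        ["modinfo", "-k", kernel, "-F", "vermagic", module],
        check=True, capture_output=True, text=True,
    ).stdout
    return vermagic_kernel(out) == kernel
```

Calling module_matches_kernel("it87", "5.10.23-200.fc33.x86_64") after a DKMS install would be expected to return False in the broken situation described above, since the module there carried 5.11.15's version magic.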

Somewhat to my surprise, DKMS is actively maintained in the DKMS git repository. But it is still a 3,935-line Bash script (which is up slightly from 2016). It's really a marvel that it works as well as it does, but on the other hand it's somewhat terrifying that so many Linux systems depend on it working reliably.

(One of the fun things about using DKMS on Fedora is that it rebuilds the initramfs for every installed kernel every time you install a DKMS module, regardless of which kernel you're installing the module into and whether or not the module would be included in the initramfs. This takes a substantial amount of time and there's no way to turn it off.)

Update: This turns out to be a significant Fedora issue instead of a DKMS bug, although DKMS could do more to defend against it (since DKMS knows a particular feature can't work on Fedora).


Comments on this page:

Bash is a good sign. You know when it's written in Bash the devs want it to be able to run anywhere any time: future or past. Somehow, despite Bash getting new backwards incompatible features added regularly like any other language, you never need to use a container, virtual environment, or whatever to run a bash script. It just runs.

By Arnaud Gomes at 2021-05-10 05:33:57:

I seem to recall a post of yours a few years ago about the main difference between the shell and python being that a shell program is basically just glue between external commands. Isn't it what DKMS is?

   -- A

By cks at 2021-05-10 11:05:54:

The problem with using the shell for something like DKMS is that it's quite hard to write and maintain large shell scripts. Even Bash lacks important data structures and support for things like named and numbered arguments, never mind data types. Straight Bourne shell is worse. DKMS has a lot of logic to implement and things to keep track of, in addition to running a lot of external commands, and those things are hard to manage in shell scripts.

(Shell scripts are also constrained to be in a single file, unless you do very creative things. You can have multiple shell scripts involved and that helps, but it's not the same as being able to structure your code in small pieces.)

Written on 09 May 2021.


Last modified: Sun May 9 23:17:35 2021