The mystery of why my Fedora 30 office workstation was booting fine
The other day, I upgraded the kernel on my office workstation, much as I have any number of times before, and rebooted. Things did not go well:
So the latest Fedora 30 updates (including a kernel update) build an initramfs that refuses to bring up software RAID devices, including the one that my root filesystem is on. Things do not go well afterwards.
Then I said:
Fedora's systemd, Dracut and kernel parameters setup have now silently changed to require either rd.md.uuid for your root filesystem or rd.auto. The same kernel command line booted previous kernels with previous initramfs's.
The first part of this is wrong, and that leads to the mystery.
In Fedora 29, my kernel command line was specifying both the root
filesystem device by name ('root=/dev/md20') and the software RAID
arrays for the initramfs to bring up (as 'rd.md.uuid=...'). When I
upgraded to Fedora 30 in mid-August, various things happened and I
wound up removing both of those from the kernel command line,
specifying the root filesystem device only by UUID ('root=UUID=...').
This kernel command line booted a series of Fedora 30 kernels, most
recently 5.2.11 on September 4th, right up until yesterday.
However, it shouldn't have. As the dracut.cmdline manpage says, the
default since Dracut 024 has been to not auto-assemble software RAID
arrays in the absence of either rd.auto or rd.md.uuid.
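As a rough sketch of that rule (a simplified model, not Dracut's actual code), the check below looks at a kernel command line and reports whether the initramfs would be expected to assemble software RAID at all. The function name is made up for illustration, and treating a bare 'rd.auto' as enabled and 'rd.md=0' as an override are assumptions about how Dracut parses boolean arguments.

```python
def expects_md_assembly(cmdline: str) -> bool:
    """Return True if this kernel command line should enable MD RAID assembly.

    Simplified model: assembly needs rd.auto or at least one rd.md.uuid=...,
    and rd.md=0 turns it off regardless.
    """
    false_values = {"0", "no", "off"}
    auto = False
    have_uuid = False
    md_enabled = True
    for arg in cmdline.split():
        key, _, value = arg.partition("=")
        if key == "rd.md.uuid" and value:
            have_uuid = True
        elif key == "rd.auto":
            # A bare 'rd.auto' is assumed to mean 'on'.
            auto = value not in false_values
        elif key == "rd.md":
            md_enabled = value not in false_values
    return md_enabled and (auto or have_uuid)
```

By this model, a command line that carries only 'root=UUID=...' returns False, which is the situation described above.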
And the initramfs for older kernels (at least 5.2.11) was theoretically
enforcing that; the journal for that September 4th boot contains a
report of:
dracut-pre-trigger[492]: rd.md=0: removing MD RAID activation
But then a few lines later, md/raid1:md20 is activated:
kernel: md/raid1:md20: active with 2 out of 2 mirrors
(The boot log for a failed boot of the new kernel also had the dracut-pre-trigger line, but obviously no mention of the RAID array being activated.)
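One way to compare the two boots is to scan each boot's journal text for these two messages. The following is a minimal sketch that only knows about the exact message formats quoted above (including the raid1 'mirrors' wording); the function name is hypothetical, and feeding it saved journal output is left to the reader.

```python
import re

# Patterns taken from the two log lines quoted above.
SKIP_RE = re.compile(r"dracut-pre-trigger\[\d+\]: rd\.md=0: removing MD RAID activation")
ACTIVE_RE = re.compile(r"md/raid1:(\w+): active with \d+ out of \d+ mirrors")


def summarize_boot_log(lines):
    """Return (md_activation_removed, [names of raid1 arrays reported active])."""
    removed = False
    active = []
    for line in lines:
        if SKIP_RE.search(line):
            removed = True
        match = ACTIVE_RE.search(line)
        if match:
            active.append(match.group(1))
    return removed, active
```

For the September 4th boot this would report both that MD RAID activation was removed and that md20 became active, which is exactly the contradiction.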
I unpacked the initramfs for both kernels and as far as I can tell
they're identical in terms of the kernel modules included and the
configuration files and scripts (there are differences in some
binaries, which is expected since systemd and some other things got
upgraded between September 4th and now). Nor has the kernel
configuration changed between the two kernels according to the
config-* files in /boot.
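A comparison like this can be sketched by hashing every file in the two unpacked trees and listing what differs. This is an illustrative approach, not necessarily how the comparison above was done; the function names are made up, and it assumes both initramfs images have already been unpacked into directories.

```python
import hashlib
import os


def tree_digests(root):
    """Map each relative path under root to a SHA-256 digest or symlink target."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if os.path.islink(full):
                # Record where symlinks point instead of following them.
                digests[rel] = "symlink -> " + os.readlink(full)
            elif os.path.isfile(full):
                with open(full, "rb") as fh:
                    digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def diff_trees(old_root, new_root):
    """Return (only_in_old, only_in_new, changed) as sorted lists of paths."""
    old, new = tree_digests(old_root), tree_digests(new_root)
    only_old = sorted(old.keys() - new.keys())
    only_new = sorted(new.keys() - old.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return only_old, only_new, changed
```

Kernel modules live under a directory named for the kernel version, so they will always show up as present in only one tree; those have to be compared by module name rather than by full path.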
So by all evidence, the old kernel and initramfs should not auto-assemble my root filesystem's software RAID and thus shouldn't boot. But, they do. In fact they did yesterday, because when the new kernel failed to boot the first thing I did was boot with the old one. I just don't know why, and that's the mystery.
My fix for my boot issue is straightforward; I've updated my kernel
command line to have the 'rd.md.uuid=...' that it should have had
all along. This works fine.
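For anyone needing to work out the argument for their own array, here is a small sketch that builds it from the output of 'mdadm --detail --export' for the array. It assumes that output contains an 'MD_UUID=' line and that the value can be passed through unchanged; the function name and the UUID in the usage note are invented for illustration.

```python
def rd_md_uuid_args(export_text):
    """Turn 'KEY=value' lines containing MD_UUID into rd.md.uuid=... arguments."""
    args = []
    for line in export_text.splitlines():
        key, _, value = line.strip().partition("=")
        if key == "MD_UUID" and value:
            args.append("rd.md.uuid=" + value)
    return args
```

Given text that includes a line such as 'MD_UUID=01234567:89abcdef:01234567:89abcdef' (a made-up value), this returns a one-element list holding the matching 'rd.md.uuid=' argument, ready to be appended to the kernel command line.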
(My initial recovery from the boot failure was to use 'rd.auto',
but I've decided that I don't want to auto-assemble anything and
everything that the initramfs needs. I'll have the initramfs only
assemble the bare minimum, just in case. While my swap is also on
software RAID, I specifically decided to not assemble it in the
initramfs; I don't really need it until later.)