Fixing your system after hitting the RAID growth gotcha
The easiest way to dig yourself out of the hole created by the RAID
growth gotcha is probably to use a live/rescue CD.
But let us suppose that you don't have one handy, which was the case
for me yesterday. Further, let us suppose that you have /boot as a
separate filesystem, not as part of the root filesystem (if this is
not the case, you absolutely need a rescue CD).
The basic goal is to rebuild a version of your current initrd that has
an /etc/mdadm.conf that specifies that your root mirror has
the right number of devices. Since we can't boot the system normally,
we can't just bring it up, edit the real mdadm.conf, and regenerate
the initrd with the normal tools; instead, you need to boot the system
in a minimal mode and unpack, fix, and rebuild the initrd by hand.
First, you need some setup:
- you need to boot the system without the root mirror. Heed the cautions.
Once booted, you'll probably want to mount /usr (read-only), if only so that you can read manpages.
- you need writeable scratch space to rebuild the initrd; I mounted
- you need /boot mounted read-write; if your /boot is mirrored, you'll have to assemble the software RAID first with appropriate
After that, it is relatively simple:
- make a scratch directory in /tmp (or wherever) and unpack your current initrd to it. Initrds are compressed cpio images, so this is something like:
cd /tmp/t; zcat </boot/initrd | cpio -di
- edit the now-unpacked etc/mdadm.conf to have the right num-devices value for the software RAID with your root filesystem. You don't need to update the numbers for the other software RAID devices; they aren't started in the initrd.
- reassemble the initrd. On Fedora, the 100% authentic way
to do this, exactly duplicating what
echo nash-find . | /sbin/nash --force --quiet | cpio -H newc --quiet -o | gzip -9 >/tmp/initrd
- rename your current initrd to something else as a backup,
and then copy your newly generated initrd into /boot with the right name.
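If you would rather not hand-edit the unpacked etc/mdadm.conf in a minimal environment, the edit step can be sketched with sed. This is only a sketch under assumptions: set_num_devices is a hypothetical helper name, the array name /dev/md0 and the device count 3 are made-up examples, and the ARRAY lines are assumed to carry a num-devices=N keyword. The demo works on a throwaway copy, not on a real initrd tree.

```shell
# Hypothetical helper: set num-devices for one array in an
# mdadm.conf-style file. Only the ARRAY line naming that array is
# changed; every other line is passed through untouched.
set_num_devices() {
    conf=$1; array=$2; count=$3
    sed "\|^ARRAY[[:space:]][[:space:]]*${array}[[:space:]]|s|num-devices=[0-9][0-9]*|num-devices=${count}|" \
        "$conf" >"$conf.new" && mv "$conf.new" "$conf"
}

# Demo on a scratch file (array names and UUIDs are invented).
demo=$(mktemp -d)
cat >"$demo/mdadm.conf" <<'EOF'
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=11111111:11111111:11111111:11111111
EOF

# Pretend /dev/md0 is the root mirror and it now has three devices.
set_num_devices "$demo/mdadm.conf" /dev/md0 3
grep '^ARRAY' "$demo/mdadm.conf"
```

In the real procedure you would point this at etc/mdadm.conf inside your scratch directory and then look at the result before rebuilding the initrd.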
At this point you can reboot and your system should come up as far
as mounting the root filesystem. If you have other software RAID
mirrors it will then throw a fit because none of them will successfully
assemble either, and drop you into rescue mode. To fix this, edit
/etc/mdadm.conf to specify the right
num-devices number for all
software RAID mirrors (including the root mirror). After you've
done this, reboot and things should work.
(You may need to make the root filesystem read-write first with
'fsck /' followed by 'mount -o remount,rw /'.)
If you have no other software RAID mirrors, you still need to update
/etc/mdadm.conf once the system has booted or you will get to go
through this all over again after the next kernel update.
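Since a stale num-devices value is what bites you here, it may help to list what the config file currently claims for every array before you reboot. The following is a rough sketch, not an official tool: list_num_devices is a hypothetical helper name, and the sample file contents are invented for the demo.

```shell
# Hypothetical helper: print "<array> <num-devices>" for each ARRAY
# line in an mdadm.conf-style file, or "(unset)" if the line has no
# num-devices keyword.
list_num_devices() {
    awk '$1 == "ARRAY" {
        n = "(unset)"
        for (i = 3; i <= NF; i++)
            if ($i ~ /^num-devices=/) { n = $i; sub(/^num-devices=/, "", n) }
        print $2, n
    }' "$1"
}

# Demo on a scratch file (array names are invented).
sample=$(mktemp)
cat >"$sample" <<'EOF'
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=3
ARRAY /dev/md1 level=raid1 num-devices=2
EOF
list_num_devices "$sample"
```

On the real system you would run it against /etc/mdadm.conf and compare each number with what the running arrays actually have.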
(You can see why I suspect a rescue CD would be easier. With a rescue CD
you should be able to assemble and mount the root filesystem, chroot to
it and set any other necessary filesystems up, edit
and then run
mkinitrd or the like.)
For safety, you probably want to rebuild your current initrd using
the official tools after doing the full