Booting a Linux system without a root mirror

May 13, 2009

Suppose that you have a system with a mirrored root filesystem and, for some reason, you need to boot it without the mirror being available. At one level, this is very easy; all you need to do is to find the partition of one of the mirrors and specify it as the filesystem root device on the kernel command line with 'root=/dev/sdXX'.

This works because while software RAID embeds its own metadata in the partitions, it puts the metadata at the end of the partition, not the start (at least with the 0.90 and 1.0 superblock formats; the 1.1 and 1.2 formats put it at or near the start, so this trick does not apply to them). So if you look at the partition without RAID, what you see is a normal filesystem (that happens to not be using all of the available space).
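To make 'at the end' a bit more concrete, here is a small sketch of where the superblock lands. It assumes the 0.90 superblock format, which as far as I know sits in a 64 KiB-aligned block that starts at least 64 KiB before the end of the device. The function name is my own invention, and this is an illustration of the layout rather than anything mdadm itself provides.

```shell
# Sketch: byte offset of the md 0.90 superblock on a member of a given size.
# Assumption: 0.90 metadata only. Round the size down to a 64 KiB boundary,
# then step back one more 64 KiB block.
md090_sb_offset() {
    size_bytes=$1
    reserved=65536
    echo $(( size_bytes / reserved * reserved - reserved ))
}

# Possible use on a real member (hypothetical device name):
#   md090_sb_offset "$(blockdev --getsize64 /dev/sda1)"
```

Everything before that offset is what the filesystem gets to use, which is why the partition looks like an ordinary (slightly short) filesystem when mounted directly.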

However, doing this comes with a huge warning. If you ever write to a filesystem accessed this way, you will have desynchronized the two sides of your mirror behind Linux's back; they will have different data when software RAID counts on them being exactly the same. What happens when the mirror is next reassembled ranges from unpredictable to explosive.

Therefore, you also need to boot your system in what I will call 'utterly single user mode', by supplying 'ro init=/bin/sh' as additional kernel arguments. It is not good enough to use 'single', as booting in normal single user mode writes things to the root filesystem. Also, if your root filesystem uses ext3 you should be very sure that your system shut down cleanly, because ext3 will write to the filesystem to replay the journal even when you mount it read-only.
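Put together, the arguments you end up adding look something like 'root=/dev/sda1 ro init=/bin/sh' (with /dev/sda1 standing in for whatever your mirror half really is). Since a typo here is what stands between you and a desynchronized mirror, it may be worth checking what the kernel actually got once the shell is up. The following is a minimal sketch with a made-up function name; it assumes that a later 'rw' overrides an earlier 'ro'.

```shell
# Sketch: succeed only if a kernel command line asks for a read-only root
# and /bin/sh as init. The last of 'ro' / 'rw' is taken to win.
cmdline_is_utterly_single_user() {
    has_ro=no
    has_init=no
    for arg in $1; do
        case $arg in
            ro) has_ro=yes ;;
            rw) has_ro=no ;;
            init=/bin/sh) has_init=yes ;;
        esac
    done
    [ "$has_ro" = yes ] && [ "$has_init" = yes ]
}

# Possible use after booting (you will probably have to mount /proc first,
# since nothing has done it for you):
#   mount -t proc proc /proc
#   cmdline_is_utterly_single_user "$(cat /proc/cmdline)" && echo "looks safe"
```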

Booted this way, the system will then come up with /bin/sh running as the init process, PID 1, and the root filesystem mounted read-only. You can then mount other intact filesystems read-write, or just poke around. If you need scratch space, you can mount a tmpfs filesystem somewhere (such as /tmp).
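For the scratch space, something like the following should do. It is only a sketch: the function name, the mount point, and the size are placeholders of mine, and it assumes your kernel has tmpfs support.

```shell
# Sketch: mount a size-limited tmpfs at the given directory.
# $1 = mount point (must already exist, since the root is read-only)
# $2 = size limit, e.g. 64m
scratch_mount() {
    mount -t tmpfs -o size="$2" tmpfs "$1"
}

# Possible use:
#   scratch_mount /tmp 64m
```

Because tmpfs lives in memory, nothing written there touches either half of the mirror.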

If you actually need to write to the root filesystem, well, I think you get to break out a rescue CD of some sort and figure out how to assemble your mirror.

(There might be tricks you can play with mdadm to force the other side of the mirror to be seen as out of date, but I can't see anything obvious in the manpage. Maybe you could zero the RAID superblock on the other device and then re-add it later.)
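If I were to try the superblock-zeroing idea, I think it would look roughly like this. I have not tested it, so treat it as a guess at the shape of the procedure and not as a recipe. The function names and device names are hypothetical; the mdadm options themselves ('--zero-superblock' and '--add') are real.

```shell
# Step 1, before writing to the surviving half: wipe the md superblock on
# the OTHER half so it can no longer be assembled as a current member.
# $1 = partition holding the other half (hypothetical: /dev/sdb1)
forget_other_half() {
    mdadm --zero-superblock "$1"
}

# Step 2, later, with the array running degraded on the surviving half:
# add the wiped partition back, which should cause a full resync onto it.
# $1 = array device (hypothetical: /dev/md0), $2 = wiped partition
readd_other_half() {
    mdadm "$1" --add "$2"
}
```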
