== Ubuntu 12.04 can't reliably boot with software RAID (and why)

Recently one of my co-workers discovered, diagnosed, and worked around a significant issue with software RAID on Ubuntu 12.04. I'm writing it up here partly to get it all straight in my head and partly so we can help out anyone else with the same problem. The quick summary of the situation comes from [[my tweet https://twitter.com/thatcks/status/226355211260813312]]:

> ~~Ubuntu 12.04 will not reliably boot a system with software RAID
> arrays due to races in the initramfs scripts.~~

(As you might guess, [[I am not happy https://twitter.com/thatcks/status/226355253329661952]].)

If you set up Ubuntu 12.04 with one or more software RAID arrays for things other than the root filesystem, you will almost certainly find that some of the time when you reboot your system, it will come up with one or more software RAID arrays in a degraded state, with one or more component devices not added to the array. If you have set _bootdegraded=true_ as one of your boot options (eg on [[the kernel command line KernelCmdlineProcessing]]), your system will boot fully (and you can hot-add the omitted devices back to their arrays); if you haven't, the initramfs will pause briefly to ask you if you want to continue booting anyway, time out on the question, and drop you into an initramfs shell.

This can happen whether or not your root filesystem is on a software RAID array (although it doesn't happen to the root array itself, only to other arrays), and even if you do not have the software RAID arrays configured or used in your system in any way (not listed in _/etc/mdadm/mdadm.conf_, not used in _/etc/fstab_, and so on); simply having software RAID arrays on a disk attached to your system at boot time is enough to trigger the problem. Nor does it require disks that are slow to respond to the kernel; we've reproduced this in VMWare, where the disks aren't even physical and respond to kernel probes essentially instantly.

Now let's talk about how this happens.

Like other modern systems, Ubuntu 12.04 handles device discovery with _udev_, even during early boot in the initramfs. Part of udev's device discovery is the assembly of RAID arrays from their component devices. What this means is that ~~software RAID assembly is asynchronous~~; the initramfs starts the udev daemon, the daemon ends up with a list of events to process, and as it works through them the software RAID arrays start to appear. In the meantime the rest of the initramfs boot process continues on and in short order sets itself up to mount the root filesystem.

As part of preparing to mount the root filesystem, the initramfs code then checks to see if all visible arrays are fully assembled and healthy *without waiting for udev to have processed all pending events*. You know, the events that can include incrementally assembling those arrays.

This is a race. If udev wins the race and fully assembles all visible software RAID arrays before the rest of the initramfs checks them, you win and your system boots. If udev loses the race, you lose; the check for degraded software RAID arrays will see some partially assembled arrays and throw up its hands.

Our brute force solution is to modify the check for degraded software RAID arrays to explicitly wait for the udev event queue to drain by running '_udevadm settle_'. This appears to work so far, but we haven't extensively tested it; it's possible that there's still a race present, but it's now small enough that we haven't managed to hit it yet.
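(As a concrete illustration of the recovery I described earlier: once an affected system has booted with _bootdegraded=true_, putting the arrays back together looks roughly like this. The _/dev/md1_ and _/dev/sdb2_ names are made-up examples; use whatever _/proc/mdstat_ shows you.)

    # see which arrays came up incomplete; a missing component shows
    # up as an underscore in the status, eg [U_] instead of [UU]
    cat /proc/mdstat

    # hot-add the omitted component back into its array
    mdadm /dev/md1 --add /dev/sdb2

    # then look at /proc/mdstat again to follow the resync
    cat /proc/mdstat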
This is unquestionably an Ubuntu bug and I hope that it will be fixed in some future update.

=== Sidebar: our fix in specific

(For the benefit of anyone with this problem who's doing Internet searches.)

Change _/usr/share/initramfs-tools/scripts/mdadm-functions_ as follows:

.pn prewrap on

    degraded_arrays()
    {
    +   udevadm settle
        mdadm --misc --scan --detail --test >/dev/null 2>&1
        return $((! $?))
    }

Then rebuild your current initramfs by running '_update-initramfs -u_'. Since I suspect that _mdadm-functions_ is not considered a configuration file, you may want to put a dpkg hold on the Ubuntu _mdadm_ package so that an automatic upgrade doesn't wipe out your change. (There's a recap of the commands involved at the end of this entry.)

(This may not be the best and most Ubuntu-correct solution. It's just what we've done and tested right now.)

=== Sidebar: where the bits of this are on 12.04

* _/lib/udev/rules.d/85-mdadm.rules_: the udev rule to incrementally assemble software RAID arrays as components become available.

Various parts of the initramfs boot process are found (on a running system) in _/usr/share/initramfs-tools/scripts_:

* _init-top/udev_: the scriptlet that starts udev.
* _local-premount/mdadm_: the scriptlet that checks for all arrays being good; however, it just runs some functions from the next bit. (All of _local-premount_ is run by the _local_ scriptlet, which is run by the initramfs _/init_ if the system is booting from a local disk.)
* _mdadm-functions_: the code that does all the work of checking and 'handling' incomplete software RAID arrays.

Looking at this, I suspect that a better solution is to stick our own script in _local-premount_, arranged to run before the _mdadm_ script, and have it run the '_udevadm settle_'. That would avoid changing any package-supplied scripts; there's a sketch of what such a scriptlet might look like below.

(Testing has shown that creating a _local-top/mdadm-settle_ scriptlet isn't good enough. It gets run, but too early. This probably means that modifying the ((degraded_arrays)) function is the most reliable solution, since it happens closest to the actual check, and we just get to live with modifying a package-supplied file and so on.)
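To make that concrete, here is a minimal sketch of what such a _local-premount_ scriptlet might look like. This is an untested outline, not something we've verified: the _aa-udev-settle_ name is made up, and you'd have to check that it really does get run before the package's _mdadm_ scriptlet, since we declare no PREREQ ordering here.

    #!/bin/sh
    # hypothetical scripts/local-premount/aa-udev-settle

    # standard initramfs-tools scriptlet boilerplate
    PREREQ=""
    prereqs()
    {
        echo "$PREREQ"
    }
    case $1 in
    prereqs)
        prereqs
        exit 0
        ;;
    esac

    # wait for udev to finish processing its pending events, which
    # includes incrementally assembling software RAID arrays
    udevadm settle

You'd need to make the scriptlet executable and then rebuild the initramfs with '_update-initramfs -u_', just as with our actual fix.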
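Finally, since we've wound up modifying _mdadm-functions_ after all, here is the promised recap of applying and protecting the fix. The dpkg hold commands are standard dpkg usage, nothing specific to 12.04:

    # edit degraded_arrays() in mdadm-functions as shown in the
    # first sidebar, then rebuild the current initramfs:
    update-initramfs -u

    # put a hold on mdadm so that a package upgrade doesn't silently
    # replace our modified mdadm-functions:
    echo "mdadm hold" | dpkg --set-selections

    # verify that the hold took:
    dpkg --get-selections mdadm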