How modern Linux software RAID arrays are assembled on boot (and otherwise)
Here is a question that is periodically relevant: just how does a modern Linux system assemble and bring up your software RAID arrays (and other software-defined things, for that matter)?
I've written about the history of this before, so I'll summarize: in the very old days the kernel did it all for you, and in the not-as-old days it was done by a script in your initial ramdisk that ran mdadm, often using an embedded copy of your regular mdadm.conf.
The genesis of modern software RAID activation was
udev and general
support for dynamically appearing devices, including 'hotplug' disk
devices (which was and is a good thing, to be clear). When disks
can appear over time, simply running
mdadm once at some arbitrary
point is clearly not good enough. Instead, the whole RAID assembly
system was changed so that every time a disk appears, udev runs
mdadm in a special 'incremental' mode. As the mdadm manpage describes this mode:
Add a single device to an appropriate array. If the addition of the device makes the array runnable, the array will be started. This provides a convenient interface to a hot-plug system.
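In practice this incremental mode is invoked from a udev rule shipped with mdadm (on many distributions it lives in a file called something like 64-md-raid-assembly.rules). The exact rule text varies between distributions and versions; the following is a simplified sketch of the idea, not a verbatim copy of any real rules file:

```
# Simplified sketch of mdadm's udev hookup; real rules have more
# guards (coldplug handling, degraded-array timers, and so on).
# When a new block device carries RAID member metadata, hand it to
# mdadm for incremental assembly.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
  IMPORT{program}="/sbin/mdadm --incremental --export $devnode"
```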
A modern Linux system embeds a copy of
udev (and the important
udev rules and various supporting bits) in the initramfs and
starts it early in the initramfs boot process. The kernel feeds this
udev events corresponding to all of the hardware that has
been recognized so far and then
udev starts kicking off more or
less all of its usual processing, including handling newly appeared
disk devices and thus incrementally assembling your software RAID
arrays. Hopefully this process fully completes before you need the
arrays in question (for example, to mount your root filesystem).
(I'm not sure when and how this incremental assembly process decides
that a RAID array is ready to be started, given that ideally you'd
want all of an array's devices to be present instead of just the
minimum number. Note that the intelligence for this is in mdadm
itself, not in the udev rules.)
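One place you can watch the results of incremental assembly is /proc/mdstat, where an array that has been partially assembled but not yet started shows up as 'inactive'. As an illustration, here is a small sketch that picks inactive arrays out of /proc/mdstat-format output; it is fed a canned sample so it runs anywhere, and on a real system you would read /proc/mdstat itself:

```shell
#!/bin/sh
# Canned sample of /proc/mdstat contents; on a real system use
#   mdstat=$(cat /proc/mdstat)
mdstat='Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks [2/2] [UU]
md1 : inactive sdc1[0]
      1048512 blocks super 1.2
unused devices: <none>'

# Array status lines look like "mdN : active ..." or "mdN : inactive ...";
# print the names of arrays that have not been started yet.
printf '%s\n' "$mdstat" | awk '$2 == ":" && $3 == "inactive" { print $1 }'
```

Here this prints 'md1', the array that has been handed a device but not started.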
The same general process is used to assemble and activate things like
LVM physical volumes and volume groups; as devices appear, udev runs
appropriate LVM commands to incrementally update the collection of known
physical volumes and so on and activate any that have become ready for
it. This implies that one physical disk finally appearing can cause a
cascade of subsequent events as the physical disk causes a software RAID
device to be assembled, the new RAID device is reported back to udev
and recognized as an LVM physical volume, and so on.
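On many current systems the LVM side of this is wired up the same way, through udev rules that run pvscan in its caching and autoactivation mode. Roughly (again a simplified sketch, not any distribution's literal rule file):

```
# Simplified sketch of the LVM udev hookup (cf. 69-dm-lvm.rules and
# similar files): when a new block device turns out to be an LVM PV,
# update LVM's view of the world and autoactivate any volume group
# that has just become complete.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="LVM2_member", \
  RUN+="/usr/sbin/lvm pvscan --cache --activate ay $env{DEVNAME}"
```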
Where exactly the udev rules for all of this live varies from
distribution to distribution, so really you need to search your
udev rules directories (for example
/usr/lib/udev/rules.d) to find and read
everything that mentions
mdadm. Then you can read the mdadm and
mdadm.conf manpages to see what sort of control (if any) you
can exert over this process.
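The search itself is just a grep over the rules directories. A self-contained sketch, run here against a throwaway directory with made-up rule files so that it works anywhere; on a real system you would point it at /usr/lib/udev/rules.d and /etc/udev/rules.d instead:

```shell
#!/bin/sh
# Stand-in for a real rules directory; on a real system, search
# /usr/lib/udev/rules.d and /etc/udev/rules.d instead.
rulesdir=$(mktemp -d)
cat > "$rulesdir/64-md-raid-assembly.rules" <<'EOF'
ACTION=="add", IMPORT{program}="/sbin/mdadm --incremental --export $devnode"
EOF
cat > "$rulesdir/60-persistent-storage.rules" <<'EOF'
KERNEL=="sd*", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_SERIAL}"
EOF

# Which rules files mention mdadm? Prints the path of the
# 64-md-raid-assembly.rules stand-in only.
grep -l mdadm "$rulesdir"/*.rules
```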
The drawback of this process is that there is no longer a clear chain of scripts or the like that you can read to follow (or predict) the various actions that get taken. Instead everything is event driven and thus much harder to trace (and much less obvious, and much more split up across many different files, and so on). A modern Linux system booting is a quite complicated asynchronous environment that is built from many separate little pieces. Generally it's not well documented.
One corollary of all of this is that it is remarkably hard to have
a disk device appear and then be left alone. The moment the kernel
sends the 'new device' event to
udev (either during boot or when
the system is running),
udev will start kicking off all of its
usual processing and so on.
udevadm can be used to turn off event
processing in general but that's a rather blunt hammer (and may
have bad consequences if other important events happen during this).
For that matter you probably don't want to totally turn off processing
of the disk device's events given that
udev is also responsible for creating
/dev entries for newly appearing disks.
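The blunt hammer in question is udevadm's control over the event execution queue. Illustrative commands (they need root, so this is shown only as a sketch):

```
# Pause udev's processing of events; events queue up rather than
# being dropped. Requires root.
udevadm control --stop-exec-queue
# ... examine or manipulate the new disk while udev is quiescent ...
# Then let udev resume and work through the queued events.
udevadm control --start-exec-queue
```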