2008-10-30
How Linux initrds used to be a hack
The traditional way that Unix kernels boot is to perform some very basic
system setup, declare the running code to be a kernel process with PID 1,
initialize all of the various important subsystems (such as networking),
initialize drivers, mount the root filesystem from whatever device you
told them was the root device, and then just exec() (from inside the
kernel) /sbin/init. At least in the somewhat old days, Linux was no
exception to this.
The problem is that the job of an initrd is to do things (such as
loading driver modules) that let the system see the normal root
filesystem, and a normal Unix system (and Linux was no exception)
provides no way to change what the root filesystem is once init has
started running. Thus you cannot implement initrds in the obvious way,
by having the kernel treat the ramdisk as the root filesystem and
exec() a program from it as init.
Instead, the original Linux initrd hooked into the boot process just
before the 'mount the root filesystem' step. If there was an initrd, the
kernel diverted sideways; it mounted the initrd, created a new process,
and in the new process exec()'d /linuxrc. When this process exited,
the kernel resumed the normal process of booting, expecting that the
configured root device was present and so on. (I believe that the initrd
could use a magic hack to change what the kernel had set as the root
device.)
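To make the ordering concrete, here is a toy model of the boot sequence
described above, written as ordinary Python rather than anything
resembling kernel code. The function name boot_steps and the step
strings are made up for illustration; the only thing it tries to show is
where the initrd diversion sits relative to mounting the real root.

```python
def boot_steps(has_initrd):
    """Return the old-style boot sequence as a list of step names.

    This is a toy model: the names are illustrative and nothing here
    actually mounts or executes anything.
    """
    steps = [
        "basic system setup",
        "become PID 1",
        "initialize subsystems",
        "initialize drivers",
    ]
    if has_initrd:
        # The diversion happens just before the real root is mounted.
        steps += [
            "mount initrd",
            "create new process and exec /linuxrc",
            "wait for /linuxrc to exit",
        ]
    # With or without an initrd, the boot ends the traditional way.
    steps += ["mount root filesystem", "exec /sbin/init"]
    return steps


if __name__ == "__main__":
    for step in boot_steps(has_initrd=True):
        print(step)
```

Note that the tail of the sequence is the same in both cases; the initrd
is purely an extra detour, which is why /linuxrc had to have finished
making the real root device visible by the time it exited.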
This magic diversion had some odd consequences. For example, since the
kernel was PID 1, it had to arrange to reap orphaned processes while
it was waiting for the initrd to finish running, because otherwise
they could just pile up. Related to this, you couldn't populate your
initrd with a normal but minimal system and just run from it since most
versions of init become unhappy if they are not PID 1.
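The reaping issue can be sketched in user space. This is only an
approximation, under the assumption that direct children stand in for
the orphans that PID 1 would inherit (an ordinary process does not get
orphans reparented to it). The helper names spawn and
wait_for_child_reaping_others are hypothetical. The idea is to wait for
any child to exit, discard the ones you were not waiting for, and stop
only when the one you care about (here standing in for /linuxrc) is
done.

```python
import os
import time


def spawn(sleep_seconds):
    """Fork a child that sleeps briefly and exits; return its PID.

    The child is a stand-in for a real program such as /linuxrc.
    """
    pid = os.fork()
    if pid == 0:
        time.sleep(sleep_seconds)
        os._exit(0)
    return pid


def wait_for_child_reaping_others(target_pid):
    """Block until target_pid exits, reaping any other child that exits first.

    Returns the PIDs of the other children that were reaped on the way.
    """
    reaped = []
    while True:
        # -1 means "any child", so exited children never pile up as zombies.
        pid, _status = os.waitpid(-1, 0)
        if pid == target_pid:
            return reaped
        reaped.append(pid)


if __name__ == "__main__":
    stray = spawn(0.05)     # exits early, like an orphan handed to PID 1
    linuxrc = spawn(0.3)    # the process we are actually waiting for
    print(wait_for_child_reaping_others(linuxrc))
```

Waiting only on the specific PID (os.waitpid(target_pid, 0)) would also
work for the target itself, but then the early-exiting child would sit
around as a zombie until something else collected it.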
(As you might guess, the fix to make initrds not a hack was to add a
way to change what the root filesystem is on a running system.)