2014-03-16
You don't have to reboot the system if init dies
One of the things that makes PID 1 special on many systems is that if it ever exits or dies for any reason, the system will reboot. This behavior was introduced by BSD Unix (V7 ignored the possibility) and makes a certain amount of sense; init is crucial both for reaping orphan processes and restarting serial port logins. If it goes away, rebooting the system is an easy way to hopefully fix the situation.
However, this behavior is not set in stone. There are several alternatives. The first would be to simply have the kernel cope with no PID 1, handling and reaping orphan processes itself internally in some way (and possibly providing some special way for user level to restart a new PID 1). The second is for the kernel to re-exec init as PID 1 if necessary. If PID 1 exits, the kernel would not tear down its process but instead act as if it had done an exec. Ideally this would be accompanied by some way for init to store and then reload important state. Done right this actually provides a great way for init to transition itself into a new version; just record the current state, exit, and let the kernel re-exec the new init binary.
Perhaps the second behavior sounds odd and crazy. Then I should probably tell you that this is current Solaris behavior and nothing seems to have exploded as a result. In other words we already have an existence proof that it's possible to change the semantics of PID 1 exiting, so we could adopt it elsewhere if desired.
Apart from the innate conservatism of Unixes, I think one reason that other Unixes haven't done this is that it's almost never necessary anyways. Since init not exiting is so crucial today, people have devoted a lot of engineering effort to making sure that it doesn't happen and have been quite successful at it. Even radically different and complex systems like Upstart and systemd have been extremely stable in this respect in practice.
(Also, this 're-exec init on failure' behavior needs cooperation from your init, both so that init doesn't always start trying to boot the system when it's executed and so that it journals state periodically so that a new init can pick it up again. This makes it easier to add in certain sorts of Unixes, i.e. the ones where one team can control both kernel changes and init changes.)
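To make the cooperation concrete, here is a minimal sketch in Python of the two things init would have to do: journal its state so that a crash partway through a write never leaves a torn file, and decide at startup whether it is booting the system or picking up where a previous init left off. This is not how Solaris or any real init does it; the function names, the JSON format, and the version field are all assumptions made up for illustration.

```python
import json
import os
import tempfile

STATE_VERSION = 1  # hypothetical format version, invented for this sketch


def journal_state(path, state):
    """Atomically replace the state file at `path` with `state`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over
    # the real file, so a reader sees either the old state or the new one.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".init-state-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"version": STATE_VERSION, "state": state}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def load_state(path):
    """Return the journaled state, or None if nothing usable is there."""
    try:
        with open(path) as f:
            record = json.load(f)
    except (FileNotFoundError, ValueError):
        # Missing or corrupt journal: treat it as no state at all.
        return None
    if not isinstance(record, dict) or record.get("version") != STATE_VERSION:
        return None
    return record.get("state")


def startup_mode(path):
    """Decide whether this init should boot the system or resume."""
    state = load_state(path)
    if state is None:
        return "boot", {}
    return "resume", state
```

An init built this way would call journal_state() whenever its view of the world changes (for example, when it starts or reaps a child) and startup_mode() as the first thing it does. One design choice this sketch assumes is that the journal lives somewhere that is cleared on a real reboot, so that a freshly booted init finds no state and boots normally instead of trying to resume.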