2014-09-08
What an init system needs to do in the abstract
I've talked before about what init
does historically, but that's not the same thing as what an
init system actually needs to do, considered abstractly and divorced
from the historical paths that got us here and still influence how
we think about init systems. So, what does a modern init system in
a modern Unix need to do?
At the abstract level, I think a modern init system has three jobs:
- Being the central process on the system. This is both the modest
job of being PID 1 (inheriting parentless processes and reaping
them when they die) and the larger, more important job of supervising
and (re)starting any other components of the init system.
- Starting and stopping the system, and also transitioning it
between system states like single user and multiuser. The state
transition part has diminished in importance over the years; in practice
most systems today almost never transition between runlevels or the
equivalent except to boot or reboot.
(At one point people tried to draw a runlevel distinction between 'multiuser without networking' and 'multiuser with networking' and maybe 'console text logins' and 'graphical logins with X running' but today those distinctions are mostly created by stopping and starting daemons, perhaps abstracted through high level labels for collections of daemons.)
- Supervising (daemon) processes to start, stop, and restart them on
demand or need or whatever. This was once a sideline but has
become the major practical activity of an init system and why
people spend most of the time interacting with it. Today this
encompasses both regular getty processes (which die and restart
regularly) and a whole collection of daemons (which are often not
expected to die and may not be restarted automatically if they do).
You can split this job into two sorts of daemons, infrastructure processes that must be started in order for the core system to operate (and for other daemons to run sensibly) and service processes that ultimately just provide services to people using the machine. Service processes are often simpler to start, restart, and manage than infrastructure processes.
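To make the third job a bit more concrete, here is a minimal sketch of the restart decision in Python. It is not how any particular init system does it; the `supervise` function, the `policy` values, and the `max_runs` cap are all made up for illustration. The "always" policy is roughly the getty case (the process exits and is started again), while "never" is the daemon that is started once and not restarted if it dies.

```python
import subprocess
import sys


def supervise(argv, policy="always", max_runs=3):
    """Run argv in the foreground and decide whether to start it again.

    policy is a hypothetical knob: "always" restarts the process every
    time it exits, "never" runs it exactly once. max_runs caps the total
    number of runs so the sketch cannot loop forever.

    Returns the list of exit codes, one per run.
    """
    exit_codes = []
    while len(exit_codes) < max_runs:
        proc = subprocess.Popen(argv)
        exit_codes.append(proc.wait())  # block until this instance dies
        if policy == "never":
            break  # started once; its death is simply recorded
    return exit_codes
```

For example, `supervise([sys.executable, "-c", "raise SystemExit(2)"], policy="always", max_runs=3)` runs a stand-in "daemon" three times. A real supervisor would also throttle rapid restarts and deal with signals and logging, which this leaves out.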
In practice modern Unixes often add a fourth job, that of managing the appearance and disappearance of devices. This job is not strictly part of init but it is inextricably intertwined with at least booting the system (and sometimes shutting it down) and in a dependency-based init system it will often strongly influence what jobs/processes can be started or must be stopped at any given time (eg you start network configuration when the network device appears, you start filesystem mounts when devices appear, and so on).
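As a rough illustration of how device events can drive what gets started and stopped, here is a small Python sketch. Everything in it is an assumption for the example: the `Job` and `DeviceTracker` names, the job names, and the idea that a job's needs are just a set of device names. Real dependency-based init systems track much more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    needs: frozenset  # device names that must all be present


@dataclass
class DeviceTracker:
    jobs: list
    present: set = field(default_factory=set)
    running: set = field(default_factory=set)

    def device_added(self, device):
        """Record a new device; return names of jobs that can now start."""
        self.present.add(device)
        started = [j.name for j in self.jobs
                   if j.name not in self.running and j.needs <= self.present]
        self.running.update(started)
        return started

    def device_removed(self, device):
        """Record a removed device; return names of jobs that must stop."""
        self.present.discard(device)
        stopped = [j.name for j in self.jobs
                   if j.name in self.running and device in j.needs]
        self.running.difference_update(stopped)
        return stopped
```

With a hypothetical "network-config" job that needs `eth0` and a "mount-data" job that needs `sdb1`, calling `device_added("eth0")` reports only "network-config" as startable, and a later `device_removed("eth0")` reports it as needing to stop.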
The first job mostly or entirely requires being PID 1; at a minimum your PID 1 has to inherit and reap orphans. Since stopping and starting daemons and processes in general is a large part of booting and rebooting, the second and third jobs are closely intertwined in practice although you could in theory split them apart and that might simplify each side. The fourth job is historically managed by separate tools but often talks with the init system as a whole because it's a core dependency of the second and third jobs.
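The "inherit and reap orphans" minimum is small enough to sketch. The following Python is an illustration only, with a made-up function name; an ordinary process can only reap its own children, whereas a real PID 1 also receives reparented orphans and would typically run a loop like this whenever it gets SIGCHLD.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, code) pairs, where code is the exit status
    or the negated signal number if the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # there are no children at all
        if pid == 0:
            break  # children exist, but none have exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped
```

The non-blocking `os.WNOHANG` matters here because a central process cannot afford to sit in `waitpid()` while it has other work to do.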
(Booting and rebooting are often two conceptually separate steps, in that first you check filesystems and do other initial system setup and then you start a whole bunch of daemons (and in shutdown you stop a bunch of daemons and then tear down core OS bits). If you do this split, you might want to transfer responsibility for infrastructure daemons to the second job.)
The Unix world has multiple existence proofs that these roles do not all have to be embedded in a single PID 1 process and program. In particular there is a long history of (better) daemon supervision tools that people can and do use as replacements for their native init system's tools for this (often just for service daemons), and as I've mentioned Solaris's SMF splits the second and third roles out into a cascade of additional programs.