A linear, sequential boot and startup order is easier to deal with

November 12, 2021

By and large, the people who develop Unixes have had a long-standing interest in faster boot times. Fast boot times look and sound good, and because parallelizing actions during boot is one obvious way to speed it up, these faster boot times often come along with a more sophisticated view of service dependencies, which is useful at other times. Although Linux's systemd is the current poster child of this effort, the effort predates systemd; predecessors include Upstart (on Linux), various reinvented service management frameworks, and Solaris's SMF. There is only one small issue with all of this enthusiasm, one that is unfortunately often ignored or downplayed: a linear startup order is often easier to deal with.

A linear order is straightforward to see, understand, reason about, and generally to manipulate. It's easy to know what order things will happen in and have happened in, which avoids surprises during boot and helps diagnose problems afterward; you're much less likely to be left trying to sort out what happened when from boot time logs. It's nice to understand the dependencies of services when that information is reliable, but we have a great deal of evidence that taxonomy is hard for people, and dependencies are a form of taxonomy. When dependencies are inaccurate, they can be worse than knowing outright that you don't have that information in the first place.

A strictly linear and sequential boot order is not as fast as a parallelized boot. But it's a lot easier to set up, verify, and troubleshoot when something goes wrong. These are much less obvious virtues than a fast boot and can't be benchmarked or reduced to a simple number for comparison, but they are virtues.

(As a final strike against it, a linear, sequential boot system is boring. There is very little exciting development work to be done on it, and if you want to speed it up you're probably going to have a long boring slog of finding little delays here and there and trying to persuade people to convert shell scripts into programs.)

I like Linux's systemd in general, but I do miss having a linear, sequential boot order. It made my life easier, even if the common implementation of it in System V init style systems left a number of things to be desired.

(For instance, System V init could often make it difficult to re-order or insert services, and especially to manage this over time, package upgrades, and so on. Systemd's concept of drop-ins is quite valuable.)
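For illustration, here's a minimal sketch of using a drop-in (the 'myservice' unit name is hypothetical): it extends a packaged unit without editing the unit file itself, so a package upgrade that replaces the original unit leaves the local change alone.

   # Make the hypothetical myservice.service wait for the network to be
   # online before it starts, via a drop-in rather than editing the unit.
   mkdir -p /etc/systemd/system/myservice.service.d
   printf '[Unit]\nAfter=network-online.target\nWants=network-online.target\n' \
       >/etc/systemd/system/myservice.service.d/wait-for-network.conf
   systemctl daemon-reload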


Comments on this page:

I wrote my own, tiny init once. It was great -- stupid fast (better than every other system I've used), surprisingly easier than I thought (most of the difficulty was just in mounting stuff) and I got to choose which services/daemons/things needed to be started sequentially or backgrounded.

I now have a few interesting questions for the init worlds:

Why are gettys & graphical logins not started earlier?

I started my gettys as soon as all filesystems were mounted, ie really super early. I didn't need to wait for network interfaces, dhcp, cups, wpa_supplicant, etc.

This clearly won't work if you have a networked-login setup (like you probably have Chris). For those setups you definitely need it to wait until at least a default route is available. On a tangential note: I wonder if most login/auth/etc setups (PAM?) are smart enough to freeze/wait until they can contact their hosts, or if they just immediately error out.

I think there are a few other reasons why most distros start gettys & graphical login systems absolutely last:

(1) Popular graphical login managers probably depend on lots of services, like system-wide dbus, colour-management daemons and the like. They could be changed to avoid this, but that's unlikely to be a priority in the popular ones (over adding more features).

(2) On HDD based systems: logging in whilst system services were still starting up could feel like molasses. Do you remember what logging into XP systems was sometimes like? It was often better to wait a few minutes after booting before logging in, and then a few minutes after logging in before launching anything.

(3) The "what do I do with tty1" problem. Do I keep using it for init/service output, or start a getty on it? Or both and hope the user doesn't mind not seeing their input?

For this last item I cheated: I only started gettys on tty2 onwards. That's not what most users expect. If you had a graphical login manager, however, it would probably be perfectly fine to skip a getty on tty1 (as Xorg based login managers don't clobber or need your console text).

Would it be possible to provide an init+service manager with a manually configured sequence list?

Currently inits tend to work like this:

(1) User runs commands (or adds/deletes symlinks) to "add" and "remove" services from startup.

(2) System then tries to automagically sequence starting them in the right order.
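For example, "adding" a service currently looks something like this (using cron as the service and typical paths; the exact commands differ between init systems and distributions):

   ln -s ../init.d/cron /etc/rc2.d/S20cron   # raw System V style: add a start symlink
   update-rc.d cron defaults                 # Debian's helper that manages those symlinks
   systemctl enable cron.service             # the systemd equivalent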

Would it be possible to instead do this:

(1) User edits a plaintext file that lists the daemons in order, with explicit instructions to wait for specific things. Eg:

do load_modules
do mount_all
do remount_root # Depends on your environment
waitfor network_interfaces 30sec
start dhcpcd
start crond
start dbus-system
start haveged
waitfor network_ipaddr 30sec
waitfor service_dbus_up 30sec
start gettys
start graphicloginmanager

"Do" lines would be run sequentially, "start" lines would run things in parallel and "waitfor" lines would block all execution until the condition is met (or the optional timeout is hit).

Come to think of it, this basically was how I ran my init, but as a raw shell script instead of some fancy syntax. The "do"s were just linear lines of code. The "waitfor"s were just loops checking a condition, sleeping one second if it failed. The "start"s were just launching daemons (in a hands-off '&' backgrounded approach).
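A minimal sketch of that shape in plain shell (the daemon paths, the gettys, and the 'have_ipaddr' readiness check are illustrative, not taken from any particular distribution):

   #!/bin/sh
   # "waitfor": poll a condition once a second until it holds or a timeout passes.
   waitfor() {
       _check=$1 _limit=$2 _waited=0
       until $_check; do
           sleep 1
           _waited=$((_waited + 1))
           [ "$_waited" -ge "$_limit" ] && return 1
       done
   }

   # Illustrative readiness check: do we have any global IP address yet?
   have_ipaddr() { ip -o addr show scope global | grep -q .; }

   mount -a                       # "do": plain sequential commands
   /usr/sbin/dhcpcd &             # "start": launch the daemon and move on
   /usr/sbin/crond &
   waitfor have_ipaddr 30         # block until ready, or give up after 30 seconds
   for tty in tty2 tty3 tty4; do  # gettys from tty2 onwards, as described above
       /sbin/agetty "$tty" 38400 linux &
   done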

A lot of the complexity and unintended misbehaviors of modern init systems seem to stem from their desire to automate everything, which is not a task that can ever be completely perfected (due to the changing natures of daemons). This automation is very attractive from a distro developer point of view (people can just install packages and everything mostly still works). Getting users to sequence startup themselves can lead to effective results, but that's too complex and niche to expect all of your users to be able to handle.

I wonder if it would be possible for systemd to provide a debugging mode where there is no parallelism at boot and shutdown times?

By Walex at 2021-11-13 09:06:53:

Easier to understand and profile and debug is a big deal, and there is an ancient article by P Denning about that: https://dl.acm.org/doi/10.1145/1041613.1041615

As to boot, in a previous comment I mentioned that the traditional UNIX thing was the compromise of not a strictly sequential order, but a sequential order of "phases" (a.k.a. "runlevels").

The problem with that is large servers with complex configurations, and ironically that is where the 'systemd' etc. approach of boot parallelism and "pull" setup is most useful: systems with dozens of devices, dozens of service daemons, lots of network resources. Both because fully initializing them takes a long time and because the dependencies are really complex to linearize. To some extent that problem is also addressed by "systemd for microservices" frameworks like Kubernetes.

But on a general level, a bunch of clever clowns have already made the GNU/Linux boot insufferably complex, and then have proceeded to simplify it the wrong way:

  • The server boots the EFI OS and shell, mounting the EFI root, and once the EFI OS is fully set up, it runs a script that starts the GRUB phase 1 bootloader.
  • GRUB phase 1 loads from somewhere the 1.5 stage, which is the GRUB kernel, and that loads the GRUB phase 2, the GRUB shell, which then loads the GRUB kernel modules. GRUB is an OS whose kernel has a couple hundred modules.
  • Once the GRUB OS is fully set up, it runs a script that loads the GNU/Linux 'initramfs' kernel and root filesystem, and mounts that and initializes it, its modules, and the shell (usually BusyBox or SASH).
  • A script in the Linux 'initramfs' OS then finds the GNU/Linux OS root, and switches to that and initializes it.

The typical GNU/Linux boot therefore involves four operating systems, and is very fragile, leading many GNU/Linux users to adopt the ancient and traditional MS-Windows strategy of "reinstall from scratch" in case of boot issues.

BTW the wrong way to simplify this has been to merge the '/' and '/usr/' parts of the GNU/Linux 4th OS, when instead the "obvious" thing to do would be to eliminate the usually unnecessary 'initramfs' OS, which is just a special case of what was '/' under UNIX.

Given that complexity, the further parallelism complexity of 'systemd' or Upstart etc. is somewhat less complex and fragile, because at least it happens in a full-functionality UNIX environment.

I just wish that 'systemd' had been written around states and events rather than tasks, and were less monolithic. Or that the default Upstart rules had been written pull-wise instead of push-wise (in which case Upstart might have survived). Plus neither 'init' system covers well the common case of relatively tightly related but not single-image server complexes.

By cks at 2021-11-13 18:22:32:

The BSD init system did not have a concept of runlevels and always booted more or less sequentially. As for System V init and its concept of runlevels, your description does not match my memories. Within a runlevel, System V init was normally sequential and linear, and it did not start multiple runlevels at once. A System V system booting normally ran everything in a special early boot runlevel sequentially, and then ran everything sequentially in either runlevel 3 or runlevel 5, depending.

The early boot runlevel did very basic core things like fsck filesystems. Some vendors may have parallelized portions of this within the scripts involved, but they were always closed, vendor specific black boxes; system administrators did not add things to the early boot runlevel unless they enjoyed explosions. For the purposes of diagnosing boot problems, either the entire early boot runlevel succeeded and you proceeded on to the understandable process of runlevel 3 (or 5), or it failed and you tried to figure out what had gone bad in your disks and core filesystems.

(I happen to have a very old Linux root filesystem handy, so I can say that in it, the early boot process is not even a runlevel; it's a special 'sysinit' ('si') action in /etc/inittab that just runs a script. The system proceeded through sysinit and then ran all of one runlevel's scripts in sequence.)
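(For concreteness, that style of /etc/inittab looked roughly like this; the exact script paths varied between distributions and over time:

   si::sysinit:/etc/rc.d/rc.sysinit
   id:3:initdefault:
   l3:3:wait:/etc/rc.d/rc 3

The 'si' line runs one setup script, and then the single 'wait' entry for the default runlevel runs that runlevel's scripts one after another.)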

«The BSD init system did not have a concept of runlevels and always booted more or less sequentially.»

The BSD init was pretty much the same as the UNIX init and it had at least three phases:

  • Early boot (mostly devices).
  • Single user mode (mostly filesystems).
  • Local multi user mode (mostly local daemons).

To which "Networked multi user mode" (mostly network daemons) was eventually added, sometimes between "single" and "multi", sometimes after "multi",

The logic was that each "phase" could rely on the preceding phase being fully completed (whether sequential or parallel) before it, therefore there was no need of fine grained dependencies.

«As for System V init and its concept of runlevels, your description does not match my memories.»

Most people's memories of System V 'init' are fuzzy, because it was misunderstood quite a lot. That 'init' was driven by 'inittab', and was in essence a process supervisor with a number of "run states" which were misnamed "runlevels" for similarity with the UNIX/BSD runlevels. The peculiarity of these states was that unlike the UNIX/BSD runlevels, which were organized "hierarchically" ("early" => "single" => "multi" => "network"), states could be switched from and to arbitrarily, so state 1 could be followed by state 5 then state 3 then state 4 (but note the special 'boot', 'bootwait', 'sysinit' actions that define early "phases").

«Within a runlevel, System V init was normally sequential and linear»

The usual 'inittab' action for daemons was 'once' or 'respawn', to launch background processes in parallel. Only the 'bootwait' and 'wait' actions involved sequential starting, and were mostly used for early phases.

http://www-it.desy.de/cgi-bin/man-cgi?telinit+1 http://www-it.desy.de/cgi-bin/man-cgi?inittab+5

Some of the misconceptions about System V 'init' come from its default configuration in most GNU/Linux distributions to simulate the UNIX/BSD hierarchical runlevel model, with later scripts (themselves often sequential) launched with 'wait', as in:

https://www.cyberciti.biz/howto/question/man/inittab-man-page.php

   # Runlevel 0 is halt.
   # Runlevel 1 is single-user.
   # Runlevels 2-5 are multi-user.
   # Runlevel 6 is reboot.

   l0:0:wait:/etc/init.d/rc 0
   l1:1:wait:/etc/init.d/rc 1
   l2:2:wait:/etc/init.d/rc 2
   l3:3:wait:/etc/init.d/rc 3
   l4:4:wait:/etc/init.d/rc 4
   l5:5:wait:/etc/init.d/rc 5
   l6:6:wait:/etc/init.d/rc 6

For a different, more UNIXy version:

https://books.google.de/books?id=0RUlaBWtFdUC&pg=PA298

   init:3:initdefault:
   ioin::sysinit:/sbin/ioinitrc >/dev/console 2>&1
   tape::sysinit:/sbin/mtinit > /dev/console 2>&1
   muxi::sysinit:/sbin/dasetup </dev/console >/dev/console 2>&1 # mux init
   stty::sysinit:/sbin/stty 9600 clocal icanon echo opost onlcr ixon icrnl ignpar </dev/systty
   brc1::bootwait:/sbin/bcheckrc </dev/console >/dev/console 2>&1 # fsck, etc.
   link::wait:/sbin/sh -c "/sbin/rm -f /dev/syscon; \
   /sbin/ln /dev/systty /dev/syscon" >>/dev/console 2>&1
   cprt::bootwait:/sbin/cat /etc/copyright >/dev/syscon   # legal req
   sqnc::wait:/sbin/rc </dev/console >/dev/console 2>&1   # init
   #powf::powerwait:/sbin/powerfail >/dev/console 2>&1    # powerfail
   cons:123456:respawn:/usr/sbin/getty console console    # system console
   #ttp1:234:respawn:/usr/sbin/getty -h tty0p1 9600
   #ttp2:234:respawn:/usr/sbin/getty -h tty0p2 9600
   #ttp3:234:respawn:/usr/sbin/getty -h tty0p3 9600
   #ttp4:234:respawn:/usr/sbin/getty -h tty0p4 9600
   #ttp5:234:respawn:/usr/sbin/getty -h tty0p5 9600
   #ups::respawn:rtprio 0 /usr/lbin/ups_mond -f /etc/ups_conf

The "native" System V 'inittab' was arguably nicer than its hybridization with the early UNIX/BSD "runelevel" scripts, but it had the usual great flaw, that it supervised processes without a service state model; plus that is had the usual "phase" approach applied to "run states", instead of more fine grained dependencies among services.

By dozzie at 2021-11-14 07:49:40:

Frankly, parallel booting of a server was always a dumb idea. We're shaving off some 15 seconds from a process that takes 45 seconds, but the whole machine boots in 5+ minutes because of POST and various BIOSes (NIC, RAID controller, etc.), and we're paying for it with predictability.

By Walex at 2021-11-14 14:07:13:

«Frankly, parallel booting of a server was always a dumb idea.»

For the vast majority of servers (as in 1U-4U servers with web hosting or similar workloads), indeed, but not literally “always”.

Some people have servers with hundreds of disk drives, for example, or complicated topologies, network setups, application dependencies, huge filesystem instances, especially rich "enterprise" people, and they can take a long time to fully initialize, so getting to at least a partial live state quickly may be rather useful, never mind parallelizing.

«but the whole machine boots in 5+ minutes because of POST and various BIOS-es (NIC, RAID drive, etc.)»

Oh that is so annoying. Then someone could argue that a potential solution would be to run 'systemd' in the BIOS, in the ILOM/iDRAC/IPMI thingie, and even in the attached network switch etc., ideally up to the dam turbines supplying the power to the site, so system startup can trigger the setup, if needed, of all upstream infrastructure dependencies it needs, all in parallel of course. ;-)

By Walex at 2021-11-14 14:37:58:

«so system startup can trigger the setup, if needed, of all upstream infrastructure dependencies it needs»

This relates to there being two distinct reasons why 'systemd' tends to absorb everything else, one good and one bad:

  • The good ambition of managing all dependencies, so starting say 'apache' triggers bringing up a container which triggers building a Docker image which triggers compiling the Apache sources, which triggers downloading them which triggers mounting the filesystems and bringing up the network etc. etc. etc.
  • The bad decision to manage dependencies among tasks rather than events or (better) states, so 'systemd' or plugins/extensions integrated directly into 'systemd' end up doing those tasks.

Also I have realized that in talking of "phases" and parallelism I was not quite clear so perhaps there have been misunderstandings:

  • Dependencies and parallelism are distinct concepts: expressing dependencies allows but does not mandate parallelism. A service supervisor can use dependencies to either serialize (if possible) or parallelize service startup and shutdown. In particular, textual sequential startup is not to be confused with time-wise sequential startup.
  • Dependencies can be fine grained, from one service to another or organized in coarse "phases", where each service is assigned to a "phase" and it is guaranteed that all services in one phase are initialized before all services in the next "phase", regardless of whether the services in any one phase are started in parallel or sequentially.
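A toy shell illustration of that "phase" guarantee (the service names are purely illustrative): everything inside a phase may be launched in parallel, but the next phase begins only after every launcher from the previous phase has exited (for traditional forking daemons, that is roughly "done initializing").

   # phase: local services, started in parallel
   /usr/sbin/syslogd &
   /usr/sbin/crond &
   wait    # the next phase starts only after these have exited

   # phase: network-facing services, which may assume all local services are up
   /usr/sbin/sshd &
   /usr/sbin/httpd &
   wait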