Why a (Linux) service delaying its shutdown is a bad thing
Over on Twitter, I said something:
Every Linux daemon that refuses to stop during a reboot for "good reasons" needs to understand that it's delaying the system's return to service by a minute and a half (systemd's timeout) or more. When that's my desktop, I get quite angry with the daemon. Hi, PackageKit.
When I type 'reboot' (or invoke the equivalent in a GUI), my
machine immediately goes out of service. My desktop session ends
if there is one, I and everyone else on the machine get logged off,
daemons start dying left and right, et cetera. The machine will
only come back into service when it completes both the shutdown and
the boot that follows it. One part of the delay is how fast the
machine boots. The other part is how fast the machine shuts down.
People pay a lot of attention to how long it takes to boot a system.
They pay much less attention to how long it takes to shut one down,
despite this often being a good portion of the practical return to
service time.
One of the things that goes wrong in shutdown on systemd based
systems is when some daemon (more generally, some service or even
a session) refuses to shut down immediately. On systemd based
systems, things that don't shut down trigger what is by default a
90 second timeout (a default that is set in systemd's system.conf).
Systemd will wait this long before forcefully killing the service's
processes and letting the reboot continue. In other words, the reboot
takes an extra minute and a half (at least), so your machine is out of
service for an extra minute and a half (at least).
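
To make the mechanics concrete, here is a minimal Python sketch of the general "ask politely, wait, then kill" pattern described above. This is an illustration and not systemd's actual implementation; the helper names `stop_with_timeout` and `spawn` are hypothetical, and the stand-in "service" is just a child Python process that sleeps.

```python
import subprocess
import sys
import time


def stop_with_timeout(proc: subprocess.Popen, timeout: float) -> tuple:
    """Ask proc to stop, wait up to `timeout` seconds, then kill it.

    Returns (how, seconds_waited), where how is "exited" or "killed".
    """
    start = time.monotonic()
    proc.terminate()  # the polite request: SIGTERM on Unix
    try:
        proc.wait(timeout=timeout)
        how = "exited"
    except subprocess.TimeoutExpired:
        proc.kill()  # the forceful fallback: SIGKILL on Unix
        proc.wait()
        how = "killed"
    if proc.stdout is not None:
        proc.stdout.close()
    return how, time.monotonic() - start


def spawn(ignore_sigterm: bool) -> subprocess.Popen:
    """Start a hypothetical stand-in 'service' that sleeps for a minute."""
    code = "import signal, time\n"
    if ignore_sigterm:
        # A badly behaved service: it refuses to stop when asked.
        code += "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    code += "print('ready', flush=True)\ntime.sleep(60)\n"
    proc = subprocess.Popen(
        [sys.executable, "-c", code], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # wait until the child has finished its setup
    return proc
```

Calling `stop_with_timeout(spawn(ignore_sigterm=True), timeout=90)` would leave you waiting the full 90 seconds before the child is killed, which is the delay you sit through during a reboot. A child that honors the stop request returns almost immediately.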
At one level this is not really systemd's fault. Systemd is not causing
the service to be slow to stop; instead, systemd is unusual in init
systems in that it actually checks to see if the service really has
stopped. In the old System V style Linux init system,
init ran each /etc/init.d/<whatever> script with a 'stop' argument and assumed
that when the script exited, the service had shut down. If this wasn't
the case, init mostly didn't stop to notice; when the system was
actually rebooted, those remaining processes generally got terminated
very abruptly by the kernel. People did notice and complain about init
scripts that had slow 'stop' actions (and so those mostly got fixed),
but they didn't notice lingering processes.
(When you tell the Linux kernel to reboot, it takes you at your word.)
Services that take more than a few seconds to shut themselves down, especially in ordinary operation, have a bug. One reason this is a bug is that there's absolutely no guarantee that you have very much time before the system as a whole goes down, for example because the UPS battery power is about to run out. Also, the systemd timeout can be set much lower (and some people do this), so your processes can be abruptly terminated after a short time even in ordinary circumstances. Finally, slow service shutdowns delay the system's return to service (and leave people drumming their fingers, unable to do anything with their machine because it has functionally hung).
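
As an illustration of what stopping promptly can look like, here is a minimal Python sketch of a daemon main loop that reacts to SIGTERM right away. It assumes the work can be split into small, bounded units; the names `handle_sigterm`, `run` and `main` are hypothetical, and a real service would do its (fast) cleanup where the comment indicates.

```python
import signal
import threading
import time

# Set when the process has been asked to stop.
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Only record the request; the main loop does the actual stopping."""
    stop_requested.set()


def run(work_interval: float = 1.0) -> int:
    """Do periodic work until asked to stop; return the number of cycles."""
    cycles = 0
    while not stop_requested.is_set():
        cycles += 1  # placeholder for one small, bounded unit of real work
        # Wait on the event instead of calling time.sleep(), so that a
        # stop request ends the wait immediately.
        stop_requested.wait(timeout=work_interval)
    # Any cleanup here should be quick; the system may not wait for long.
    return cycles


def main() -> None:
    """Entry point for a real daemon: install handlers, then run."""
    signal.signal(signal.SIGTERM, handle_sigterm)
    signal.signal(signal.SIGINT, handle_sigterm)
    run()
```

The design choice that matters is that nothing in the loop blocks for longer than one unit of work without checking for a stop request, so the time between the signal arriving and the process exiting stays short no matter how long `work_interval` is.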
(I've written about this general shutdown delay issue before in SystemdRebootIrritation, but that wasn't focused on badly behaved services.)