A clever way of killing groups of processes
While reading parts of the systemd source code that handle late stage shutdown, I ran across an oddity in the code that's used to kill all remaining processes. A simplified version of the code looks like this:
void broadcast_signal(int sig, [...]) {
    [...]
    kill(-1, SIGSTOP);

    killall(sig, pids, send_sighup);

    kill(-1, SIGCONT);
    [...]
}
(I've removed error checking and some other things; you can see the original here.)
This is called to send signals like SIGTERM and SIGKILL to everything. At first the use of SIGSTOP and SIGCONT puzzled me, and I wondered if there was some special behavior in Linux if you SIGTERM'd a SIGSTOP'd process. Then the penny dropped; by SIGSTOPing processes first, we're avoiding any thundering herd problems when processes start dying.
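To experiment with the same stop, signal, continue sequence without touching every process on the machine, it can be aimed at a single process group. The following is only a sketch under that assumption; signal_group_frozen() and demo_frozen_term() are hypothetical names of mine, not anything from systemd.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: freeze every process in group `pgid`, send `sig`,
   then thaw the group. Returns 0 on success, -1 on failure. */
int signal_group_frozen(pid_t pgid, int sig)
{
    if (kill(-pgid, SIGSTOP) < 0)
        return -1;

    int rc = kill(-pgid, sig);

    /* Always try to thaw, even if sending `sig` failed. ESRCH here only
       means the whole group is already gone, which is not an error. */
    if (kill(-pgid, SIGCONT) < 0 && errno != ESRCH)
        return -1;
    return rc;
}

/* Hypothetical demo: put a blocked child in its own process group, signal
   the group with SIGTERM, and return the signal that terminated the child
   (or -1 if something went wrong). */
int demo_frozen_term(void)
{
    int status;
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        setpgid(0, 0);              /* become leader of a new group */
        signal(SIGTERM, SIG_DFL);   /* make sure SIGTERM is fatal */
        for (;;)
            pause();
    }
    /* Set the group from the parent too, so it exists before we signal it. */
    setpgid(child, child);

    if (signal_group_frozen(child, SIGTERM) < 0) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Using a negative process group ID instead of -1 keeps the experiment contained; the real shutdown code has no such concern because by then it wants to reach everything.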
Even if you use kill(-1, <signal>), the kernel doesn't necessarily guarantee that all processes will receive the signal at once before any of them are scheduled. So imagine you have a shell pipeline that's remained intact all the way into late-stage shutdown, and all of the processes involved in it are blocked:
proc1 | proc2 | proc3 | proc4 | proc5
It's perfectly valid for the kernel to deliver a SIGTERM to proc1, immediately kill the process because it has no signal handler, close proc1's standard output pipe as part of process termination, and then wake up proc2 because now its standard input has hit end-of-file, even though either you or the kernel will very soon send proc2 its own SIGTERM signal that will cause it to die in turn. This and similar cases, such as a parent waiting for children to exit, can easily lead to highly unproductive system thrashing as processes are woken up unnecessarily. And if a process has a SIGTERM signal handler, the kernel will of course schedule it to wake up and may start it running immediately, especially on a multi-core system.
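The pipeline wake-up is easy to reproduce in miniature. In this sketch the child stands in for proc1 and the parent for proc2; read_after_writer_killed() is a hypothetical name, and the whole thing is an illustration of the end-of-file effect rather than a model of what happens during a real shutdown.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns what read() reports on the pipe once the writer has been killed:
   0 means end-of-file, -1 means an error occurred. */
ssize_t read_after_writer_killed(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t writer = fork();
    if (writer < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (writer == 0) {
        /* Stand-in for proc1: holds the write end and blocks forever. */
        close(fds[0]);
        signal(SIGTERM, SIG_DFL);   /* no handler, so SIGTERM is fatal */
        for (;;)
            pause();
    }

    /* Stand-in for proc2: drop our copy of the write end, otherwise the
       pipe can never reach end-of-file. */
    close(fds[1]);

    kill(writer, SIGTERM);
    waitpid(writer, NULL, 0);   /* the writer's descriptors are closed now */

    /* This is the wake-up described above: the reader is handed
       end-of-file purely because its neighbour died. */
    char buf[16];
    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    return n;
}
```

In a real pipeline proc2 would then run whatever end-of-input logic it has, which is exactly the wasted work that freezing everything first is meant to prevent.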
Sending everyone a SIGSTOP before the real signal completely avoids this. With all processes suspended, all of them will get your signal before any of them can wake up from other causes. If they're going to die from the signal, they'll die on the spot; if they're not going to die (because you're starting with SIGTERM or SIGHUP and they block or handle it), they'll only get woken up at the end, after most of the dust has settled. It's a great solution to a subtle issue.
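The "only woken up at the end" half of this can be checked for a process that handles SIGTERM. The sketch below is my own test harness, not systemd code: handler_waits_for_sigcont() and on_sigterm() are hypothetical names, and the 200 millisecond wait is an arbitrary value chosen only to give a misbehaving handler time to show itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int notify_fd = -1;   /* write end of the pipe, used by the handler */

static void on_sigterm(int sig)
{
    (void)sig;
    /* write() is async-signal-safe, so it is allowed inside a handler. */
    ssize_t r = write(notify_fd, "T", 1);
    (void)r;
}

/* Returns 1 if the child's SIGTERM handler ran only after SIGCONT, 0 if
   something arrived while the child was still stopped, -1 on setup errors. */
int handler_waits_for_sigcont(void)
{
    int fds[2], status, result = -1;
    char c;

    if (pipe(fds) < 0)
        return -1;
    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        struct sigaction sa = {0};
        close(fds[0]);
        notify_fd = fds[1];
        sa.sa_handler = on_sigterm;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGTERM, &sa, NULL);
        ssize_t r = write(notify_fd, "R", 1);   /* tell the parent: ready */
        (void)r;
        for (;;)
            pause();
    }
    close(fds[1]);

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    if (read(fds[0], &c, 1) == 1 && c == 'R' &&
        kill(child, SIGSTOP) == 0 &&
        waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status) &&
        kill(child, SIGTERM) == 0) {
        /* The child is stopped with SIGTERM pending; give the handler a
           chance to (wrongly) run before we continue the child. */
        int early = poll(&pfd, 1, 200);
        if (early == 0) {
            kill(child, SIGCONT);
            if (read(fds[0], &c, 1) == 1 && c == 'T')
                result = 1;
        } else if (early > 0) {
            result = 0;
        }
    }

    kill(child, SIGKILL);       /* clean up regardless of the outcome */
    waitpid(child, &status, 0);
    close(fds[0]);
    return result;
}
```

The parent confirms the stop with waitpid() and WUNTRACED before sending SIGTERM, so the check does not depend on guessing how long SIGSTOP takes to land.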
(If you're sending SIGKILL to everyone, most or all of them will never wake up; they'll all be terminated unless something terrible has gone wrong. This means this SIGSTOP trick avoids ever having any of the processes run; you freeze them all and then they die quietly. This is exactly what you want to happen at the end of system shutdown.)