Some uses for SIGSTOP and some cautions
If you ask, many people will tell you that Unix doesn't have a
general mechanism for suspending processes and later resuming them.
These people are correct in general, but sometimes you can cheat
and get away with a good enough substitute. That substitute is
SIGSTOP, which is at the core of job control.
Although processes can catch and react to other job control signals,
SIGSTOP, like SIGKILL (aka 'kill -9'), cannot be caught, blocked, or
ignored. When a process is sent SIGSTOP, the kernel stops it on the
spot and keeps it suspended until it gets a SIGCONT (more or less).
You can thus pause processes and continue them by manually sending
them SIGSTOP and SIGCONT as appropriate and desired.
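As a minimal sketch of that pause-and-continue cycle in Python: the names `pause`, `resume` and `demo` are made up for illustration, and the sleeping child is only a stand-in for a real workload. The `os.waitpid()` check with `os.WUNTRACED` works here only because the demo is the parent of the process it stops.

```python
import os
import signal
import subprocess
import sys


def pause(pid: int) -> None:
    """Suspend a process; SIGSTOP cannot be caught, blocked, or ignored."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid: int) -> None:
    """Let a stopped process run again."""
    os.kill(pid, signal.SIGCONT)


def demo() -> bool:
    # Hypothetical stand-in workload: a child Python that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        pause(child.pid)
        # WUNTRACED makes waitpid() also report children that have stopped.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGSTOP
        resume(child.pid)
        return stopped
    finally:
        # Clean up the stand-in workload whatever happened above.
        child.kill()
        child.wait()
```

For a process you did not start yourself, `pause(pid)` and `resume(pid)` are all you get; you have to find some other way to see whether it is actually stopped.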
(Since it's a regular signal, you can use a number of standard
mechanisms to send SIGSTOP to an entire process group or all of
a user's processes at once.)
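One of those standard mechanisms is signalling a whole process group, which Python exposes as `os.killpg()`. The sketch below is only an illustration: `signal_group` and `demo` are invented names, and the child is started in its own session purely so that the group-wide SIGSTOP does not also stop the script that sends it.

```python
import os
import signal
import subprocess
import sys


def signal_group(pgid: int, sig: int) -> None:
    """Deliver sig to every process in the process group pgid."""
    os.killpg(pgid, sig)


def demo() -> bool:
    # Hypothetical workload. start_new_session=True gives the child its own
    # session and process group, so the group-wide SIGSTOP misses this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=True,
    )
    pgid = os.getpgid(child.pid)
    try:
        signal_group(pgid, signal.SIGSTOP)
        # As the parent we can confirm that the group leader really stopped.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        signal_group(pgid, signal.SIGCONT)
        return os.WIFSTOPPED(status)
    finally:
        signal_group(pgid, signal.SIGKILL)
        child.wait()
```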
There are any number of uses for this. Do you have too many processes banging away on the disk (or just think you might)? You can stop some of them for a while. Is a process saturating your limited network bandwidth? Pause it while you get a word in edgewise. And so on. Basically this is more or less job control for relatively arbitrary user processes, as you might expect.
Unfortunately there are some cautions and limitations attached to
use of SIGSTOP on arbitrary processes. The first one is
straightforward: if you SIGSTOP something that is talking to the
network or to other processes, its connections may break if you
leave it stopped too long. The other processes don't magically know
that the first process has been suspended and so they should let
it be, and many of them will have limits on how much data they'll
queue up or how long they'll wait for responses and the like. Hit
the limits and they'll assume something has gone wrong and cut your
suspended process off.
(The good news is that it will be application processes that do
this, and only if they go out of their way to have timeouts and
other limits. The kernel is perfectly happy to leave things be for
however long you want to wait before a SIGCONT.)
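To make the timeout problem concrete, here is a small sketch in which the script plays an impatient client and a child process plays the peer that gets stopped. Everything in it is hypothetical (the echo peer, the `ask` and `await_reply` helpers, the specific timeouts); it only shows that the kernel keeps the request queued in the pipe while the application-level timeout fires anyway.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical peer: echoes every line it reads on stdin back to stdout.
ECHO = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)


def await_reply(child: subprocess.Popen, timeout: float) -> bool:
    """Wait up to timeout seconds for a reply and consume it if one arrives."""
    ready, _, _ = select.select([child.stdout], [], [], timeout)
    if ready:
        os.read(child.stdout.fileno(), 4096)
    return bool(ready)


def ask(child: subprocess.Popen, timeout: float) -> bool:
    """Send one request, then wait for the answer with a timeout."""
    child.stdin.write(b"ping\n")
    return await_reply(child, timeout)


def demo() -> tuple:
    child = subprocess.Popen(
        [sys.executable, "-c", ECHO],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,  # unbuffered pipes, so writes go straight to the kernel
    )
    try:
        before = ask(child, 10.0)            # peer is running and answers
        os.kill(child.pid, signal.SIGSTOP)
        os.waitpid(child.pid, os.WUNTRACED)  # block until the stop took effect
        during = ask(child, 0.5)             # the client's patience runs out
        os.kill(child.pid, signal.SIGCONT)
        after = await_reply(child, 10.0)     # the queued request is answered late
        return before, during, after
    finally:
        child.kill()
        child.wait()
        child.stdin.close()
        child.stdout.close()
```

A real client would probably close the connection after the failed `ask`, at which point the late answer has nowhere to go.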
The other issue is that some processes will detect and react to one
of their children being hit with a SIGSTOP. They may SIGCONT
the child or they may kill the process outright; in either case
it's probably not what you wanted to happen. Generally you're safest
when the parent of the process you want to pause is something simple,
like a shell script. In particular, init (PID 1) is historically
somewhat touchy about SIGSTOP'd processes and may often either
SIGCONT them or kill them rather than leave them be. This is
especially likely if init inherits a SIGSTOP'd process because
its original parent process died.
(This is actually relatively sensible behavior to avoid init
having a slowly growing flock of orphaned SIGSTOP'd processes
hanging around.)
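How a parent notices a stopped child is not mysterious: `waitpid()` with `WUNTRACED` reports it. The sketch below shows the two reactions described above in a hypothetical parent. The `react_to_child` function and its policy names are invented for illustration, and this is not a claim about how any particular init or supervisor is implemented.

```python
import os
import signal
import subprocess
import sys


def react_to_child(pid: int, policy: str = "continue") -> str:
    """Block until child pid changes state, then apply a hypothetical policy."""
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        return "exited"
    if policy == "continue":
        os.kill(pid, signal.SIGCONT)  # undo the outsider's pause
        return "continued"
    os.kill(pid, signal.SIGKILL)      # or treat a stopped child as broken
    return "killed"


def demo(policy: str) -> str:
    # Hypothetical stand-in workload: a child Python that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    os.kill(child.pid, signal.SIGSTOP)  # the outsider's attempt to pause it
    try:
        return react_to_child(child.pid, policy)
    finally:
        # Clean up (and reap) the child whichever policy was applied.
        child.kill()
        child.wait()
```

Either way, whoever sent the SIGSTOP is not told about it; the process is simply running again or gone.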
These issues, especially the second, are why I say that SIGSTOP
is not a general mechanism for suspending processes. It's a mechanism
and on one level it always works, but the problem is the potential
side effects and aftereffects. You can't just SIGSTOP an arbitrary
process and be confident that it will still be there to be continued
ten minutes later (much less over longer time intervals). Sometimes
or often you'll get away with it, but every so often you won't.