Groups of processes are a frequent and fundamental thing in Unix

October 22, 2019

Recently, I wrote about a gotcha when catching Control-C in programs that are run from scripts, where things could go wrong because the Control-C was delivered not just to the program but also to the shell script, which wasn't expecting it (while the program was). Since that entry focused on a gotcha involving this group signalling behavior, you might come away with the impression that this behavior of Unix signals is a wart in Unix. In fact, it's not; that signals from things like Control-C behave this way is an important part of Unix shell usability.

The core reason for this is that in Unix, it's very common for a group of processes to be one entity as far as you're concerned. Unix likes processes and it likes assembling things out of groups and trees of processes, so you wind up with what people think of as one entity that is actually composed of multiple processes. When you type a Control-C, you almost always want to operate on the entity as a whole, not any specific process in it, and Unix supports this by sending terminal signals to its best guess at the group of processes that are one thing: the terminal's foreground process group.

That sounds pretty abstract, so let's make it concrete. One simple case of a group of processes acting as one entity is the shell pipeline:

$ prog1 <somefile | prog2 | prog3 | prog4

If you type a Control-C, almost everyone wants the entire pipeline to be interrupted and exit. It's not sufficient for the kernel to signal just one process, let it exit, and hope that this causes all of the others to hit pipe I/O errors, because one of those programs (say prog2) could be engaged in a long, slow computation before it next reads from or writes to a pipe.

(As a sysadmin, one of my common cases here is 'fgrep some-pattern big-file | tail -10', and then if it takes too long I get impatient and Ctrl-C the whole thing.)
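You can watch this grouping happen. As a sketch (the exact ps output format varies between Unixes, and the PIDs here are invented for illustration), start a pipeline in the background and ask ps for process group IDs:

$ sleep 300 | sleep 300 | sleep 300 &
$ ps -o pid,pgid,comm
  PID  PGID COMMAND
 4242  4242 bash
 4301  4301 sleep
 4302  4301 sleep
 4303  4301 sleep
 4304  4304 ps

All three sleeps share a process group ID (4301 here) that is separate from the shell's own. When a pipeline like this runs in the foreground and you type Ctrl-C, the kernel sends SIGINT to every process in that group at once.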

Shell scripts are another obvious case; since the shell is a relatively limited language, almost all shell scripts run plenty of external programs even when they're not using pipes. That creates at least two processes (the shell running the script and the external program), and again when you Ctrl-C the command, you want both of them to be interrupted.
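To make this concrete, here is a trivial script (the name and the sleep duration are made up for the example):

$ cat waiter.sh
#!/bin/sh
# The shell running this script and the external sleep are
# two separate processes, both in the terminal's foreground
# process group.
sleep 600
echo "done sleeping"

If you run waiter.sh and type Ctrl-C, both the sleep and the script's shell get SIGINT; the sleep dies and the shell exits without ever reaching the echo. If the kernel had signalled only the sleep, the script would have cheerfully carried on as if nothing had happened.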

A final common case for a certain sort of person is running make. Especially for large programs, a make run can create quite deep trees of processes (and go through quite a lot of them). And again, if you Ctrl-C your make, you want everything to be interrupted (and promptly).

(Unix could delegate this responsibility to a single process in these situations, such as the master shell process for a shell script or make itself. But for much the same reason that basic terminal line editing belongs in the kernel, Unix opts to have the kernel do it.)
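You can also do this group signalling by hand; giving kill a negative number signals an entire process group instead of a single process (the process group ID here is invented, and the '--' is needed so the negative number isn't parsed as an option):

$ kill -INT -- -4301

This is effectively what the kernel's terminal handling does for you on every Ctrl-C, aimed at the terminal's current foreground process group.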
