How job control made the SIGCHLD signal useful for (BSD) Unix

January 3, 2020

On modern versions of Unix, the SIGCHLD signal is sent to a process when one of its child processes terminates or some other status changes happen. Catching SIGCHLD is a handy way to find out about child processes exiting while your program is doing other things, and ignoring it will make them vanish instead of turning into zombies. Despite all of these useful things, SIGCHLD was not in V7 Unix; it was added in 4.2 BSD and independently in System III (as SIGCLD).

In V7, programs like the shell had fairly straightforward handling of waiting for children that was basically synchronous. When they ran something (or you ran the 'wait' shell builtin), they called wait() until either it returned an error or possibly the process ID they were looking for came up. If you wanted to interrupt this, you used ^C and the shell's signal handler for SIGINT did some magic. This was sufficient in V7 because the V7 shell didn't really need to know about changes in child status except when you asked for it.

When BSD added job control, it also made it so that background programs that tried to do terminal input or output could be automatically suspended. This is necessary when programs can move between being foreground programs and background ones; you might start a program in the foreground, background it when it takes too long to respond, and then want to foreground it again once it's ready to talk to you. However, adding this feature means that a job control shell now needs to know about process state changes asynchronously. If the shell is waiting at the shell prompt for you to enter the next command (ie, reading from the terminal itself and blocked in read()) and a background program gets suspended for terminal activity, you probably want the shell to tell you about this right away, not when you next provide a line of input to the shell. To get this immediate notification, you need a signal that's sent when child processes change their status, including when they get suspended this way. That's SIGCHLD.

Broadly, SIGCHLD enables programs to react to their children exiting even when they're doing other things too. This is useful in general, which is probably why System III added its own version even though it didn't have job control.

(Shells that use job control don't have to immediately react this way, and some don't. It may even depend on shell settings, which is sensible; some people don't like getting interrupted by shell messages when they're typing a command or thinking, and would rather tap the Return key when they want to see.)

PS: 4.2 BSD also introduced wait3(), which for the first time allowed processes to check for child process status without blocking. This is what you'd use in a shell if you're only checking for exited and suspended children right before you print the next prompt.

Written on 03 January 2020.
« The good and bad of errno in a traditional Unix environment
Three ways to expose script-created metrics in Prometheus »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Jan 3 02:46:19 2020
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.