Wandering Thoughts archives

2023-07-02

The evolving Unix attitudes on handling signals in your code

Once upon a time, back in V7 Unix or so, Unix signal handling in programs was nominally very simple. You'd set a signal handler with signal(2), and then when it was invoked it would do things, possibly including using longjmp(3) to pop back to the top level of your program. Among other examples, the Bourne shell famously used SIGSEGV as a memory allocation method. Even today, a lot of programs behave as if this is the signal handling model in effect; you can interrupt your shell, or your pager, or various other things with a Ctrl-C and they'll act like this, popping back to a top level or cleanly stopping their current action while still operating in general (instead of just exiting the way simpler programs do).
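As a rough sketch of this 'pop back to the top level' style, here is a small C example. It swaps in sigaction(2) and sigsetjmp()/siglongjmp() for the V7-era signal(2) and longjmp(3), and it uses raise() to stand in for a Ctrl-C; the function and variable names are made up for illustration. It only gets away with jumping out of the handler because the interrupted code is trivially simple.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical 'top level' jump target for the whole program. */
static sigjmp_buf top_level;

/* The handler abandons whatever was running and pops back to the top. */
static void on_interrupt(int sig)
{
    (void)sig;
    siglongjmp(top_level, 1);
}

/* Returns how many times control popped back to the top level (2 here),
 * or -1 if something went wrong. */
int demo_pop_to_top_level(void)
{
    struct sigaction sa;
    volatile int interrupts = 0;  /* volatile: modified between setjmp and longjmp */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_interrupt;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;

    /* savemask = 1, so siglongjmp() restores the signal mask and SIGINT
     * is not left blocked after we jump out of the handler. */
    if (sigsetjmp(top_level, 1) != 0)
        interrupts++;             /* we got here from the handler */

    if (interrupts < 2) {
        raise(SIGINT);            /* stands in for Ctrl-C during a 'command' */
        return -1;                /* not reached: the handler jumps away */
    }
    return interrupts;
}
```

Calling demo_pop_to_top_level() from main() should return 2. In a real program the interrupted code could be in the middle of anything, which is where the trouble starts.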

In reality, even in V7 signal handling could be chancy. The problem with handling signals in the V7 way is that they're interrupts, which means that they can happen at arbitrary points in program execution. If a signal arrives at the wrong point, it will interrupt program execution halfway through doing something that wasn't designed to be interrupted, for example halfway through doing a malloc() or free(), and then various havoc can ensue. In V7 I think there weren't all that many critical points like this (in the C library or in programs), and in general the possibility was mostly ignored outside of a few programs that took care to block signals around their critical operations. If something went wrong, the person using the program would deal with it.
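To illustrate what 'blocking signals around critical operations' looks like, here is a minimal sketch using the modern POSIX sigprocmask(2) interface (V7 itself didn't have this; programs there would typically ignore the signal for a while instead). The names are hypothetical and raise() simulates a badly timed Ctrl-C.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int sig)
{
    (void)sig;
    got_sigint = 1;
}

/* Returns 1 if SIGINT was held back during the critical section and
 * delivered right after it, 0 if not, and -1 on error. */
int demo_blocked_critical_section(void)
{
    struct sigaction sa;
    sigset_t block, saved;
    int seen_inside;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, &saved) == -1)
        return -1;

    /* Critical section: the handler cannot run in here. */
    raise(SIGINT);                /* simulated Ctrl-C at a bad moment */
    seen_inside = got_sigint;     /* still 0, the signal is only pending */

    /* Restoring the old mask lets the pending SIGINT be delivered. */
    if (sigprocmask(SIG_SETMASK, &saved, NULL) == -1)
        return -1;

    return seen_inside == 0 && got_sigint == 1;
}
```

This assumes a single-threaded program where SIGINT wasn't already blocked; threaded programs would use pthread_sigmask() instead.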

(This was in general the V7 way; it was a simple operating system so it had simple implementations that often punted on harder problems.)

To simplify the story, as Unix grew both programs and the C library became more complex, with more complex internal operations going on, and people became less tolerant of flaky programs than they might have been in a simple research operating system. Eventually people began to think about threads, and also about standardizing what signal handlers could legally do as part of POSIX. This resulted in the current situation where POSIX signal handlers are very constrained in what they can legally do, especially in threaded programs. Roughly speaking, you can call a limited set of C library functions (the ones POSIX lists as async-signal-safe, which are primarily for interacting with the operating system), or set a flag, and that's about it. A particular Unix may go beyond the POSIX requirements to make other things safe in signal handlers, and programs may break these requirements and still get away with it most of the time, but today there isn't much you can safely do in a signal handler within the C API.
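A minimal sketch of a handler that stays within these limits might look like the following. It sets a flag of type volatile sig_atomic_t and calls write(2), which is on the POSIX async-signal-safe list; everything else happens in ordinary code that polls the flag. The function names, the message, and the simulate_at parameter (which uses raise() to fake a signal arriving) are all made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* The only state the handler touches. */
static volatile sig_atomic_t stop_requested = 0;

static void on_stop_signal(int sig)
{
    static const char msg[] = "stopping soon\n";
    int saved_errno = errno;      /* write() may clobber errno */

    (void)sig;
    stop_requested = 1;
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    errno = saved_errno;
}

/* Install the handler for the given signal; 0 on success, -1 on error. */
int install_stop_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_stop_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Do up to max_steps units of 'work', checking the flag between units.
 * After unit number simulate_at, raise signo to simulate the signal.
 * Returns the number of units completed. */
long run_until_stopped(long max_steps, long simulate_at, int signo)
{
    long done = 0;

    while (done < max_steps && !stop_requested) {
        done++;                   /* one unit of real work would go here */
        if (done == simulate_at)
            raise(signo);
    }
    return done;
}
```

With the handler installed for SIGUSR1, run_until_stopped(10, 3, SIGUSR1) stops after the third unit of work. The cost of this approach is that nothing reacts to the signal until the program next looks at the flag.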

(A non-C language on Unix may or may not have to restrict itself to the C API behavior in its signal handling, depending on how much it relies on the C library.)

With effort it's still possible to write reliable Unix programs that handle signals and behave as people expect them to. But it's not trivial, and in particular it's not trivial to present an API to programs so that they can handle signals as if they were on V7, with their 'signal handlers' free to do pretty much anything and make broad transfers of control without restriction. For a start, if you offer this API to programs, their signal handlers can't be real signal handlers and by extension you need a runtime to catch the actual signals, set status flags, and then invoke the 'signal handlers' outside of the actual Unix signal delivery.
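As a rough illustration of that sort of runtime, here is a toy version in C. The real signal handler only records that a signal arrived; the program's 'signal handlers' are ordinary functions that a dispatcher calls later from normal code, where they can do whatever they like. Every name here (rt_signal, rt_run_pending, RT_MAXSIG, and so on) is invented for this sketch and isn't any real runtime's API.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

#define RT_MAXSIG 32              /* hypothetical limit for this sketch */

typedef void (*soft_handler)(int);

static volatile sig_atomic_t pending[RT_MAXSIG];
static soft_handler soft_handlers[RT_MAXSIG];

/* The only real signal handler: note the signal and return. */
static void real_handler(int sig)
{
    if (sig > 0 && sig < RT_MAXSIG)
        pending[sig] = 1;
}

/* Register a 'signal handler' that will run outside signal delivery. */
int rt_signal(int sig, soft_handler fn)
{
    struct sigaction sa;

    if (sig <= 0 || sig >= RT_MAXSIG)
        return -1;
    soft_handlers[sig] = fn;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = real_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Called by the runtime at safe points; returns how many handlers ran. */
int rt_run_pending(void)
{
    int sig, ran = 0;

    for (sig = 1; sig < RT_MAXSIG; sig++) {
        if (!pending[sig])
            continue;
        pending[sig] = 0;         /* clear first so a new arrival isn't lost */
        if (soft_handlers[sig] != NULL) {
            soft_handlers[sig](sig);
            ran++;
        }
    }
    return ran;
}

static int demo_calls = 0;

/* An ordinary function, so it could call malloc(), printf(), and so on. */
static void demo_soft_handler(int sig)
{
    (void)sig;
    demo_calls++;
}

/* Returns 1 if the soft handler ran only at the safe point, else 0 or -1. */
int rt_demo(void)
{
    int ran_early;

    if (rt_signal(SIGUSR1, demo_soft_handler) == -1)
        return -1;
    raise(SIGUSR1);               /* the real handler just sets pending[] */
    ran_early = demo_calls;       /* still 0 */
    return rt_run_pending() == 1 && ran_early == 0 && demo_calls == 1;
}
```

The hard part that this sketch skips is deciding where the safe points are and making sure the program reaches one promptly, including when it is blocked in a system call.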

(This is how (C)Python handles signals, for example. I believe that Go on Linux handles signals outside of the C API, and as part of that handles locking and coordination on its own.)

PS: The POSIX signal handler requirements are also only a promise about C (POSIX) API functions, not about what functions in your own program may or may not be safe to call from your signal handlers. If you manipulate data structures or have internal locking in your program, or in libraries that you call, interacting with things safely from within a signal handler is your own responsibility. POSIX makes no promises.

PPS: I'm not sure if restrictions on what signal handlers should do were ever written down before POSIX. The 4.3 BSD sigvec(2) and signal(3) manual pages don't contain any cautions, for example.

Sidebar: threads and signals

Once you introduce threads, many operations in the C library may start requiring locks. Once you have operations taking locks, it becomes quite dangerous to call back into anything related to those locks in a signal handler. If you get a signal at the wrong time, the signal handler will run in a thread that already holds a lock, attempt to obtain that lock again, and then probably deadlock. Introducing threads to your C library model forces you to think about locks, deadlocks, and preventing them, and now you can't hand-wave signal safety any more.

(Not that you ever could, but threads make it basically impossible to think you can get away with it, because the failure modes are so obvious.)

PS: I don't know how this interacts with POSIX sigsetjmp() and siglongjmp(), since siglongjmp() is listed as one of the POSIX functions that's safe to call in a signal handler.

unix/SignalHandlingOverTime written at 23:17:45;

