Wandering Thoughts archives

2009-05-22

How CPython handles (and delays) Unix signals

CPython handles Unix signals somewhat oddly, or at least not in what you might think of as a standard Unix way. I've covered part of this before in SignalProblem and made side mentions in other entries, but I want to write all of this down in one place (even if only as an index).

First, none of this applies if you set a signal's handler to SIG_IGN or SIG_DFL. Those get standard Unix semantics, because CPython just sets the signal to either of those at the C level.
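As a small illustration of that first case, here is a minimal sketch, assuming Python 3 on a Unix system. The function name `ignore_then_restore` and the choice of SIGUSR1 as a harmless signal are mine, purely for illustration.

```python
import os
import signal


def ignore_then_restore(signum=signal.SIGUSR1):
    """Ignore signum at the C level, send it to ourselves, then restore things."""
    previous = signal.signal(signum, signal.SIG_IGN)
    try:
        # The kernel simply discards the signal; no Python-level handler
        # is involved, so there is nothing for the interpreter to delay.
        os.kill(os.getpid(), signum)
        return signal.getsignal(signum)
    finally:
        # signal.signal() reports None for handlers that were not installed
        # from Python; fall back to the default disposition in that case.
        signal.signal(signum, signal.SIG_DFL if previous is None else previous)
```

Calling `ignore_then_restore()` should return `signal.SIG_IGN`, and the process survives a signal whose default action would otherwise terminate it.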

Otherwise, the important thing to know is that Python signal handler functions are not real signal handlers; they are just functions that the bytecode interpreter calls (at some point) in response to CPython receiving a signal. Real signal handlers are called immediately when the process gets a signal and they generally won't get re-entered if the process gets the same signal again while they're running. CPython 'signal handlers' have no reentrancy protection and their execution can be delayed (sometimes for quite a while) from when the signal was sent.

(Technically a Python signal handler can be any callable object. I'm (mis)using 'function' for brevity.)
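To make the lack of reentrancy protection concrete, here is a minimal sketch, assuming Python 3 on Unix. The name `demonstrate_reentry`, the use of SIGUSR1 and the busy-wait timeouts are my own choices for illustration. The handler sends itself the same signal while it is still running and records how deeply it ended up nested.

```python
import os
import signal
import time


def demonstrate_reentry(signum=signal.SIGUSR1, timeout=2.0):
    """Return the deepest nesting level the Python-level handler reached."""
    depth = 0
    max_depth = 0

    def handler(sig, frame):
        nonlocal depth, max_depth
        depth += 1
        max_depth = max(max_depth, depth)
        if depth == 1:
            # Send the same signal again while this handler is still running.
            os.kill(os.getpid(), sig)
            deadline = time.monotonic() + timeout
            # Executing bytecode gives the interpreter the chance to notice
            # the new signal and call the handler again, inside this call.
            while max_depth < 2 and time.monotonic() < deadline:
                pass
        depth -= 1

    previous = signal.signal(signum, handler)
    try:
        os.kill(os.getpid(), signum)
        deadline = time.monotonic() + timeout
        while max_depth == 0 and time.monotonic() < deadline:
            pass
    finally:
        # Restore whatever was there before (None means "not set from Python").
        signal.signal(signum, signal.SIG_DFL if previous is None else previous)
    return max_depth
```

If the handler were protected the way a real Unix signal handler usually is, the result would be 1. On the CPython versions I'd expect this to run on, it comes back as 2.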

Mechanically, the only thing CPython does in the real signal handler is note down some information about the signal and set a flag for the bytecode interpreter (the actual code has more levels of indirection than this description). The interpreter checks the flag after most bytecode instructions (there are a few special 'fast' instructions that go directly to the next instruction, bypassing various checks); if the flag is set and this is the main thread, the interpreter then invokes the Python-level signal handler (including the 'raise an exception' SIGINT handler).
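The mechanism can be modelled in a few lines of Python. This is only a toy sketch of the idea, not CPython's actual code: `ToyInterpreter`, `c_level_handler` and `demo` are hypothetical names, 'instructions' are plain callables, and the main-thread check is left out entirely.

```python
class ToyInterpreter:
    """A toy model of deferred signal handling; NOT CPython's real code."""

    def __init__(self):
        self.handlers = {}     # signal number -> Python-level callable
        self.pending = []      # signal numbers noted by the "C" handler
        self.tripped = False   # the flag the instruction loop checks
        self.log = []

    def c_level_handler(self, signum):
        # All the "real" handler does: note the signal and set the flag.
        self.pending.append(signum)
        self.tripped = True

    def run(self, instructions):
        for instruction in instructions:
            instruction(self)
            # The flag is only looked at between instructions.
            if self.tripped:
                self.tripped = False
                pending, self.pending = self.pending, []
                for signum in pending:
                    self.handlers[signum](signum)


def demo():
    vm = ToyInterpreter()
    vm.handlers[10] = lambda signum: vm.log.append(f"handler({signum})")

    def slow_c_call(interp):
        interp.log.append("C call starts")
        interp.c_level_handler(10)  # the signal arrives in mid-call
        interp.log.append("C call returns")

    vm.run([
        lambda interp: interp.log.append("op 1"),
        slow_c_call,
        lambda interp: interp.log.append("op 2"),
    ])
    return vm.log
```

In the log that `demo()` returns, the `handler(10)` entry comes after `C call returns` and before `op 2`: the signal was noted in the middle of the slow instruction, but the Python-level handler only ran once that instruction had finished.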

So, at least two things will delay your signal handlers, possibly for quite a while:

  • your code is in a C-level module waiting for a result (in a way that won't get interrupted by a signal). The one that I frequently run into is socket.gethostbyname(), as it waits for DNS timeouts (okay, technically it's waiting for DNS answers, but it's not going to get them).

  • your main thread is waiting to be poked by another thread. Judging from the code, the thread locking and synchronization primitives aren't aborted by signals, although some of the higher level operations are implemented in Python and so are semi-interruptible.

(Until I started looking at this, I had not really noticed the thread case.)
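The first kind of delay can be measured without involving DNS. The following is a sketch under stated assumptions: Python 3 on Unix, with `sum()` over a large `range()` standing in for a long C-level call, on the assumption that CPython's `sum()` does not check for signals while it loops (this holds for the versions I know of, but it is not a documented guarantee). The name `measure_deferred_handler` and its parameters are hypothetical.

```python
import signal
import time


def measure_deferred_handler(n=30_000_000, timer_seconds=0.05):
    """Return (seconds until the handler ran, seconds the C-level call took)."""
    fired_at = []

    def on_alarm(signum, frame):
        fired_at.append(time.monotonic())

    previous = signal.signal(signal.SIGALRM, on_alarm)
    try:
        start = time.monotonic()
        # Ask for SIGALRM shortly after the long call has begun.
        signal.setitimer(signal.ITIMER_REAL, timer_seconds)
        sum(range(n))  # one long C-level call (assumed not to check signals)
        finished = time.monotonic()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer if unused
        signal.signal(
            signal.SIGALRM, signal.SIG_DFL if previous is None else previous
        )
    handler_delay = fired_at[0] - start if fired_at else None
    return handler_delay, finished - start
```

If the assumption holds, the first number returned is close to the second one (however long the `sum()` took) rather than close to `timer_seconds`, which is the delay described above in miniature.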

While making Python signal handlers not run from inside the actual signal handlers sounds limiting, it is basically the only choice that CPython has; there is very little that you can safely do in a signal handler besides carefully set some flag variables, because not even the C library itself is arbitrarily reentrant.

Pretty much all of this is mentioned, although not in large flaming letters, at the start of the signal module documentation.

python/CPythonSignals written at 02:33:29


