The problem with PID files

April 6, 2008

The problem with PID files is that while you can use a PID to verify that your daemon isn't running, you can't use it alone to verify that your daemon is in fact present. You have to use other mechanisms, like trying to communicate with it, to verify that whatever has the PID actually is your daemon, instead of something that has inherited the PID.
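To make the asymmetry concrete, here is a minimal sketch of the usual existence check, assuming a POSIX system. The helper name `pid_is_alive` is made up for illustration. A `False` result is conclusive; a `True` result only says that something has that PID.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if some process currently has this PID."""
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permission
    except ProcessLookupError:
        return False  # nothing has this PID: the daemon is definitely gone
    except PermissionError:
        return True  # something has this PID, but it belongs to another user
    return True  # something has this PID; it may or may not be the daemon
```

Note that the `PermissionError` case is already a strong hint that the PID belongs to something other than your daemon, if your daemon runs as you.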

Most of the time the risk that people think about with this is PID rollover: your daemon dies, the system churns through tens of thousands of processes, and something else has reused the daemon's old PID by the time you get around to trying to restart it. But PID rollover (as rare as it is in practice) is not the only problem.

Consider a daemon that is started relatively early in the system's life (at boot time or shortly afterwards), and that stores its PID file in a non-volatile location, one that isn't scrubbed at boot time. PIDs are typically handed out in sequence starting over at each boot, so things that start early land in much the same small range of PIDs every time. Sooner or later one of the other things that gets started early on is going to get the PID that your daemon was using last time around, and then things go to heck.

(This is exactly the problem I have with pulseaudio on my home machine. My work machine gets rebooted much less frequently, which both lowers the chance that it will happen and increases the chance that the stuff that removes old unused files from /tmp will have deleted the PID file by the time I reboot.)
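One way to catch the reboot case is to notice that a PID file last written before the current boot cannot describe a running process. The following is only a sketch under assumptions: it is Linux-specific (it reads the `btime` line from `/proc/stat`), it trusts the file's modification time and the system clock, and the function names are hypothetical.

```python
import os


def boot_time() -> float:
    """Return the boot time in seconds since the epoch, from Linux /proc/stat."""
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("btime "):
                return float(line.split()[1])
    raise RuntimeError("no btime line in /proc/stat")


def pidfile_predates_boot(path: str) -> bool:
    """Return True if the PID file was last modified before the current boot."""
    return os.stat(path).st_mtime < boot_time()
```

A PID file for which this returns `True` is stale and can be ignored or removed. A `False` result proves nothing on its own, so the usual identity checks still apply.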

Also, please do not use your PID file to send signals to your daemon before first verifying that it actually is your daemon. If the process is not your daemon, there is no guarantee that it will be at all happy to receive a SIGUSR1 or whatever out of the blue.
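As a rough sketch of checking before signalling, the following compares the executable behind the PID against the one you expect before sending anything. It assumes Linux (`/proc/<pid>/exe`) and that you know your daemon's executable path; `exe_of` and `signal_if_daemon` are made-up names, not part of any library.

```python
import os
import signal


def exe_of(pid: int):
    """Return the executable path of pid via Linux /proc, or None if unreadable."""
    try:
        return os.readlink(f"/proc/{pid}/exe")
    except OSError:
        return None


def signal_if_daemon(pid: int, expected_exe: str, sig: int = signal.SIGUSR1) -> bool:
    """Send sig only if pid is running expected_exe. Return True if it was sent."""
    if pid <= 0 or exe_of(pid) != expected_exe:
        return False  # not provably our daemon, so leave it alone
    try:
        os.kill(pid, sig)
    except OSError:
        return False  # the process vanished or we are not allowed to signal it
    return True
```

This narrows the window but does not close it: the process can still exit and the PID can be reused between the check and the `os.kill()`. It also refuses to signal a daemon whose binary has been replaced on disk, since the link then reads as the old path with ` (deleted)` appended. Erring toward not signalling seems like the right failure mode here.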


Last modified: Sun Apr 6 22:22:40 2008