2008-04-06
The problem with PID files
The problem with PID files is that while you can use a PID to verify that your daemon isn't running, you can't use it alone to verify that your daemon is in fact present. You have to use other mechanisms, like trying to communicate with it, to verify that whatever has the PID actually is your daemon, instead of something that has inherited the PID.
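To make the asymmetry concrete, here is a minimal Python sketch, assuming a Unix-like system. The helper names read_pid_file and pid_is_in_use are made up for illustration. Sending signal 0 with kill() only performs the existence and permission checks, so a False answer is trustworthy while a True answer only means that something has that PID.

```python
import os


def read_pid_file(path):
    """Return the PID stored in the file at path, or None if it is missing or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_in_use(pid):
    """Return True if some process currently has this PID.

    This says nothing about which program that process is running.
    """
    try:
        os.kill(pid, 0)  # signal 0: check only, nothing is delivered
    except ProcessLookupError:
        return False  # no such process, so the daemon is definitely not running
    except PermissionError:
        return True  # a process exists, but it belongs to another user
    return True
```

If pid_is_in_use() returns False, the daemon is gone. If it returns True, you still have to ask the process itself, for example over the daemon's usual control socket, before believing that it is your daemon.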
Most of the time the risk that people think about with this is PID rollover: your daemon dies, the system churns through tens of thousands of processes, and something else has reused the daemon's old PID by the time you get around to trying to restart it. But PID rollover (as rare as it is in practice) is not the only problem.
Consider a daemon that is started relatively early in the system's life (at boot time or shortly afterwards), and that stores its PID file in a non-volatile location, one that isn't scrubbed at boot time. Sooner or later one of the other things that get started early on is going to get the PID that your daemon was using last time around. The stale PID file then points at a live process that isn't your daemon, and things go to heck.
(This is exactly the problem I have with pulseaudio on my home machine. My work machine gets rebooted much less frequently, which both lowers the chance that it will happen and increases the chance that the stuff that removes old unused files from /tmp will have deleted the PID file by the time I reboot.)
Also, please do not use your PID file to send signals to your daemon before first verifying that it actually is your daemon. If the process is not your daemon, there is no guarantee that it will be at all happy to receive a SIGUSR1 or whatever out of the blue.
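One way to do that verification is sketched below in Python. It is Linux-specific because it reads the /proc/<pid>/exe symlink. The names process_exe and signal_daemon, and the idea of comparing against an expected executable path, are assumptions made for illustration. A check that actually talks to the daemon is stronger.

```python
import os
import signal


def process_exe(pid):
    """Return the executable path of pid from Linux's /proc, or None if unavailable."""
    try:
        return os.readlink(f"/proc/{pid}/exe")
    except OSError:
        return None


def signal_daemon(pid, expected_exe, sig=signal.SIGUSR1):
    """Send sig to pid only if it is running expected_exe; return True if sent."""
    if process_exe(pid) != expected_exe:
        return False  # not our daemon (or we can't tell), so leave it alone
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False  # the process exited between the check and the signal
    return True
```

This is not airtight. There is still a small window between the check and the kill() in which the PID could change hands, and if the daemon's binary has been replaced on disk the symlink target gains a " (deleted)" suffix and the comparison fails. It does keep a stale PID file from causing a signal to be sent to an unrelated program.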