Checking to see if a process is alive (on Linux)

December 2, 2018

For a long time I've used the traditional Unix way of checking to see if a given process is (still) alive, which is sending it the special signal 0 with 'kill -0 <PID>'. If you're root, this only fails if the process doesn't exist; if you're not root, this can also fail because you lack the required permissions, and sorting that case out is up to you.

(For the kill command, you'll need to scan the error message. If you can directly use the system call, you want to check for the difference between an EPERM and an ESRCH error.)
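As a rough sketch of the error-scanning approach, something like the following could work in a POSIX shell. The function name pid_state and the state words it prints are made up for illustration, and the matched message text assumes the usual C-locale error strings ("No such process", "Operation not permitted"), which is why LC_ALL=C is set.

```shell
# pid_state PID: print "alive", "alive-noperm", "gone", or "unknown".
# Hypothetical helper; it relies on the C-locale error text of kill.
pid_state() {
    # kill -0 sends no signal; it only performs the existence and
    # permission checks. Capture stderr so we can inspect it on failure.
    if err=$(LC_ALL=C kill -0 "$1" 2>&1); then
        echo alive
        return 0
    fi
    case $err in
        *"Operation not permitted"*) echo alive-noperm ;;  # EPERM: exists, not ours
        *"No such process"*)         echo gone ;;          # ESRCH: no such PID
        *)                           echo unknown ;;       # e.g. a malformed argument
    esac
}
```

For example, 'pid_state $$' should report the current shell as alive. Treating anything unrecognized as "unknown" rather than "gone" is a deliberate choice here, so that a typo in the PID is not mistaken for a dead process.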

This is an okay method, but it has various drawbacks in shell scripts (even when you're root). Today it struck me that there is an alternative on Linux: you can just check whether /proc/<PID> exists. In a shell script this is potentially a lot more convenient, because it's very simple:

if [ -e "/proc/$PID" ]; then

It's easy to invert, too, so that you take action when the PID doesn't exist (just use '! -e /proc/$PID').
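To make the inverted check concrete, here is a minimal sketch of polling until a PID goes away. The names proc_exists and wait_for_exit, the one-second poll interval, and the default of 10 tries are all assumptions for illustration; the sketch also assumes a Linux-style /proc is mounted.

```shell
# proc_exists PID: succeed if /proc/PID is present.
# Rejects empty or non-numeric arguments, since "/proc/" itself always exists.
proc_exists() {
    case $1 in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ -e "/proc/$1" ]
}

# wait_for_exit PID [TRIES]: poll about once per second until /proc/PID
# disappears. Returns 0 once it is gone, 1 if it is still there after TRIES checks.
wait_for_exit() {
    pid=$1
    tries=${2:-10}
    while proc_exists "$pid"; do
        tries=$((tries - 1))
        [ "$tries" -le 0 ] && return 1
        sleep 1
    done
    return 0
}
```

A script might then use it as 'wait_for_exit "$PID" 30 || echo "still running"'. Note that in this sketch an invalid PID argument counts as "not present", so wait_for_exit returns immediately for it.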

I was going to say that this had a difference from the kill case that might be either an advantage or a fatal drawback, but then I decided to test Linux's behavior and I got a surprise (which maybe shouldn't have been a surprise). Linux threads within a process have their own PIDs, which I knew, and these PIDs also show up in /proc, which I hadn't known. Well, they sort of show up.

Specifically, the /proc/<PID> directories for threads are present in /proc if you directly access them, for example by doing '[ -e /proc/NNNN ]'. However, they are not visible if you just get a directory listing of /proc (including with such things as 'echo *' in your shell); in a directory listing, only full processes are visible. This is one way of telling whether you have a process or a thread. Another way is that a thread's /proc/<PID>/status file has a different Tgid than its Pid (for a discussion of this, see the manual page for proc(5)).
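The Tgid comparison can be sketched in shell roughly as follows. The function name is_main_task is hypothetical; the sketch assumes the "Tgid:" and "Pid:" lines of /proc/<PID>/status as described in proc(5), and that awk is available.

```shell
# is_main_task ID: succeed if ID is the main task of a process, i.e. its
# Tgid equals its Pid in /proc/ID/status. Returns 2 if the status file
# cannot be read (no such task, or no permission to see it).
is_main_task() {
    status="/proc/$1/status"
    [ -r "$status" ] || return 2
    # Each line looks like "Tgid:<TAB>1234"; pick out the second field.
    tgid=$(awk '$1 == "Tgid:" { print $2 }' "$status")
    pid=$(awk '$1 == "Pid:" { print $2 }' "$status")
    [ -n "$pid" ] && [ "$tgid" = "$pid" ]
}
```

For instance, 'is_main_task $$' should succeed for the running shell, while an ID that belongs to a secondary thread would have a Tgid pointing at its parent process and so fail the comparison.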

(Whether excluding threads is a feature or a serious limitation depends on your use case. If you know that the PID you're checking should be a main process PID, not a thread, then only seeing full processes will help you avoid false positives from things like PID rollover. As I've encountered, rapid PID rollover can definitely happen to you in default Linux configurations.)

PS: FreeBSD and Illumos (and so OmniOS and other Illumos derivatives) also have a /proc with PIDs visible in it, so this approach is at least somewhat portable. OpenBSD doesn't have a /proc (Wikipedia says it was dropped in 5.7), and I haven't looked at NetBSD or DragonFly BSD (I don't have either handy the way I have the others).

Comments on this page:

By Ben Hutchings at 2018-12-08 22:00:23:

You're not really distinguishing between threads and processes, but whether a thread is the first thread of a process (which has task ID equal to the process ID).

Using procfs to check for the existence of a task or process also isn't going to be generally reliable because the "hidepid" mount option can be used to make other users' process entries unreadable or invisible.

Written on 02 December 2018.

Last modified: Sun Dec 2 02:55:45 2018