Wandering Thoughts archives


What /proc/[pid]/stat's process state means and where it comes from

We recently updated to a version of the Prometheus host agent that can report how many of your processes are in various states. Naturally this caused me to look at our systems, where I was surprised to find that we had a bunch of processes in what the host agent called state 'I', which I had never heard of before. This gave me two questions, namely where did the host agent get this state information from, and what did it mean.

The answer to the first question is that the process state being reported comes from /proc/[pid]/stat:

1 (systemd) S [...]

The various fields in this are covered in the proc(5) manpage; the state is the third field. The manpage documents a number of possible values but doesn't include 'I', so clearly there's more.
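As a minimal sketch of pulling that third field out in Python (the function names here are made up for illustration and are not from the host agent): the one wrinkle is that the second field, the command name in parentheses, can itself contain spaces and ')' characters, so it is safer to split after the last ')' than to split the whole line on whitespace.

```python
from collections import Counter
from pathlib import Path


def parse_stat_state(stat_line: str) -> str:
    """Return the one-letter state (third field) from a /proc/[pid]/stat line."""
    # The command name may contain spaces or ')' characters, so take
    # everything after the LAST ')' and read the first field from there.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def count_states(proc_root: str = "/proc") -> Counter:
    """Tally the states of every process visible under proc_root."""
    counts = Counter()
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a PID directory
        try:
            counts[parse_stat_state((entry / "stat").read_text())] += 1
        except (OSError, ValueError, IndexError):
            # The process went away (or was unreadable) while we looked.
            continue
    return counts
```

On a Linux machine, calling count_states() should give roughly the same sort of per-state tally that a monitoring agent would report, although the exact counts change from moment to moment.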

The authoritative source for these flags is fs/proc/array.c's task_state_array, and I might as well just quote the current 5.0-rc7 version here directly because it turns out to explain everything very well:

/*
 * The task state array is a strange "bitmap" of
 * reasons to sleep. Thus "running" is zero, and
 * you can test for combinations of others with
 * simple bit tests.
 */
static const char * const task_state_array[] = {

       /* states in TASK_REPORT: */
       "R (running)",          /* 0x00 */
       "S (sleeping)",         /* 0x01 */
       "D (disk sleep)",       /* 0x02 */
       "T (stopped)",          /* 0x04 */
       "t (tracing stop)",     /* 0x08 */
       "X (dead)",             /* 0x10 */
       "Z (zombie)",           /* 0x20 */
       "P (parked)",           /* 0x40 */

       /* states beyond TASK_REPORT: */
       "I (idle)",             /* 0x80 */
};

(While /proc/[pid]/stat shows only the first letter, the full text shows in /proc/[pid]/status, which may save you a look into the kernel source if Linux someday adds additional states.)
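A small sketch of reading that fuller form, assuming the usual 'State:' line layout of /proc/[pid]/status (a letter followed by a parenthesised description); parse_status_state is a name invented for this example.

```python
from typing import Tuple


def parse_status_state(status_text: str) -> Tuple[str, str]:
    """Return (letter, description) from the State: line of /proc/[pid]/status."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            value = line.split(":", 1)[1].strip()  # e.g. "S (sleeping)"
            letter, _, rest = value.partition(" ")
            return letter, rest.strip("()")
    raise ValueError("no State: line found")
```

Feeding it the contents of /proc/self/status on a Linux machine would normally give ('R', 'running'), since the process is running while it reads its own status.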

On our machines, the processes that I see in state 'I' tend to be kernel threads (or if you prefer, kernel processes). Having looked at the kernel code, I believe it is probably only possible for kernel threads to be in this state; see the sidebar.

While proc(5) will tell you that processes can only wind up in state 'P' up to kernel 3.13, this is not quite true. Kernel threads associated with offlined CPUs will go into state 'P' on at least the Ubuntu 18.04 4.15-based kernel, and I suspect on any kernel. However, this is not likely to be a common situation.

(We have such a machine for unusual reasons.)

The usual cause for a process being in state 'T' is that it has been SIGSTOP'd (more or less), either due to shell job control suspending it or because someone is using SIGSTOP for other purposes.

One regular case of processes winding up in state 't' is that they're being run under a debugger and the debugger has stopped them (perhaps because they've hit a breakpoint). Given that strace also uses the ptrace(2) system call, I wouldn't be surprised to see strace'd processes also show up in state 't'.

As a side note, inside the kernel the actual task states are much more complicated than this. You can start to see the sausage made in include/linux/sched.h.

Sidebar: What the 'idle' state appears to mean

Based on reading sched.h and various other bits of the kernel source, an 'idle' task is a sleeping uninterruptible kernel thread that is not supposed to contribute to the load average. Normally, processes that are doing uninterruptible sleeps in the kernel contribute to the load average on Linux (although not on all Unixes). I believe that this makes 'I' a state that is currently exclusive to kernel threads and there isn't a directly exposed way of putting a user process into this state.
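A rough sketch of that rule in Python. The bit values below are assumptions recalled from include/linux/sched.h in kernels of roughly this era and should be checked against the actual source; the real kernel test also considers other things (such as frozen tasks) that are left out here.

```python
# Assumed bit values; verify against include/linux/sched.h for your kernel.
TASK_UNINTERRUPTIBLE = 0x0002
TASK_NOLOAD = 0x0400

# 'Idle' is an uninterruptible sleep that is flagged as not counting for load.
TASK_IDLE = TASK_UNINTERRUPTIBLE | TASK_NOLOAD


def contributes_to_load(state: int) -> bool:
    """Simplified: an uninterruptible sleep counts unless NOLOAD is also set."""
    return bool(state & TASK_UNINTERRUPTIBLE) and not (state & TASK_NOLOAD)
```

Under these assumptions a plain 'D' state task counts toward the load average while an 'I' state task does not, which is the whole difference between the two.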

(You really wouldn't want there to be a direct API for this, because you normally want to be able to interrupt sleeping user processes with things like SIGKILL. When the kernel says 'uninterruptible' here, it really means uninterruptible.)

linux/ProcPidStatState written at 22:46:59

The cliffs in the way of adding tests to our Django web app

Back in August of last year, I wrote that it was time for me to start adding tests to our Django web app. Since then, the number of tests I have added is zero, and in fact the amount of work that I have done on our Django web app's code is also essentially zero (partly because it hasn't needed any modifications). Part of the reason for that is that adding tests feels like make-work, even though I know perfectly well that it's not really, but another part of it is that I'm staring at two reasonably substantial cliffs in my way.

Put simply, in order to add tests that I actually want to keep, I need to learn how to write Django tests and then I need to figure out what we want to test in our Django web app (and how). Learning how to write tests means reading through the Django documentation on this, both the quick tutorial and the real documentation. Unfortunately I think that I need to read all of the documentation before I start writing any tests, and possibly even plan to throw away the first round of tests as a learning experience. Testing a Django app is not as simple as testing standalone code; there is a test database you need to construct, an internal HTTP client so that you can write end to end tests, and so on. This is complicated by the fact that by now I've forgotten a lot of my general Django knowledge and I know it, so to some extent I'm going to have to re-learn Django (and re-learn our web app's code too).

(It's possible that I can find some quick-start tests I can write more or less in isolation. There are probably some stand-alone functions that I can poke at, and perhaps even stand-alone model behavior that doesn't depend on the database having a set of interlinked base data.)
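As a sketch of what such an isolated test could look like: normalize_login below is a hypothetical stand-in, not a function from the app, and the test uses plain unittest. Django's SimpleTestCase is a subclass of unittest.TestCase, so a test of this shape should carry over once the Django-specific machinery is understood.

```python
import unittest


def normalize_login(raw: str) -> str:
    """Hypothetical stand-alone helper: trim whitespace and lowercase a login."""
    return raw.strip().lower()


class NormalizeLoginTests(unittest.TestCase):
    def test_strips_and_lowercases(self):
        self.assertEqual(normalize_login("  CKS \n"), "cks")

    def test_already_clean_is_unchanged(self):
        self.assertEqual(normalize_login("cks"), "cks")
```

Nothing here touches a database or an HTTP client, which is the point; it only exercises code that can be called directly.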

Once I sort of know how to write Django tests, I need to figure out what tests to write and how many of them. There are two general answers here that I already know; we need tests that will let us eventually move to Python 3 with some confidence that the app won't blow up, and I'd like tests that will do at least basic checks that everything is fine when we move from Django version to Django version. Tests for a Python 3 migration should probably concentrate on the points where data moves in and out of our app, following the same model I used when I thought about DWiki's Python 3 Unicode issues. Django version upgrade tests should probably start by focusing on end to end testing (eg, 'can we submit a new account request through the mock HTTP client and have it show up').
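One possible reading of 'the points where data moves in and out' is the boundary where bytes become text and text becomes bytes again. The sketch below assumes UTF-8 at both edges, and decode_form_value and encode_for_output are hypothetical names invented for the example; the real boundaries in a Django app would be wherever it reads requests, files, or command output.

```python
import unittest


def decode_form_value(raw: bytes) -> str:
    """Hypothetical input boundary: bytes from outside become text."""
    return raw.decode("utf-8")


def encode_for_output(text: str) -> bytes:
    """Hypothetical output boundary: text becomes UTF-8 bytes again."""
    return text.encode("utf-8")


class BoundaryTests(unittest.TestCase):
    def test_non_ascii_round_trip(self):
        raw = "Zoë Müller".encode("utf-8")
        text = decode_form_value(raw)
        self.assertIsInstance(text, str)
        self.assertEqual(encode_for_output(text), raw)

    def test_invalid_bytes_are_rejected(self):
        # Under Python 3 bad input fails loudly at the boundary.
        with self.assertRaises(UnicodeDecodeError):
            decode_form_value(b"\xff\xfe")
```

Tests like these are cheap to write and tend to fail in exactly the places where Python 2 code was quietly mixing bytes and text.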

All of this adds up to a significant amount of time and work to invest before we start to see real benefits from it. As a result I've kept putting it off and finding higher priority work to do (or at least more interesting work). And I'm pretty sure I need to find a substantial chunk of time in order to get anywhere with this. To put it one way, the Django testing documentation is not something that I want to try to understand in fifteen minute blocks.

PS: It turns out that our app actually has one tiny little test that I must have added years ago as a first step. It's actually surprisingly heartening to find it there and still passing.

(As before, I'm writing this partly to push myself toward doing it. We now have less than a year to the nominal end of Python 2, which is not much time with everything going on.)

Sidebar: Our database testing issue

My impression is that a decent number of Django apps can be tested with basically empty databases, perhaps putting in a few objects. Our app doesn't work that way; its operation sits on top of a bunch of interlinked data on things like who can sponsor accounts, how those accounts should be created, and so on. Without that data, the app does nothing (in fact it will probably fail spectacularly, since it assumes that various queries will always return some data). That means we need an entire set of at least minimal data in our test database in order to test anything much. So I need to learn all about that up front, more or less right away.

python/DjangoMyTestingCliffs written at 00:20:30

