Killing (almost) all processes on Linux is not recoverable

March 20, 2014

Suppose that you have an at least semi-hung system that you're taking drastic measures to get at least semi-alive again; for example, you might use Magic Sysrq's options to send a SIGTERM or SIGKILL to all processes except init (the 'e' and 'i' commands respectively). If you do this, it's quite possible that your system will stagger dazedly around for a bit and then seem to come back to life. Oh, sure, maybe you need to restart a few daemons, but it can easily look like you can keep going without actually rebooting the machine. You can, right?

Based on painful experience, let me answer the question simply: no.

In practice there is no even vaguely easy way to recover a modern Linux system to full functionality after you've killed almost all processes. You can get something back that looks like it's working, but what you really have is a partial zombie. You can spend quite literally months finding things in the corners that are not working; if you're lucky, they will be failing in some noisy way and diagnosing them will be obvious. It's quite possible to not be lucky.

So if you are ever in a situation like this with Magic Sysrq or the like, reboot your system after using drastic actions to wake it up, even if it seems okay afterwards. Things like Sysrq-e and Sysrq-i are for temporary diagnostics (to answer questions like 'is this hang probably because of a user-level process doing bad things'), not for cures. The cure is a reboot.
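
(As an aside, these Sysrq actions don't have to come from the keyboard; if you can still get a root shell, they can be triggered through /proc/sysrq-trigger. A minimal sketch, assuming the kernel was built with Magic Sysrq support:)

    # 'e' sends SIGTERM to all processes except init;
    # 'i' sends SIGKILL to all processes except init.
    echo e > /proc/sysrq-trigger
    echo i > /proc/sysrq-trigger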

Another way to get into this state is an accidental 'kill -SIGNAL -1' for some signal that your init ignores. As an interesting example, it appears that systemd ignores SIGHUP, so the traditional accidental 'kill -1 -1' as root might do this on a systemd system. After something like this your system may look fine, especially after you restart some daemons, but it is not. Reboot. Really. It's simpler and much less painful over the long run, and you're going to wind up doing it sooner or later anyways.
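
(To see why this accident is so easy, remember that a pid of -1 tells kill to signal every process you have permission to signal, which as root is essentially everything except init. A sketch of the classic slip:)

    # The classic slip: '-1' as a signal next to '-1' as a pid.
    kill -1 1234    # intended: SIGHUP to pid 1234
    kill -1 -1      # one typo later: SIGHUP to almost everything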

PS: as I found out in the same incident, you should immediately turn up the console log level when using Magic Sysrq.
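
(The Sysrq digit commands set the console log level directly, so a sketch of doing this from a shell, with dmesg -n as the more conventional route to the same place:)

    # Turn console logging all the way up before poking at the system.
    echo 9 > /proc/sysrq-trigger   # Sysrq '0' through '9' set the log level
    dmesg -n 8                     # much the same effect via dmesg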


Comments on this page:

By himdel at 2014-03-20 18:09:07:

I usually like your articles, but this one is completely useless because it doesn't really cite any examples (except for the NFS thing on Twitter, and that one is not really saying anything) or give any reasons.

So, I for one call bullshit, until you tell me something specific. And what does half-zombie even mean?

Essentially, IMHO, the system after killing all processes and going through /etc/rc?.d to start everything again is exactly the same as if you didn't kill anything at all. Care to enlighten me?

By cks at 2014-03-20 18:55:18:

The specific killer case we ran into in the end was getting nlockmgr (NFS lock management) re-registered with the portmapper after the latter was killed and restarted. You cannot kill and restart the kernel process that apparently handles this (I believe it's 'lockd'); it's apparently supposed to exit if you unmount all NFS filesystems, but for us that is effectively equivalent to rebooting the system, because all of our users have NFS home directories and so unmounting all NFS filesystems means killing all user processes.
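
(If you want to check for this particular failure, one quick test is whether nlockmgr still shows up in the portmapper's registrations; something like:)

    # If nlockmgr is missing here, NFS locking is broken even
    # though everything else may look normal.
    rpcinfo -p | grep nlockmgr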

Beyond that, after the initial forced kill Upstart restarted some but not all daemons, so we spent some time tripping over daemons that weren't running. Finding all of these is not trivial on a modern Ubuntu system because you have to pick through not just /etc/rc?.d but also /etc/init (and not everything in /etc/init needs restarting, and so on and so forth).
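
(A rough sketch of the kind of survey this takes on such a system; neither view is complete on its own, which is part of the problem:)

    # Upstart jobs that aren't in the normal running state:
    initctl list | grep -v 'start/running'
    # What the SysV init scripts themselves think is going on:
    service --status-all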

Overall the system state on the Ubuntu server after the forced kill was very misleading. A lot of basic services were running (eg you could ssh in or log in on the console and everything looked normal) but all sorts of less obvious things were missing (and I believe that at least some of them didn't restart with just an '/etc/init.d/<whatever> start', perhaps because their 'am I running already' checking wasn't working quite right).
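
(For illustration, the usual 'am I running already' test in an init script is a pidfile check along these lines; 'mydaemon' here is a made-up name. A stale pidfile whose pid has since been recycled by some unrelated process will make this wrongly conclude the daemon is still up:)

    # Hypothetical init script fragment; 'mydaemon' is made up.
    pid=$(cat /var/run/mydaemon.pid 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "mydaemon already running"
        exit 0
    fi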

By himdel at 2014-03-22 04:52:45:

OK, thanks for the answer, that explains it.

NFS is a bitch, and if I had to pick only one thing that bothers me about Linux, it's the handling of unavailable filesystems. (NFS feels like the worst offender to me, but even unmounting an ext3 filesystem after the disk has gone away is a pain when there's a process waiting to read from it.) These days I tend to use mainly sshfs for my network filesystem needs, because it's a FUSE filesystem and so it's easy to kill the filesystem process if necessary. But I definitely wouldn't want to boot from it, and probably even having a home directory on it would be painful; it seems to have a problem with concurrent accesses to multiple files.

And I confess I don't know much about Upstart; my sysadmin days are mostly over for now, and I've always stuck to sysvinit, where running `/etc/init.d/rc 2 stop; /etc/init.d/rc 2 start` fixed everything.
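
(Roughly speaking, that boils down to running the runlevel's kill scripts and then its start scripts in order; a sketch for runlevel 2, not battle-tested:)

    # Roughly what entering runlevel 2 does: K* scripts stop, S* start.
    for s in /etc/rc2.d/K*; do "$s" stop; done
    for s in /etc/rc2.d/S*; do "$s" start; done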

Basically my overall experience was that you could recover pretty much anything, but if you used NFS you were probably screwed :).
