Wandering Thoughts archives

2014-03-28

Recovering from a drive failure on Fedora 20 with LVM on software RAID

My office workstation runs on two mirrored disks. For various reasons the mirroring is split; the root filesystem, swap, and /boot are directly on software RAID while things like my home directory filesystem are on LVM on top of software RAID. Today I had one of those two disks fail when I rebooted after applying a kernel upgrade; much to my surprise this caused the entire boot process to fail.

The direct cause of the boot failure was that none of the LVM-based filesystems could be mounted. At first I thought that this was just because LVM hadn't activated, so I tried things like pvscan; much to my surprise and alarm this reported that there were no physical volumes visible at all. Eventually I noticed that the software RAID array that LVM sits on top of was being reported as inactive instead of active and that I couldn't read from the /dev entry for it.
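
(As a rough illustration of spotting this state, here is a minimal Python sketch that picks inactive arrays out of /proc/mdstat-style text. It assumes the usual 'mdN : active|inactive ...' line format; the sample text, the partition names in it, and the function names are all made up for the example and are not output from my machine. It only prints a candidate command and does not run anything.)

```python
import re

# Illustrative sample only; device names and sizes here are hypothetical.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md17 : inactive sdb4[1](S)
      976630464 blocks super 1.2

md0 : active raid1 sdb1[1]
      524224 blocks super 1.0 [2/1] [_U]

unused devices: <none>
"""

# Matches the start of a per-array line such as "md17 : inactive ...".
ARRAY_LINE = re.compile(r"^(md\S*)\s*:\s*(active|inactive)\b")


def inactive_arrays(mdstat_text):
    """Return the names of arrays that the text reports as inactive."""
    names = []
    for line in mdstat_text.splitlines():
        m = ARRAY_LINE.match(line)
        if m and m.group(2) == "inactive":
            names.append(m.group(1))
    return names


def run_commands(mdstat_text):
    """Build (but do not execute) an 'mdadm --run' command per inactive array."""
    return ["mdadm --run /dev/" + name for name in inactive_arrays(mdstat_text)]


if __name__ == "__main__":
    for cmd in run_commands(SAMPLE_MDSTAT):
        print(cmd)
```

On a real system you would feed it the contents of /proc/mdstat instead of the sample string, and then decide for yourself whether forcing the array to start is what you want.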

The direct fix was to run 'mdadm --run /dev/md17'. This activated the array (and then udev activated LVM and systemd noticed that devices were available for the missing filesystems and mounted them). This was only necessary once; after a reboot (with the failed disk still missing) the array came up fine. I was led to this by the description of --run in the mdadm manpage:

Attempt to start the array even if fewer drives were given than were present last time the array was active. Normally if not all the expected drives are found and --scan is not used, then the array will be assembled but not started. With --run an attempt will be made to start it anyway.

In theory this matched the situation; the last time the array was active it had two drives and now it only had one. The mystery here is that the exact same thing was true for the other mirrors (for /, swap, and /boot) and yet they were activated anyways despite the missing drive.

My only theory for what happened is that something exists that forces activation of mirrors that are seen as necessary for filesystems but doesn't force activation of other mirrors. This something is clearly magical and hidden and of course not working properly. Perhaps this magic lives in mount (or the internal systemd equivalent); perhaps it lives in systemd itself. It's pretty much impossible for me to tell.

(Of course since I have no idea what component is responsible I have no particularly good way to report this bug to Fedora. What am I supposed to report it against?)

(I'm writing this down partly because this may sometime happen to my home system (since it has roughly the same configuration) and if I didn't document my fix and had to reinvent it I would be very angry at myself.)

Fedora20LVMDriveRecovery written at 18:05:00

2014-03-20

Killing (almost) all processes on Linux is not recoverable

Suppose that you have at least a semi-hung system that you're taking drastic measures to get at least semi-alive again; for example, you might use Magic Sysrq's option to send a SIGTERM or SIGKILL to all processes except init ('e' or 'i', per here). If you do this, it's quite possible that your system will stagger dazedly around for a bit and then seem to come back to life. Oh, sure, maybe you need to restart a few daemons, but it can easily look like you can keep going without actually rebooting the machine. You can, right?

Based on painful experience, let me answer the question simply: no.

In practice there is no even vaguely easy way to recover a modern Linux system to full functionality after you've killed almost all processes. You can get something back that looks like it's working, but what you really have is a partial zombie. You can spend quite literally months finding things in the corners that are not working; if you're lucky, they will be not working in some noisy way and diagnosing them will be obvious. It's quite possible to not be lucky.

So if you are ever in a situation like this with Magic Sysrq or the like, reboot your system after using drastic actions to wake it up even if it seems okay afterwards. Things like Sysrq-e and Sysrq-i are for temporary diagnostics (to answer questions like 'is this hang probably because of a user-level process doing bad things'), not for cures. The cure is a reboot.

Another way to do this is an accidental 'kill -SIGNAL -1' for some signal that your init ignores. As an interesting example, it appears that systemd ignores SIGHUP so the traditional accidental 'kill -1 -1' as root might do this on a systemd system. After something like this your system may look fine, especially after you restart some daemons, but it is not. Reboot. Really. It's simpler and much less painful over the long run and you're going to wind up doing it sooner or later anyways.
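
(In 'kill -1 -1' the first -1 is the signal number, 1 being SIGHUP, and the second -1 is the 'every process I'm allowed to signal' pid. The reason an init that ignores the signal survives is ordinary signal disposition, which can be shown safely on a single throwaway child. This is a minimal Python sketch; the helper name and the child script are invented for the illustration and it never signals pid -1.)

```python
import signal
import subprocess
import sys

# Throwaway child: set the SIGHUP disposition, announce readiness, then idle.
CHILD_TEMPLATE = """
import signal, time
signal.signal(signal.SIGHUP, {disposition})
print("ready", flush=True)
time.sleep(30)
"""


def survives_sighup(ignore):
    """Start one child, send it SIGHUP, and report whether it is still alive."""
    # Set the disposition explicitly either way, since an ignored SIGHUP
    # (for example from nohup) would otherwise be inherited by the child.
    disposition = "signal.SIG_IGN" if ignore else "signal.SIG_DFL"
    code = CHILD_TEMPLATE.format(disposition=disposition)
    with subprocess.Popen([sys.executable, "-c", code],
                          stdout=subprocess.PIPE) as child:
        child.stdout.readline()           # wait until the disposition is set
        child.send_signal(signal.SIGHUP)  # signal only this one child
        try:
            child.wait(timeout=2)
            return False                  # default action: killed by SIGHUP
        except subprocess.TimeoutExpired:
            child.kill()                  # clean up the survivor
            return True


if __name__ == "__main__":
    print("ignoring SIGHUP, survives:", survives_sighup(ignore=True))
    print("default SIGHUP, survives:", survives_sighup(ignore=False))
```

Everything that doesn't ignore or handle the signal goes the way of the second child, which is most of your system.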

PS: as I found out in the same incident, immediately turn up the log level when using Magic Sysrq.

KillAllNotRecoverable written at 00:17:49

2014-03-07

Coming to terms with D-Bus

Thinking back, I probably first encountered D-Bus as a new magic part of Gnome. I didn't really give it any attention then or in fact for a long time afterwards, and when I did start to pay some attention it was with a degree of old-Unix grumpiness. From my vague brushes with it, D-Bus certainly felt like yet another overcomplicated solution to everything spawned by a big GUI desktop. Like many people I've felt a certain amount of unease as D-Bus wormed its way deeper and deeper into the heart of my systems (which it clearly has; it's even running on our servers).

I won't say that I've come to love D-Bus yet. What I have come around to is a somewhat less grumpy view of it. Put simply, D-Bus is a solution to Unix's IPC problem, namely the lack of standard higher level IPC. It seems to me that these problems tend to surface most clearly in desktops, but any program that does IPC needs to define its message formats, have some way to be found by things that want to talk to it, and possibly worry about security. The traditional Unix way has been for everything to (re)invent its own way of doing all of this but that involves a lot of duplication of effort (and makes it hard to talk to programs except with the program's own custom tools). D-Bus is an attempt to implement all of the additional levels of IPC once so that we can be done with it.

In the sense of mathematical system minimalism, D-Bus isn't necessary. More exactly, all of the things that D-Bus is used for at the system level could define their own protocols and provide their own tools. You would have one program (and a private protocol) for talking to init, another system for the hotplug manager to notify other programs about new hardware, and so on and so forth. That is in fact the old Unix way and frankly it was a mess. A theoretically minimal mess, but still a mess.
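
(To make 'a private protocol' concrete, here is a minimal Python sketch of the sort of thing every such program winds up reinventing: newline-delimited JSON over a Unix socket pair. The message fields, the 'ping' method, and all of the function names are hypothetical, invented for this illustration; this is not D-Bus and not any real daemon's protocol. Notice what it leaves out, namely any way for clients to find the server and any notion of permissions.)

```python
import json
import socket


def send_msg(sock, msg):
    """Frame one message as a single line of JSON."""
    sock.sendall(json.dumps(msg).encode("utf-8") + b"\n")


def recv_msg(reader):
    """Read one newline-terminated JSON message from a file-like reader."""
    return json.loads(reader.readline())


def handle(request):
    """The 'server' side of our made-up protocol."""
    if request.get("method") == "ping":
        return {"ok": True, "result": "pong"}
    return {"ok": False, "error": "unknown method"}


def demo():
    """Do one request/reply round trip over a connected socket pair."""
    client, server = socket.socketpair()
    with client, server:
        with client.makefile("rb") as c_reader, server.makefile("rb") as s_reader:
            send_msg(client, {"method": "ping"})
            send_msg(server, handle(recv_msg(s_reader)))
            return recv_msg(c_reader)


if __name__ == "__main__":
    print(demo())
```

Multiply this by every daemon on the system, each with its own framing, its own field names, and its own command line tool, and you have the mess.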

With D-Bus you accept a certain amount of overkill for some jobs in the interests of not reinventing wheels over and over again and instead using a common set of infrastructure. D-Bus provides all the higher levels of IPC, both 'point to point' direct communication and a message bus; all you have to do is plug programs in to it and perhaps write some configuration files to define permissions. In this sense it really is necessary and thus I've come around to accepting it even if I don't exactly love it.

(I don't think that D-Bus is necessarily Unixy but then I don't think we yet know how to build a Unixy version of the higher levels of IPC. To the extent that we have an idea of how to do a Unixy message bus, I think it requires much better support for virtual filesystems than Linux has now. And the closest we have to a Unixy common message format is JSON.)

AcceptingDBus written at 01:18:06



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.