2010-07-14
Making the Linux kernel shut up about segfaulting user programs
Relatively modern versions of the Linux kernel on some architectures
default to logging a kernel message about any process that receives
various unhandled signals. The exact details depend on the architecture,
but I think it is common for the architectures to log messages about
unhandled SIGSEGVs, ie segmentation faults.
(Most architectures are capable of logging these messages if the feature is turned on, but not all architectures turn it on by default. It appears that x86 (both 32-bit and 64-bit) and SPARC default to having it on, and PowerPC and S390 default to having it off. SPARC support for this is recent, having only been added around March of this year.)
I find these messages rather irritating, given that we run multiuser systems that are full of users running various programs, some of them locally written and some of them just buggy. They clutter up our kernel message logs, making it harder to notice other things, and there is no useful information for us in them.
(The kernel message is ratelimited so that it can't flood the logs, but given the relatively low volume of kernel messages in general it can easily be the dominating message.)
On all architectures that support this, whether it is on or off is controlled by the debug.exception-trace sysctl (aka /proc/sys/debug/exception-trace); a value of 1 means that the kernel logs messages, a 0 means that it doesn't. On the S390, there is a second sysctl that also controls it, kernel.userprocess_debug, which is presumably still there for historical reasons.
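For instance, here is a minimal sketch in C (my illustration, not anything from the kernel tree) of turning the logging off by writing to the /proc file; it has the same effect as 'sysctl -w debug.exception-trace=0'.

    /* Minimal sketch: turn the messages off by writing '0' to the
     * exception-trace sysctl file.  Needs root, and assumes that your
     * architecture has the sysctl at all. */
    #include <stdio.h>

    int main(void)
    {
        FILE *fp = fopen("/proc/sys/debug/exception-trace", "w");

        if (fp == NULL) {
            perror("open /proc/sys/debug/exception-trace");
            return 1;
        }
        if (fputs("0\n", fp) == EOF || fclose(fp) == EOF) {
            perror("write exception-trace");
            return 1;
        }
        return 0;
    }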
(This is the kind of entry that I write so that I can find it later.)
Sidebar: the kernel code itself
The kernel source code variable that controls this behavior is show_unhandled_signals. It's almost entirely used (and defined) in architecture-dependent code, which is why it has different defaults and behaviors on different architectures. There is one conditional bit in general kernel code, in kernel/sysctl.c, to define the sysctl itself.
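For a rough idea of the pattern, here is a simplified, hand-written sketch of how an architecture's fault handler typically uses the variable; it is not the actual kernel source, and the real messages include considerably more detail (instruction pointer, stack pointer, and so on).

    /* Simplified sketch of the pattern the architecture code follows;
     * this is not the actual kernel source.  Assumes a kernel-context
     * build with the usual headers. */
    #include <linux/kernel.h>
    #include <linux/sched.h>
    #include <linux/signal.h>

    extern int show_unhandled_signals;   /* the debug.exception-trace knob */

    static void maybe_log_segfault(struct task_struct *tsk,
                                   unsigned long address)
    {
        /* log only if the knob is on, the process has no handler for
         * SIGSEGV, and we are under the printk rate limit */
        if (show_unhandled_signals &&
            unhandled_signal(tsk, SIGSEGV) &&
            printk_ratelimit())
            printk(KERN_INFO "%s[%d]: segfault at %lx\n",
                   tsk->comm, task_pid_nr(tsk), address);
    }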
The challenges of shared spares in RAID arrays
It's getting popular these days for RAID implementations to support what I've heard called 'shared spares': spare disks that are shared between multiple RAID arrays, so that they can be used by any array that happens to need them. This is an attractive idea because it gives you better protection against moderate problems than you could get with dedicated spares. (If you have large problems you run out of spares, of course.)
The problem with shared spares is that they are pretty much intrinsically hard to do well in the general case, once you get beyond simple configurations and start working at larger scales. I'll use our fileservers as an example.
Our fileservers have 'RAID arrays' (ZFS pools) of varying sizes that are made up of some number of mirror pairs from two different iSCSI backends per fileserver. Suppose a disk fails in some pool; clearly, if possible we want to replace that disk with another disk from the same iSCSI backend so that we maintain cross-backend redundancy.
Suppose that several disks fail at once, in a situation where we have too few suitable spares to restore all affected pools to full redundancy. In this situation we want as many pools as possible restored to full redundancy, as fast as possible; we'd rather have two smaller pools be fully redundant than one much larger pool be 2/3rds redundant (two out of three mirrors restored to full operation).
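As an illustration of the policy I mean, here is a hypothetical sketch in C of the allocation order we would want; none of these structures or names come from a real RAID system.

    /* Hypothetical sketch of the spare-allocation policy described
     * above.  Degraded mirror pairs are repaired smallest pool first,
     * and a spare is only used if it comes from the same backend as
     * the failed disk, preserving cross-backend redundancy. */
    #include <stdlib.h>

    struct degraded_pair {
        int pool_size;       /* disks in the pool this pair belongs to */
        int backend;         /* backend of the disk that failed */
    };

    struct spare {
        int backend;         /* backend this spare disk lives on */
        int used;
    };

    static int by_pool_size(const void *a, const void *b)
    {
        const struct degraded_pair *pa = a, *pb = b;
        return pa->pool_size - pb->pool_size;
    }

    static void allocate_spares(struct degraded_pair *pairs, int npairs,
                                struct spare *spares, int nspares)
    {
        /* smallest pools first, so that as many pools as possible
         * become fully redundant with the spares we have */
        qsort(pairs, npairs, sizeof(*pairs), by_pool_size);
        for (int i = 0; i < npairs; i++) {
            for (int j = 0; j < nspares; j++) {
                if (!spares[j].used &&
                    spares[j].backend == pairs[i].backend) {
                    spares[j].used = 1;
                    /* ... start resilvering pairs[i] onto spares[j] ... */
                    break;
                }
            }
        }
    }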
Large setups are like this: their disks don't have a flat topology, and they have policy issues surrounding what should be done in situations with limited resources or what should be prioritized first. I'm sure that you can support all of this in a general RAID shared spares system if you try hard enough, but you're going to have a very complex configuration system; it'll practically be a programming language.
(In theory issues of selecting the right spare disk just need a sufficiently smart general algorithm that knows enough or is told enough about the real disk topology. But policy issues of what gets priority can't be sorted out that way.)
Sadly, large systems with lots of RAID arrays are also exactly the situation where you want shared spares. From this I conclude that your shared spares system should be modular, so that sites have a place to plug in different and more sophisticated methods of selecting what disk to use and what RAID arrays to heal first (or at all).
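Here is a hypothetical sketch of what such a modular hook could look like; all of the names are invented for illustration.

    /* Hypothetical plug-in interface for site-specific spare handling;
     * nothing here corresponds to any real RAID implementation. */
    struct array_state;      /* opaque: per-array topology and status */
    struct disk_state;       /* opaque: per-disk location and health */

    struct spare_policy {
        /* which degraded array should be healed next (index), or -1 */
        int (*next_array)(const struct array_state *arrays, int narrays);
        /* which spare should be used for that array (index), or -1 */
        int (*pick_spare)(const struct array_state *array,
                          const struct disk_state *spares, int nspares);
    };

The point is that the RAID core only ever asks 'which array next, and which spare for it'; all of the topology knowledge and priority decisions live in the site-supplied policy, so the core never has to grow a policy configuration language of its own.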