All your servers should have Linux's magic SysRq enabled
This is effectively another lesson learned from our recent building power shutdown. I will put it simply:
All of your servers should have magic SysRq enabled.
There are reasons to not do this on client machines (but not necessarily very good ones), but none on your servers (which certainly should have their hardware and consoles in a secure location).
What magic SysRq is good for on servers (above everything else) is giving you a last ditch chance to shut down or reboot the machine in something approaching an orderly way. I'm not just talking about if the system goes crazy, because it's also quite possible for ordinary system shutdowns to hang, especially if you're shutting down a group of systems that have complex NFS filesystem relationships and something went down out of order. If this happens and you don't have magic SysRq support available, you're plain out of luck; all you can do is pull the power and hope that nothing is going to explode because it hasn't been killed, had its data synced to disk, or whatever.
With magic SysRq you have at least a chance of doing something about this. You can force a kernel level sync, a kernel level unmount of as many filesystems as possible, and even hit processes with signals if you think it's going to do any good. And then you can reboot the machine (and afterwards, possibly pull the power to keep the machine down).
PS: you should explicitly enabled magic SysRq in your standard server install setup, even if your distribution normally defaults to leaving it on; distribution defaults can change over time. Also, note that if you have a serial console you generally need a getty listening on it in order to make magic SysRq work.
(You can check to see if magic SysRq is enabled by looking at the value
of /proc/sys/kernel/sysrq; a
1 means that it is, a
0 means that it
Comments on this page:Written on 10 May 2012.