x86 servers, ATX power supply control, and reboots, resets, and power cycles
I mentioned recently a case when power cycling an (x86) server wasn't enough to recover it, although perhaps I should have put quotes around "power cycling". The reason for the scare quotes is that I was doing this through the server's BMC, which means that what was actually happening was not clear, because there are a variety of ways the BMC could be doing power control and the BMC may have done something different for what it described as a 'power cycle'. In fact, to make it less clear, this particular server's BMC offers both a "Power Cycle" and a "Power Reset" option.
(According to the BMC's manual, a "power cycle" turns the system off and then back on again, while a "power reset" performs a 'warm restart'. I may have done a 'power reset' instead of a 'power cycle'; it's not clear from what logs we have.)
There is a spectrum of ways to restart an x86 server, and they (probably) vary in their effects on peripherals, PCIe devices, and motherboard components. The most straightforward-looking is to ask the Linux kernel to reboot the system, although in practice I believe that actually getting the hardware to do the reboot is somewhat complex (and in the past Linux sometimes had problems where it couldn't persuade the hardware, so your 'reboot' would hang). Looking at the Linux kernel code suggests that there are multiple ways to invoke a reboot, involving ACPI, UEFI firmware, old fashioned BIOS firmware, a PCIe configuration register, the keyboard controller, and so on (for a fun time, look at the 'reboot=' kernel parameter). In general, a reboot can only be initiated by the server's host OS, not by the BMC; if the host OS is hung you can't 'reboot' the server as such.
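As a small illustration of the 'reboot=' parameter, here is a minimal Python sketch that reports which reboot options appear on a kernel command line. The function name and the descriptions are my own, and the letter-to-method mapping reflects my reading of the x86 kernel parameter documentation, so treat it as an assumption and not as a definitive list.

```python
# Hypothetical helper, not part of any kernel tooling.
# Assumption: on x86 the kernel looks at the first letter of each
# comma-separated reboot= option to pick a reboot method.
REBOOT_METHODS = {
    "t": "triple fault",
    "k": "keyboard controller",
    "b": "BIOS",
    "a": "ACPI",
    "e": "EFI",
    "p": "PCI configuration register (port CF9)",
}


def reboot_options(cmdline):
    """Return (option, description) pairs for every reboot= parameter.

    The description is None for options this sketch does not try to
    interpret (reboot modes such as 'warm', 'force', and so on).
    """
    found = []
    for word in cmdline.split():
        if not word.startswith("reboot="):
            continue
        for option in word[len("reboot="):].split(","):
            if option:
                found.append((option, REBOOT_METHODS.get(option[0].lower())))
    return found


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel booted with.
    with open("/proc/cmdline") as f:
        print(reboot_options(f.read()))
```

On a machine booted without any 'reboot=' setting this prints an empty list, which means the kernel is using whatever default method it picks for the hardware.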
Your x86 desktop probably has a 'reset' button on the front panel. These days the wire from this is probably tied into the platform chipset (on Intel, the ICH, which came up for desktop motherboard power control) and is interpreted by it. Server platforms probably also have a (conceptual) wire and that wire may well be connected to the BMC, which can then control it to implement, for example, a 'reset' operation. I believe that a server reboot can also trigger the same platform chipset reset handling that the reset button does, although I'm not sure of this. If I'm reading Intel ICH chipset documentation correctly, triggering a reset this way will or may signal PCIe devices and so on that a reset has happened, although I don't think it cuts power to them; in theory anything getting this signal should reset its state.
(The CF9 PCI "Reset Control Register" can also be used to initiate a 'soft' or 'hard' CPU reset, or a full reset in which the (Intel) chipset will do various things to signals to peripherals, not just the CPU. I don't believe that Linux directly exposes these options to user space (partly because it may not be rebooting through direct use of PCI CF9 in the first place), although some of them can be controlled through kernel command line parameters. I think this may also control whether the 'reset' button and line do a CPU reset or a full reset. It seems possible that the warm restart of this server's BMC's "power reset" works by triggering the reset line and assuming that CF9 is left in its default state to make this a CPU reset instead of a full reset.)
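To make the soft, hard, and full distinction a bit more concrete, here is a minimal Python sketch that decodes a CF9 register value into the kind of reset it would request. The bit positions and names are my reading of Intel ICH datasheets and are an assumption here; other chipsets may differ, and the function name is made up for illustration. It only decodes a number and never touches the hardware.

```python
# Assumed bit layout of the Reset Control Register at I/O port 0xCF9,
# based on my reading of Intel ICH datasheets.
SYS_RST = 0x02   # 0 = soft (CPU only) reset, 1 = hard reset
RST_CPU = 0x04   # setting this bit is what actually triggers the reset
FULL_RST = 0x08  # with a hard reset, ask the chipset for a full reset


def describe_cf9(value):
    """Describe the reset that writing `value` to CF9 would request."""
    if not value & RST_CPU:
        return "no reset requested"
    if not value & SYS_RST:
        return "soft CPU reset"
    if value & FULL_RST:
        return "full reset"
    return "hard reset"


if __name__ == "__main__":
    for value in (0x02, 0x04, 0x06, 0x0E):
        print(hex(value), describe_cf9(value))
```

In this reading, a value that sets the full reset bit without the hard reset bit is treated as a soft reset, which is a simplification on my part.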
Finally, the BMC can choose to actually cycle the power off and then back on again. As discussed, 'off' is probably not really off, because standby power and BMC power will remain available, but this should put both the CPU and the platform chipset through a full power-on sequence. However, it likely won't leave power off long enough for various lingering currents to dissipate and capacitors to drain. And nothing you do through the BMC can completely remove power from the system; as long as a server is connected to AC power, it's supplying standby power and BMC power. If you want a total reset, you must either disconnect its power cords or turn its outlet or outlets off in your remote controllable PDU (which may not work great if it's on a UPS). And as we've seen, sometimes a short power cycle isn't good enough and you need to give the server a time out.
(While the server's OS can ask for the server to be powered down instead of rebooted, I don't think it can ask for the server to be power cycled, not unless it talks to the BMC instead of doing a conventional reboot or power down.)
One of the things I've learned from this is that if I want to be really certain I understand what a BMC is doing, I probably shouldn't rely on any option to do a power cycle or power reset. Instead I should explicitly turn power off, wait until that's taken, and then turn power on. Asking a BMC to do a 'power cycle' is a bit optimistic, although it will probably work most of the time.
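The explicit off, wait, on sequence can be sketched in Python along these lines. This assumes the BMC is reachable through ipmitool and its 'chassis power' commands, which is my assumption and not something specific to this server; the function names, wait times, and timeout are hypothetical, and real use would need ipmitool's connection options for a remote BMC added to the base command.

```python
import subprocess
import time

# Assumption: add your BMC connection options here, since powering off
# through the local interface would take down the machine running this.
BASE_CMD = ["ipmitool"]


def ipmi_power(action):
    """Run 'ipmitool chassis power <action>' and return its output."""
    result = subprocess.run(
        BASE_CMD + ["chassis", "power", action],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def explicit_power_cycle(run=ipmi_power, off_wait=30.0, poll=2.0,
                         timeout=60.0, sleep=time.sleep):
    """Turn power off, wait until the BMC reports it off, then turn it on."""
    run("off")
    deadline = time.monotonic() + timeout
    # Poll the BMC until it says the chassis power is actually off.
    while "is off" not in run("status").lower():
        if time.monotonic() >= deadline:
            raise TimeoutError("BMC never reported chassis power as off")
        sleep(poll)
    # Stay off for a while; this is the 'time out' a short cycle skips.
    sleep(off_wait)
    run("on")
```

The `run` and `sleep` parameters exist only so the sequence can be tried against a fake BMC. Even this only removes main power; standby and BMC power stay on as long as the server is connected to AC.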
(If there's another instance of our specific 'reset is not enough' hang, I will definitely make sure to use at least the BMC's 'power cycle' and perhaps the full brief off then on approach.)