Unused hardware in computers can now be distinctly inactive
When I wrote about monitoring the state of Linux network interfaces with Prometheus, I wished for a way to tell if there was link carrier on an unused network interface (one that had not been configured as 'up' in the (Linux) kernel). Ben Hutchings commented on the entry to say (in part):
This is more than just a restriction of the kernel's user-space APIs. Network drivers are not expected to update link state while an interface is down, and they may not be able to. When an interface is down, it's often in a low power state where the PHY may not negotiate a link, or where the driver is unable to query the link status or receive link interrupts.
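On Linux you can see the user-space side of this through sysfs. Here is a minimal sketch, assuming the usual /sys/class/net/<interface>/carrier file: reading it gives "1" or "0" for an interface that is up, while for an interface that is down the read fails with EINVAL. The function name link_carrier and its sysfs_net parameter are my own inventions for illustration, not part of any tool.

```python
import errno
from pathlib import Path


def link_carrier(iface, sysfs_net="/sys/class/net"):
    """Return True or False for link carrier, or None if the kernel won't say.

    sysfs_net is a parameter only so the function can be pointed at a
    different directory for testing.
    """
    path = Path(sysfs_net) / iface / "carrier"
    try:
        text = path.read_text().strip()
    except OSError as e:
        # Reading 'carrier' on an interface that is not up fails with EINVAL;
        # treat that as "unknown" rather than as "no carrier".
        if e.errno == errno.EINVAL:
            return None
        raise
    return text == "1"


if __name__ == "__main__":
    root = Path("/sys/class/net")
    if root.is_dir():
        for dev in sorted(root.iterdir()):
            # Skip entries that are not interfaces (they have no 'carrier').
            if (dev / "carrier").exists():
                print(dev.name, link_carrier(dev.name))
```

The important part is the three-way result. A None here means the kernel declined to answer, which is not the same thing as there being no cable plugged in.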
I'm from the era where both computer hardware and Unix device drivers were simpler things than they are today. Back in those days, you could safely assume that hardware became active once its device driver had initialized it, and it was to some degree running from then onward. You (the hardware driver) might tell the hardware to turn off interrupts if you weren't using it (for example, if no one had a serial port open or a network interface configured), but the hardware itself was generally operating and could at least have its status checked if you wanted to. If you wanted hardware to be genuinely inactive, you wanted to leave it untouched even by the driver, which often meant not configuring the driver into the kernel at all.
In this world, expecting that an unused network interface would still have link carrier information available was a natural thing. Once the kernel driver initialized the hardware (and perhaps even before then, from when power was applied), the hardware was awake and responding to the state of the outside world. Being unused was a higher level kernel issue, one that was irrelevant to the hardware itself.
Those days are long over. Today, hardware is much more complicated and can be controlled at a much finer level, so there are many more things that unused hardware may not be doing. As Hutchings noted, the kernel driver will probably opt to put unused hardware into some sort of low power state, and in this state it won't be doing all sorts of things that I expect it to. General kernel subsystems may go further, perhaps with driver support, doing things like turning down PCIe links or turning on larger scale power saving modes. I've already seen that PCIe slot bandwidth can change dynamically and that this can happen over surprisingly short time periods.
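One way to watch the PCIe side of this on Linux is to compare each device's current link speed against its maximum. This is a rough sketch, assuming the sysfs attributes current_link_speed and max_link_speed under /sys/bus/pci/devices; the function name downshifted_links and its pci_root parameter are made up for illustration. It compares the strings as they are, so it makes no attempt to interpret odd values such as a device reporting an unknown speed.

```python
from pathlib import Path


def downshifted_links(pci_root="/sys/bus/pci/devices"):
    """List (device, current speed, max speed) for links below their maximum.

    pci_root is a parameter only so the function can be pointed at a
    different directory for testing.
    """
    results = []
    root = Path(pci_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        try:
            cur = (dev / "current_link_speed").read_text().strip()
            top = (dev / "max_link_speed").read_text().strip()
        except OSError:
            # No link attributes (or unreadable ones), so skip this device.
            continue
        if cur != top:
            results.append((dev.name, cur, top))
    return results


if __name__ == "__main__":
    for name, cur, top in downshifted_links():
        print(f"{name}: running at {cur}, capable of {top}")
```

Since the link speed can change over short periods, one run is only a snapshot. You would have to run this repeatedly (or under load and then idle) to see a link move up and down.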
(One corollary and sort of inverse of this is that merely looking at the current status of some piece of hardware may push it into a more active, higher power state, because otherwise it can't answer your questions about its state. We've seen this before, where if you told some hard drives to power down and then asked them for their SMART status, they powered back up.)
Some of this capability probably comes as a free ride on what was necessary to support low power operation in laptops. Hardware standards like PCIe, driver and kernel support, and probably even some PCIe interface chipsets (and chipset functional block IP) are shared between laptops and other platforms, so servers (and desktops) might as well benefit from the power savings.
(People with large, high density server installs are probably also using servers that have exactly and only the hardware that they're actually going to use, so I wouldn't expect them to get much power savings from these features. But perhaps I'm wrong and they can routinely turn down network interfaces and so on.)
PS: This has also created the situation where at least some servers can genuinely suspend themselves to RAM (entering the ACPI S3 state) as if they were laptops, although getting them to un-suspend themselves is usually nowhere near as simple as opening your laptop again.