Why 'hotplug' approaches to device handling are the right way
The other day I mentioned that modern versions of Linux handle things like activating software RAID devices through their mechanisms for dynamically appearing devices, which is often called 'hotplugging'. Some people don't like this approach; one way that people put it is that hotplug is fine for the desktop (or laptop) where devices come and go, but servers have a constant set of hardware so shouldn't need the complexity and inherent unpredictable asynchronicity of the whole process. It's my strong opinion that this is wrong in practice and even on servers with constant hardware, hotplug is the right approach on technical grounds.
The problem, even on servers, can be summed up as 'device enumeration'. It's been a long time since the operating system could rapidly and predictably detect devices (since at least the introduction of SCSI and probably even before then). Modern busses and environments require a significant amount of time and probing before you can be sure that you've seen every device available. Multiply this by a number of busses and things get even worse. Things get even worse once you add software based devices such as iSCSI disks because you may need to bring up parts of the operating system before you can see all of the devices you need to fully boot.
You can try to make all of this work by adding longer and longer delays
before you look for all of the necessary devices (and then start
throwing errors if they're not there), and then layer a bunch of complex
topology and specific dependency awareness on top of it to get things
like iSCSI to work. But all of this is fundamentally a hack and it can
easily break down (and it has in the past when people tried this). You
wind up telling sysadmins to insert 'sleep 30
' statements in places
to get their systems to work and so on. No one like this.
A hotplug based asynchronous system is the simpler, better approach. You give up on trying to wait 'long enough' for device enumeration to finish and simply accept that it's an asynchronous process where you handle disks and other devices as they appear and as parts of the system become ready you boot more and more fully (and if something goes badly wrong you let the operator abort the process). At a stroke this replaces a rickety collection of hacks with a simple, general approach. As a bonus it gracefully handles things going a little bit wrong (for example, your external disk enclosure powering up and becoming visible a bit after the server starts booting, instead of before).
Having said that I don't think anyone today has a perfect hotplug based boot system. But I do think they're getting there and they're much more likely to manage it than anything built on the old linear way of booting.
|
|