The spectrum of options when netbooting systems

November 8, 2013

Suppose that you want to run your servers without local system disks; instead you will boot them over the network and somehow supply all of what they need to operate that way. In the abstract you need up to three or four different things for this: a potentially read-only version of the system filesystem(s), a way to get machine-specific configurations on to each server, some per-server volatile writeable space, and perhaps some per-server permanent writeable space. Life is easier if servers don't need the latter and are effectively either volatile or read-only.

There is a spectrum of options to provide these that I can think of:

  • boot to a ramdisk. The ramdisk can be generic if the servers get their machine-specific configuration through some other method (including automation frameworks like Chef and Puppet).

    The advantage of this setup is that once booted a machine is self-contained. The drawbacks are the lack of innate non-volatile writeable storage and the amount of memory that a ramdisk image may take up. This is probably best used with very small base system images unless you enjoy losing gigabytes of expensive server RAM.

  • boot to a ramdisk and mount a read-write network filesystem for any non-volatile storage needs.

  • boot to a per-machine read-write network filesystem. This requires a potentially big fileserver and managing all of those filesystems, but looks the most like normal system disks. The drawback is that it's not clear how much you gain over just having local disks, which is why this sort of plain old-fashioned diskless machine has fallen out of favour.

    (You can make some subset of the system filesystems read-only and shared, assuming that your operating system cooperates.)

  • boot to a generic read-only network filesystem and then overlay it with another filesystem (or more) for writeable storage and machine-specific configuration. The overlay may be in a ramdisk or in another network filesystem or both (for different bits); if you use a ramdisk as the overlay, servers must get their specific configuration through some other method.

(I'm stretching 'network filesystem' to include 'network disk space', for example through iSCSI. I'm also probably overlooking some options.)
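The lookup behaviour of the overlay option can be modelled in a few lines of Python. This is only a toy sketch of the idea, not how any real overlay filesystem is implemented: the paths, file contents and the `make_server_view` helper are all made up for illustration, and dictionaries stand in for filesystems. Reads fall through to the shared read-only base, while writes land in a per-server upper layer.

```python
from collections import ChainMap
from types import MappingProxyType

# Hypothetical contents of the generic system image. MappingProxyType makes
# it read-only, like the shared network filesystem in the text.
base_image = MappingProxyType({
    "/etc/hostname": "generic",
    "/etc/resolv.conf": "nameserver 192.0.2.1",
})


def make_server_view(base):
    """Return (view, upper): a merged view and the server's private writeable layer."""
    upper = {}  # stands in for a ramdisk or a per-server network filesystem
    return ChainMap(upper, base), upper


view_a, upper_a = make_server_view(base_image)
view_b, upper_b = make_server_view(base_image)

# A write through server A's view goes only into A's upper layer.
view_a["/etc/hostname"] = "server-a"

print(view_a["/etc/hostname"])     # found in A's upper layer
print(view_b["/etc/hostname"])     # falls through to the shared base
print(view_a["/etc/resolv.conf"])  # untouched files come from the base
```

If `upper` lives in a ramdisk it starts out empty on every boot, which is why a ramdisk overlay forces machine-specific configuration to come from somewhere else.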

Any option involving a network filesystem makes all booted servers depend on the fileserver(s) providing their system filesystems; if a fileserver goes down or stalls, its clients probably will too (they might survive if everything they need is already loaded into memory and running). Note that merely having multiple copies of the fileserver doesn't help; you must be able to have clients transparently migrate from one to the other without a reboot (unless reboots are tolerable in your environment).

Any solution except per-machine read-write network filesystems requires some mechanism to update and (re)build the master images or filesystems. Unless you're lucky, these will not be part of the operating system's normal system management and there will be friction and pain. Some mechanisms may give you problems with running servers having things updated out from underneath them, or with getting running servers to pick up updates (again, this is not a problem if you can reboot servers on a whim).
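The two cross-cutting observations above (fileserver dependence and the need for an image build mechanism) can be written down as a small table. The following Python sketch encodes the four options as I read them; the class, field and function names are invented for illustration, and the true/false values are just a restatement of the text, not measurements of anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetbootOption:
    name: str
    uses_network_fs: bool    # involves a network filesystem at runtime
    uses_master_image: bool  # boots from a generic image shared between servers


OPTIONS = [
    NetbootOption("ramdisk only", uses_network_fs=False, uses_master_image=True),
    NetbootOption("ramdisk + read-write network filesystem", True, True),
    NetbootOption("per-machine read-write network filesystem", True, False),
    NetbootOption("read-only network filesystem + overlay", True, True),
]


def depends_on_fileserver(opt):
    """Running servers are affected if the fileserver goes down or stalls."""
    return opt.uses_network_fs


def needs_image_build_mechanism(opt):
    """Someone has to update and (re)build the shared master image."""
    return opt.uses_master_image


for opt in OPTIONS:
    print(f"{opt.name}: fileserver dependency={depends_on_fileserver(opt)}, "
          f"image builds={needs_image_build_mechanism(opt)}")
```

Laid out this way, no option avoids both costs: the ramdisk-only setup is the only one free of the fileserver, and the per-machine filesystem is the only one free of image builds.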

Some environments aggressively don't want their systems writing to 'local' storage for anything beyond (maybe) configuration file updates. Things like logs should be shipped off the individual machines to log aggregators and so on, while all system modifications and custom setup are obtained through a configuration management system instead of saved on local storage (and it's a feature if sysadmins get conditioned that they can't make local changes on a specific server that stick).

Comments on this page:

By Francis at 2013-11-13 14:59:43:

What you're pointing out is that every method has tradeoffs. That's not new. And choosing the most suitable option is, as always, a question left for the reader.

In this article, without explicitly saying so, you are classifying the various methods against some typical goals.

Security, and disposable "server"
The security goal also includes more obscure areas, such as when you need to qualify a particular configuration against someone else's guidelines and changing it requires re-testing.
Stability of the image
means no changes ever. Some servers once built won't be updated. The functionality of the server is so well known, that you just won't need to do it. Or sometimes this is just organisational reality. Closely related to "disposable server".
Attempting to eliminate unmanaged local storage
Some folks would rather pay for a better network than have local disks. This could be because the servers are far away, or there are so many of them.
Spending lots of money on fancy network disk / availability and quality of the network file share service
Some folks spend lots of money on fancy disk systems and the tools to manage them, and see the local disk as unmanaged and unmanageable.
Ideology around different models of centralised control of OS
Some folks just want to control it all from one place (insert maniacal laughter). Or work around all the host-based security that some infosec groups have.

You're almost in "Compute Farm" vs "IO Farm" vs "Distributed System" in this discussion. Those guys are very familiar with these trade-offs.

By cks at 2013-11-13 16:52:24:

I'm not addressing potential reasons for choosing to netboot in this entry for a number of reasons. I'm simply trying to inventory the broad options for it that I see. I like to do such taxonomies for any number of reasons, including that in system administration (as in many other things) the available mechanisms influence the results you can get.

Written on 08 November 2013.


Last modified: Fri Nov 8 02:49:48 2013