Our varied approaches to upgrading machines with local state

December 27, 2022

Our normal approach to distribution version upgrades for our servers is to build a new version of the server on new hardware, then swap it into place by renaming the new server. Sometimes, as with our SLURM cluster, we have sufficiently cattle-like machines that we will take one or a few of them out of service and reinstall them in place. All of this is straightforward for machines that have basically no local state, but this isn't the case for all of our machines. Servers like our ZFS-based NFS fileservers have very significant local state, and our Prometheus metrics server also has local data (although on fewer disks). Over time, we've taken or contemplated several different approaches for such machines.

If the machine has only a small amount of local state, such as currently queued email and email logs (on our central email server), then we'll usually copy the data over as part of the machine shuffle. We can also use this approach if the data is larger but mostly static, so that we can rsync most of it across in advance (we did this when we upgraded our Prometheus server). We may also take this approach for the fileserver with /var/mail, since we can use ZFS snapshots and ZFS's good support for incremental copies, and this would allow us to switch to new SSDs at the point where we upgrade it.
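As a rough sketch of the 'copy most of it in advance' idea, here is some Python that only builds the command lines for a two-phase rsync and for an incremental ZFS send between two snapshots. The host names, paths, dataset and snapshot names are all made up for illustration, and the specific rsync options are my assumption of a reasonable set rather than a record of what we actually ran. It doesn't execute anything.

```python
import shlex


def rsync_phases(src_dir, dest_host, dest_dir):
    """Return (presync, final) rsync argument lists.

    The presync run can be repeated while the old server is still live.
    The final run is meant for after services are stopped; it adds
    --delete so files removed since the presync also vanish on the target.
    """
    base = ["rsync", "-aHAX", "--numeric-ids"]
    # Trailing slashes so rsync copies the directory contents, not the directory.
    source = src_dir.rstrip("/") + "/"
    target = f"{dest_host}:{dest_dir.rstrip('/')}/"
    presync = base + [source, target]
    final = base + ["--delete", source, target]
    return presync, final


def zfs_incremental_send(dataset, old_snap, new_snap, dest_host):
    """Return a shell pipeline that sends only the changes between two
    snapshots of `dataset` to the same dataset name on `dest_host`.

    This assumes old_snap was already fully sent to the destination.
    """
    send = ["zfs", "send", "-i", f"{dataset}@{old_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", dest_host, "zfs", "receive", "-F", dataset]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


if __name__ == "__main__":
    # Hypothetical names throughout.
    presync, final = rsync_phases("/srv/data", "newserver", "/srv/data")
    print(shlex.join(presync))
    print(shlex.join(final))
    print(zfs_incremental_send("tank/mail", "presync", "final", "newserver"))
```

The point of the two phases is that the final pass, done during the actual downtime, only has to move whatever changed since the last advance copy.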

If a machine has a lot of local data, we try to keep it on a separate set of disks from the operating system. This creates a number of different options for upgrades. If we want to upgrade a server's hardware (for example to move to a current hardware generation), we can install the new version on new hardware but without the data disks, then physically move the data disks over as part of the upgrade. We've upgraded a couple of machines this way in this cycle of Ubuntu distribution version upgrades, and I suspect it's how we'll handle our Prometheus metrics server the next time around.

If we don't want to upgrade the hardware but we have identical spare hardware (or close enough spare hardware), we can do the reverse. We'll install a new version of the server on the spare hardware's system disks, then swap these system disks into the server's normal hardware and bring up the local data disks. This was our process in the past for our NFS fileservers and is almost certainly what we'll do for them this time around. Moving a couple of system disks around this way is much less risky than moving a lot of data disks (and it avoids having to reconfigure BMCs and so on). It's also generally faster because we're moving fewer disks.

With separate system and data disks, you also have the option of taking the system down, unplugging the data disks, and reinstalling on the system disks. If you need a specific hardware setup and don't have any spare version of it, you may be forced into this, but so far we've avoided needing to do it. One drawback of this approach is that the production machine is generally down for longer, because a reinstall typically takes much longer than a disk swap.
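To summarize the options so far, here is a small Python sketch of the decision process as I've described it. The `Machine` fields and the category names are invented for this illustration, and the fallback for a machine with lots of data that isn't on separate disks is my own assumption, since nothing above covers that case.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # Hypothetical categories: "none", "small", "static" (large but mostly
    # static, or snapshot-friendly), or "large".
    local_state: str
    separate_data_disks: bool = False
    want_new_hardware: bool = False
    have_spare_hardware: bool = False


def upgrade_approach(m: Machine) -> str:
    """Pick an upgrade approach following the cases in the text."""
    if m.local_state == "none":
        return "build a new server and rename it (or reinstall in place)"
    if m.local_state in ("small", "static"):
        return "copy the data over as part of the machine swap"
    if not m.separate_data_disks:
        # Not covered in the text; flagged rather than guessed at.
        return "not covered here"
    if m.want_new_hardware:
        return "install on new hardware, then move the data disks"
    if m.have_spare_hardware:
        return "install on spare hardware, then swap the system disks"
    return "unplug the data disks and reinstall on the system disks"


if __name__ == "__main__":
    fileserver = Machine("large", separate_data_disks=True, have_spare_hardware=True)
    print(upgrade_approach(fileserver))
```

The ordering of the checks reflects the preference above: reinstalling in place with the data disks unplugged is the last resort because it has the longest downtime.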

(This issue is on my mind because we're going to have to upgrade our ZFS-based NFS fileservers from Ubuntu 18.04 to 22.04 in the new year, before 18.04 runs out of support. They're our last 18.04 machines because they're the most critical and complicated ones.)

Written on 27 December 2022.


Last modified: Tue Dec 27 22:39:52 2022