How we sort of automate updating system packages across our Ubuntu machines

March 8, 2020

Every place with more than a handful of Unix systems has to figure out a system for keeping them up to date, because doing it entirely by hand is too time consuming and error prone. We're no exception, so we've wound up steadily evolving our processes into a decently functional but somewhat complicated setup for doing this to our Ubuntu machines.

The first piece is a cron job that uses apt-show-versions and a state file to detect new updates for a machine and email us a list of them. In practice we don't actually read these email messages; instead, we use their presence in the morning as a sign that we should go do updates. This cron job is automatically set up on all of our machines by our standard Ubuntu install.

(Things are not quite to the point where Ubuntu has updates every day, and anyway it's useful to have a little reminder to push us to do updates.)
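A minimal sketch of this kind of check, not our actual script. It assumes that 'apt-show-versions -u' prints one upgradeable package per line and treats each line as an opaque string; the state file path and the function names are made up for illustration.

```python
import subprocess
from pathlib import Path

# Hypothetical state file location; the real path is not given here.
STATE_FILE = Path("/var/local/update-check.state")


def new_updates(current_lines, previous_lines):
    """Return pending-update lines that were not present in the last run."""
    seen = set(previous_lines)
    return sorted(line for line in set(current_lines) if line not in seen)


def pending_updates():
    """Ask apt-show-versions for upgradeable packages, one per line."""
    result = subprocess.run(
        ["apt-show-versions", "-u"],
        capture_output=True, text=True, check=True,
    )
    return [ln.strip() for ln in result.stdout.splitlines() if ln.strip()]


def main():
    """Report updates that are new since the last run, then save the state."""
    previous = STATE_FILE.read_text().splitlines() if STATE_FILE.exists() else []
    current = pending_updates()
    fresh = new_updates(current, previous)
    if fresh:
        # cron normally mails whatever a job prints to the job's owner.
        print("New pending updates:")
        print("\n".join(fresh))
    STATE_FILE.write_text("".join(line + "\n" for line in current))
```

Calling main() once a day from cron would produce email only on days when something new has shown up, which is all the morning reminder needs.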

The second piece is that we have a central list of our current Ubuntu systems. To make sure that the list doesn't miss any active machines, our daily update check cron job also looks to see if the system it's running on is in the list; if it's not, it emails us a warning about that (in addition to any email it may send about the system having updates). The warning is important because this central list is used to determine what Ubuntu machines we'll try to apply updates on.
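The membership check could look roughly like the following. This is a sketch under assumptions: the list format (one hostname per line, with '#' comments) and all of the names here are hypothetical, since the real format of the central list isn't described.

```python
def parse_machine_list(text):
    """Parse a machine list: one hostname per line, '#' starts a comment."""
    hosts = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip().lower()
        if name:
            hosts.add(name)
    return hosts


def is_listed(hostname, hosts):
    """Match on the full name or on the short name before the first dot."""
    hostname = hostname.lower()
    short = hostname.split(".", 1)[0]
    return hostname in hosts or short in hosts


def membership_warning(hostname, list_text):
    """Return a warning string if hostname is missing from the list, else None."""
    if is_listed(hostname, parse_machine_list(list_text)):
        return None
    return f"{hostname} is not in the central Ubuntu machine list"
```

The cron job would pass in something like socket.gethostname() and the contents of the list file, and print the warning (if any) so that it gets mailed along with everything else.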

Finally, we have the setup for actually applying the updates on demand, which started out as a relatively simple Python program that automated some ssh commands and then grew much more complicated as we ran into problems. Its basic operation is to ssh to every machine on that central list, get a list of the pending updates through apt-get, then let you choose to go ahead with updating some or all of the machines (which is done with another round of ssh sessions that run apt-get). The output from all of the update sessions is captured and logged to a file, and at the end we get a compact summary of what groups of packages got updated on what groups of machines.
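To give a feel for the aggregation step, here is a small sketch, not the real driver. It assumes the scan runs 'apt-get -s upgrade' on each machine (the exact command is an assumption) and that the per-machine output has already been collected over ssh into a dict; the function names are invented for illustration.

```python
import re
from collections import defaultdict

# In simulation mode, apt-get describes each package it would install as:
#   Inst libssl1.1 [1.1.1-1ubuntu2.1] (1.1.1-1ubuntu2.2 Ubuntu:18.04/bionic-updates [amd64])
INST_RE = re.compile(r"^Inst (\S+)")


def pending_packages(simulated_output):
    """Extract the set of package names from the 'Inst' lines."""
    pkgs = set()
    for line in simulated_output.splitlines():
        m = INST_RE.match(line)
        if m:
            pkgs.add(m.group(1))
    return frozenset(pkgs)


def group_by_updates(outputs):
    """Map each distinct set of pending packages to the hosts that share it."""
    groups = defaultdict(list)
    for host, text in outputs.items():
        pkgs = pending_packages(text)
        if pkgs:  # machines with nothing pending are left out
            groups[pkgs].append(host)
    return {pkgs: sorted(hosts) for pkgs, hosts in groups.items()}


def summarize(groups):
    """One line per group: the hosts, then the packages they all need."""
    lines = []
    for pkgs, hosts in sorted(groups.items(), key=lambda kv: kv[1]):
        lines.append(f"{' '.join(hosts)}: {' '.join(sorted(pkgs))}")
    return lines
```

Grouping by the whole set of pending packages is what keeps the report short: machines built from the same install tend to need exactly the same updates, so they collapse into a single line.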

I call our system sort of automated because it's not completely hands off. Human action is required to run the central update program at all and then actively tell it to go ahead with whatever it's detected. If we're not around or if we forget, no updates get applied. However, we don't need to do anything on a per-machine basis, and unless something goes wrong the interaction we need to do with the program takes only a few seconds of time at the start.

(We strongly prefer not applying updates truly automatically; we like to supervise the process and make final decisions, just in case.)

Not all packages are updated through this system, at least routinely. A few need special manual procedures, and a number of core packages that could theoretically be updated automatically are normally 'held' (in dpkg and apt terminology) so they'll be skipped by normal package updates. We don't apply kernel updates until shortly before we're about to reboot the machine, for example, for various reasons.
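Holds are set with 'apt-mark hold' and listed with 'apt-mark showhold'. As a small illustration of how a report might separate held packages from the rest, here is a sketch that assumes 'apt-mark showhold' prints one package name per line; the function name is hypothetical.

```python
def split_held(pending, showhold_output):
    """Split pending package names into (to_update, skipped_because_held)."""
    held = {ln.strip() for ln in showhold_output.splitlines() if ln.strip()}
    to_update = sorted(p for p in pending if p not in held)
    skipped = sorted(p for p in pending if p in held)
    return to_update, skipped
```

For example, given pending updates for a kernel package and vim with the kernel package held, this returns vim as the only package to update and reports the kernel package as skipped.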

Our central update driver is unfortunately a complicated program. Apt, dpkg, and the Debian package format don't make it easy to do a good job of automatically applying updates, especially in unusual situations, and so the update driver has grown more and more features and warts to try to deal with all of that. Sadly, this means that creating your own equivalent version isn't a simple or short job (and ours is quite specific to our environment).


Comments on this page:

From 104.179.118.47 at 2020-03-08 11:01:14:

Is there a reason this couldn’t be a bash script that invokes pdsh?

By cks at 2020-03-08 15:25:17:

Roughly, there are three complicated things the driver program does. First, it parses and aggregates the output from the initial apt-get scan because we have far too many machines to list out pending package updates machine by machine. Second, it similarly parses the output from the actual upgrade to sort of determine what actually got upgraded; for various reasons we can't assume that the scan results are still accurate. Finally, the most complicated thing it does is that it runs the actual upgrade process in a 'ssh -t' session and allows us to interact with it if we want to or if it appears to be stalled (for example because a package update is demanding answers to questions).

You could probably do the first two things in a bash script (you'd want to do the actual parsing in awk or something). I think the third is very difficult in such a script, especially if you want the output to be hidden until either there's a stall detected or the person running the driver program decides to break into interacting with the session.

(You might be able to do it by running the whole thing under script inside a screen or tmux session that the bash script manipulated. That would let the driver script normally keep the session output hidden from you in another screen, but flip to it if there might be problems.)

By Simon Deziel at 2020-03-09 10:26:19:

I've been happy with 'apt-dater' to manage various environments.

