System administration's long slow march to configuration automation

May 29, 2023

Dan Luu recently asked about past and current computing productivity improvements that were so good that it was basically impossible for them not to get adopted. In my reply, I nominated configuration management:

From a system administration perspective, the move from hand-crafting systems to automating their setup (or as much of it as possible) feels both transformative and so obviously compelling to practitioners that you hardly have to sell the idea.

(The end point of that today is containerization and k8s/etc, but I'm not sure today's endpoint will persist.)

Although I claimed that you hardly had to sell the idea, in retrospect this is a bit overblown. Automated configuration management (at least of Unix machines) has spent a very long time slowly cooking away and slowly being adopted more and more widely. To see some of this, we can look at the initial release dates in Wikipedia's comparison of open-source configuration management software. The oldest one listed, CFEngine, goes back to 1993, then there are some from the late 1990s, and then a flowering starting in the 2000s. If we take formal package management and automated installers as part of this, then those are present too by the late 1990s (in Linux distributions and elsewhere). And all of these ideas predate formal open source systems for them; people were trying to do package management and configuration management for Unix systems in the late 1980s, using hand-crafted and sometimes very complex systems (interested parties can trawl through old Usenix and LISA proceedings from that era, where various ones got written up).

One of the reasons that these things cropped up and keep cropping up is that the idea is so obviously appealing. Who among us really wants to manage systems and packages and so on by hand, especially across more than one machine? Very few system administrators actually like logging in to machine after machine to do something to them, and we famously script anything that moves, so the idea of automation effectively sells itself.
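(To make 'script anything that moves' a bit more concrete, here is a minimal sketch in Python of the kind of small, repeatable step that both ad hoc scripts and configuration management systems tend to be built from: check the current state and only change it if it isn't already what you want. The function name ensure_line, the scratch file name, and the setting are all hypothetical and invented for illustration; this is not how any particular tool does it.)

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed, False if it was already correct.
    """
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state, so do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    # Demonstrate on a scratch file rather than a real system file.
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "example.conf"
        print(ensure_line(conf, "PermitRootLogin no"))  # first run changes the file
        print(ensure_line(conf, "PermitRootLogin no"))  # second run is a no-op
```

Because the step is safe to run repeatedly, the same script can be pointed at one machine or many, which is a large part of the appeal.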

But despite all of this, automated configuration management as a practice didn't spread all that rapidly. For example, my memory is that the idea of 'pets versus cattle' and the related idea of being able to readily rebuild your machines only really took hold in the field when virtual machines and VM images became common in the late 2000s or early 2010s. Certainly many of the configuration management systems listed in Wikipedia date from around then (although Wikipedia's list may be subject to survivorship bias on the grounds that most people are interested in still-viable systems).

I'm not sure that anyone in the 2000s or even the late 1990s would have argued against the abstract idea of automated configuration management. However, I suspect that many people would (and did) argue or feel that either it wasn't necessary for them in their particular situation (for example, because they only had a few machines, and perhaps those machines had critical state such as filesystem data), or that the existing configuration management systems didn't really fit their needs and environment, or that the existing systems would be too much work to adopt relative to the potential payoff. Even people who wound up with a decent number of systems could be in a situation where they'd evolved partial local solutions that worked well enough, because they'd started out too small to use configuration management and then scaled up bit by bit, without ever hitting a cut-over point where their local tools fell over.

So my more nuanced view is that we've wound up in a situation where the appeal of automating system setup and operation is obvious and widely accepted, but the implementation of it still isn't. And where the implementation is widely accepted, it's partly because people are using larger scale systems that don't give them a choice, like more or less immutable containers that must be built by automation and deployed through orchestration systems.

(Perhaps this mirrors the state of other things, like Continuous Integration (CI) build systems.)


Comments on this page:

From 193.219.181.219 at 2023-05-30 09:01:51:

The oldest one listed, CFEngine, goes back to 1993, then there are some from the late 1990s, and then a flowering starting in the 2000s

I would probably include rdist in this as well, which is from the 1980s. Sure, it appears to be more for distributing software rather than config files (i.e. static files without templating), but it still counts.

Circa 2012, my team was maintaining ~5k machines as part of the university's research computing infrastructure. You'd better believe we used configuration management: kickstarts for initial deploy and CfEngine 2 (and later Puppet) for ongoing management. However, even then, our friends down the hall who maintained ~130 servers for the central IT Unix/Linux infrastructure were still hand-maintaining configuration.

Yes, as Ben says, I think automated installers like kickstart / jumpstart / etc. need to be part of the configuration management story. With that basis you can go a long way with ad hoc scripts.

I have used the standard combination of the four crucial tools for systems engineering:

  • Some sort of config management (first a simple-minded self-built one, and then as an early adopter of CfEngine and a few others).
  • Version management (starting with SCCS and then RCS).
  • Documentation database (started with text files in SCCS, then DocBook in RCS and CVS).
  • Issue tracking (started with a newsgroup, then mostly Trac and RT).

But I also have a lot of liking for "integrated" suites like Trac, and I would probably like Fossil, except that I have never used it extensively.

The problem is that regardless of tooling, what is required is great discipline in data architecture and coding design (to handle many-to-many relationships and "mixin" inheritance well), because this turns a lot of systems engineering into significant programming projects, and most programmers, never mind systems engineers, do not have much discipline that way. But I have written some suggestions:

https://sabi.co.uk/blog/19-one.html?190615#190615 "Keeping separate configuration data and templates"

http://www.sabi.co.uk/blog/17-two.html?171029#171029 "The "mixin" and "flavors" problem for configuration"

http://www.sabi.co.uk/blog/13-one.html?130404#130404 "Nagios or Icinga, a better configuration style"

https://www.sabi.co.uk/blog/0701jan.html?070120b#070120b "Simple configuration-driving environment variables"

Written on 29 May 2023.

Last modified: Mon May 29 22:51:27 2023