Our approach to configuration management

June 19, 2013

A commentator on yesterday's entry suggested that we're already using automated configuration management, just a home-grown version of it. To explain why I mostly disagree I need to run down the different sorts of configuration management as I see them:

  1. No configuration management: you edit configuration files in place on each individual machine with no version control. Your best guess at what changed recently is with 'ls -l' and your only way back to an older configuration file is system backups.

  2. Individualized configuration management: you use some sort of version control but it's done separately on each individual machine. Rebuilding a copy of a dead machine is going to be a pain (and involve restoring bits from system backups).

  3. Centralized configuration management: you have a central area with canonical copies of your configuration files for all of your machines (under version control of some sort because you aren't crazy). But you still have to update machines from this by hand (or make changes on the machine and then copy them back to this central area).

  4. Automated configuration management: when you change something in your central area it automatically propagates to affected machines. You don't have to log in to individual machines to do anything.

For the most part we have centralized configuration management, with the master copies of all configuration files living on our central administrative filesystem, but not automated configuration management. Only a few things like passwords and NFS mounts propagate automatically; everything else has to be copied around in an explicit step if we change it (sometimes by hand, sometimes with the details wrapped up in a script we run by hand).
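To make the "script we run by hand" step concrete, here is a minimal sketch of what such a push script might look like. The central path, the host list, and the use of rsync are illustrative assumptions, not a description of our actual tooling:

    #!/usr/bin/env python
    # Hypothetical hand-run push script: copy a canonical config file from
    # the central administrative area out to the machines that use it.
    import subprocess
    import sys

    CENTRAL = "/central/admin/configs"          # assumed location of canonical copies
    HOSTS = ["serverA", "serverB", "serverC"]   # assumed list of affected machines

    def push(relpath):
        src = "%s/%s" % (CENTRAL, relpath)
        for host in HOSTS:
            # rsync only transfers the file if it differs on the target
            subprocess.check_call(["rsync", "-a", src, "%s:/%s" % (host, relpath)])

    if __name__ == "__main__":
        for relpath in sys.argv[1:]:
            push(relpath)

The important property (and limitation) is that a person still has to remember to run it after changing the canonical copy; nothing happens on its own.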

(Actually now that I think about it we have a surprising amount of automatic propagation going on. It's just all in little special cases that we usually don't think about because, well, they're automated and they just work.)

I could give you a whole list of nominally good reasons why we aren't automatically propagating various things, but here's what it boils down to: if sysadmins are the only people changing whatever it is, it doesn't change very often, and it doesn't have to go everywhere, we haven't bothered to automate things because it doesn't annoy us too much to do it by hand. When one or more of those conditions changes, we almost invariably automate it away.

(That actually suggests a number of openings for a system like Puppet. For a start it can probably handle the actual propagation on command instead of having us manually copy around files.)
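For comparison, the home-grown version of "automatic propagation" usually ends up as something like the following sketch, run periodically on each machine. The central path and the file list are again assumptions made up for the example, and this is the sort of loop a tool like Puppet would take over (along with reporting and per-machine file lists):

    #!/usr/bin/env python
    # Hypothetical automatic-propagation step: run periodically (e.g. from cron)
    # on each machine, copy in any canonical file that differs locally.
    import filecmp
    import os
    import shutil

    CENTRAL = "/central/admin/configs"         # assumed central area (NFS-mounted)
    FILES = ["etc/exports", "etc/ntp.conf"]    # assumed list of managed files

    def sync():
        for rel in FILES:
            src = os.path.join(CENTRAL, rel)
            dst = "/" + rel
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=False):
                shutil.copy2(src, dst)         # overwrite only when they differ

    if __name__ == "__main__":
        sync()

Once something like that runs without anyone logging in, the files it covers have quietly moved from level 3 to level 4, which is what has already happened here for the few things (like passwords and NFS mounts) that propagate automatically.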


Comments on this page:

From 83.208.138.230 at 2013-06-20 19:06:16:

Hello Chris,

I think we are almost at the same point: automating installation and then automating some synchronization processes (firewall, Apache, etc. between HA pairs). The one thing that is problematic is development of the "gold installation standard". When I make some changes, it's sometimes more work to get all the older machines to the new standard state. Do you solve this somehow, or do the machines become singletons over time?

This could be the place for a standardized configuration management tool, but I'm holding off on bringing this to my colleagues, because normal management tasks mean that an admin should know how to do them with the management tool.

One way I'm considering is to treat the state more like code: when a machine is installed, clone the standard at that point in time (maybe versioned in some way) and then just make the required changes. When a new standard installation is released, it's on the admin to merge the new standard with the one originally used and move the machine to the newer state. That minimizes the problematic rolling standard updates on production systems.

Have a nice day, -jhr.

From 174.116.147.208 at 2013-06-21 00:51:07:

jhr wrote:

normal management tasks mean that an admin should know how to do them with the management tool.

My shop is just starting to use Puppet. From what I have seen it doesn't hide too many of the normal management tasks from an admin; at the least, Puppet doesn't need to hide these details. It can be used with many default options (that seem mostly sane). The defaults are like any RPM or deb package defaults: either the default config is good enough, or you are going to need to know what you are doing. In those cases you could use Puppet to push out your custom configuration files.

What Puppet does do is take away a lot of the drudge work that no sysadmin should need to worry about, like keeping a bunch of machines up to a standard template.

One way I'm considering is to treat the state more like code: when a machine is installed, clone the standard at that point in time (maybe versioned in some way) and then just make the required changes. When a new standard installation is released, it's on the admin to merge the new standard with the one originally used and move the machine to the newer state. That minimizes the problematic rolling standard updates on production systems.

Puppet does all this sanely. For one thing, it is code. Puppet Labs recommends that you use git to manage the configuration files. Puppet takes the guesswork out of the "required changes": it will tell the admins what needs to be changed locally and can report back to a central server. This helps to remind the lazy admin to apply the security patches, or to find that server the intern set up and forgot to document. Puppet does not need to run automagically. Admins can manually apply changes through Puppet, and they can even choose what should be applied to each server/workstation/node. It is also really easy to set up staging machines to test the configuration before applying it to production.

I assume that CFEngine and Chef are similar.

However, like any tool, you need to be careful with Puppet/CFEngine/Chef. It is easy to see every sysadmin problem as a Puppet problem. You could easily spend a lot of time writing a Puppet module to handle a corner case that you could more easily manage manually, or with a more specific tool. I don't see us managing our few Oracle DBs with Puppet. I do see us putting our base install in Puppet to install the extra packages we like, set up centralized authentication and NTP, point mail at the smart host, and configure yum and apt to use our local repositories.

From 174.116.147.208 at 2013-06-21 01:12:18:

cks: I want you to try puppet just so that I can read your rant about it.

Puppet has some "regretful" syntax. There are arbitrary design decisions that make no sense given other arbitrary design decisions. For example, some key words are pluralized, when others, used in a similar context, aren't. Some declarations are really function calls. Finally, the file system layout within a puppet module makes so little sense that there is a cheat sheet. My course instructor gleefully pointed out many of these inconsistencies.

Once you get past the foolishness it looks solid, and there is a nice community around it.
