Why I am not enthused about etckeeper and similar systems

December 8, 2009

There are a number of programs like etckeeper, systems for keeping your /etc and similar things in various forms of version control system repositories. I'm not enthused about any of them as compared to, say, keeping selected files in /etc in RCS, because I see several problems with them.

First off, you're fighting with your regular package management system; both it and your etckeeper system feel that they own files in /etc and can change them. Even if you try to integrate the two systems together, my general experience is that this is going to cause you and your system heartburn in the long run.

Next is the mixed directory problem of shoving a lot of otherwise unrelated files into a single repository. (That one's big enough that I gave it its own entry.)

Finally, there is the problem that, as far as I know, no current VCS natively preserves all of the file attributes that you want preserved for files in /etc and other system areas. You really do want the permissions, the ownership, and at least the modification timestamp preserved, and these days you may need ACLs and security labels and so on as well. While some systems will try to preserve (some of) this information, they're doing so outside of the VCS itself, and un-integrated workarounds of VCS limitations generally have their own problems.
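To make the 'outside of the VCS itself' point concrete, here is a minimal sketch of the general sort of workaround involved: walk the tree and record each file's permissions, ownership, and modification time in a separate manifest that you would then have to store and replay yourself. This is an illustration only, not how etckeeper or any particular tool does it; the function names (snapshot_metadata, changed_entries, demo) are made up for this example, skipping '.git' is just an assumed choice of VCS directory, and ACLs and security labels are not handled at all.

```python
import os
import stat
import tempfile


def snapshot_metadata(root):
    """Return {relative_path: (mode, uid, gid, mtime_ns)} for everything under root."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the VCS's own directory; '.git' is only the example used here.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat so that symlinks are not followed
            rel = os.path.relpath(path, root)
            manifest[rel] = (stat.S_IMODE(st.st_mode), st.st_uid,
                             st.st_gid, st.st_mtime_ns)
    return manifest


def changed_entries(old, new):
    """List paths present in both snapshots whose recorded attributes differ."""
    return sorted(p for p in old.keys() & new.keys() if old[p] != new[p])


def demo():
    """Change only a file's permissions and report what the manifest notices."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "example.conf")
        with open(path, "w") as f:
            f.write("setting = 1\n")
        os.chmod(path, 0o644)
        before = snapshot_metadata(root)
        os.chmod(path, 0o600)  # content is untouched, only the mode changes
        after = snapshot_metadata(root)
        return changed_entries(before, after)
```

A content-only VCS would see nothing to commit after the chmod in demo(), while the manifest comparison flags the file. The catch is that the manifest is a second source of truth that has to be kept in sync with the repository by hand or by hooks, which is where the problems tend to creep in.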

All of this seems like a lot of work and problems in exchange for what is basically a point-in-time history of /etc, with the ability to revert things under certain circumstances. (Basically, as long as there are no package updates or other changes that you would also need to roll back.)

Now, to be fair I should present the other view.

The devil's advocate view is that using a system like this is easier than using RCS on everything that you change, and it also gives you a history of what the packaging system changed. You can get around many of the file attribute problems most of the time by using a VCS that never alters the live version of the file unless you explicitly tell it to; then the workarounds are only necessary if you have to revert to an older version of a file, and you can assume that that's a rare event.


Comments on this page:

From 76.121.86.209 at 2009-12-08 04:16:34:

When has etckeeper changed anything on you? AFAIK if you are using etckeeper with git as a backend, the only things etckeeper would ever touch are those under /etc/.git and the file /etc/.etckeeper. I have never seen etckeeper make any changes anywhere else. I haven't used any of the other VCS backends etckeeper supports, though.

From 143.48.3.13 at 2009-12-08 12:47:44:

This view may absolutely work for your environment, or other environments where you have single administrators in charge of making changes to swaths of systems. However, in the general case, I believe that you are 100% dead wrong.

I'm going to reorder the parts I'm quoting a little bit, since some points build on other points.

"Next is the mixed directory problem of shoving a lot of otherwise unrelated files into a single repository. (That one's big enough that I gave it its own entry.)"

You're free to use subprojects/submodules however you like, but for the most part, this is how software projects operate. The Linux kernel has a virtual filesystem module and a device driver for a video adapter in the same tree. It doesn't seem to cause a ton of problems for many people. As the Linux kernel is a single project that performs multiple discrete functions, so too is the software stack you manage on your systems, which provides a set of services to end-users. The bigger issue is having the diligence to group your commits coherently and logically, so you don't end up with changes to all of these different system areas grouped into a single changeset.

"All of this seems like a lot of work and problems in exchange for what is basically a point in time history of /etc, with the ability to revert things under certain circumstances. (Basically, as long as there are not package updates or changes that you would also need to roll back.)"

You're completely understating (or outright missing) the value of this; I imagine that you come from the kind of environment where change control/change management is neither viewed as beneficial nor necessary. When you have multiple administrators working on a system, particularly when some of them are server owners unaffiliated with IT who demand root-level access to the system because they bought it with their own grant money, it's extremely valuable to be able to audit changes and present them in some kind of coherent view. Repository browsers like Redmine that can aggregate changesets from across multiple subprojects make it very easy to get an at-a-glance view of exactly what changed, when, and (provided your administrators use good commit messages) why.

Being able to easily view what changed is precisely the value of using a revision control system in the first place. If you're not going to leverage that functionality, and all you need the revision control system for is to act as a very poor backup system, you might as well just stick with your existing backup system and skip the extra set of commands for your administrators to learn.

The same applies to nearly any use of revision control, whether in system configurations or in software code. Revision control systems are useful for two things: branching and merging change sets, and figuring out where your regressions came from. Often, only one of these is useful in a sysadmin context. To summarize: keeping a version history of your files so you can restore old versions isn't revision control, it's keeping backups.

"Finally is the problem that as far as I know, no current VCS natively preserves all of the file attributes that you want preserved for files in /etc and other system areas. You really do want the permissions, the ownership, and at least the modification timestamp preserved, and these days you may need ACLs and security labels and so on as well. While some systems will try to preserve (some of) this information, they're doing so outside of the VCS itself, and un-integrated workarounds of VCS limitations generally have their own problems."

They certainly do have their own problems, but it's nothing compared to the problem of not having that information at all. Consider again that the major role of a revision control system in a sysadmin context is change management. If an administrator needs to be constantly rolling back their system configurations from revision control, there is something very, very wrong with their QA and release engineering processes. In these instances, they don't need a revision control system that preserves every tidbit of file metadata; they need to stop pushing untested changes to production systems.

"The devil's advocate view is that using a system like this is easier than using RCS on everything that you change, and it also gives you a history of what the packaging system changed. You can get around many of the file attribute problems most of the time by using a VCS that never alters the live version of the file unless you explicitly tell it to; then the workarounds are only necessary if you have to revert to an older version of a file, and you can assume that that's a rare event."

I have never, ever had a revision control system (CVS, SVN, Mercurial or Git) alter a file unless I was explicitly updating my working copy from an upstream repository or rolling back a file to a prior revision. Can you clarify?

--Jeff

By cks at 2010-09-12 00:47:04:

Whoops, department of very belated replies (I thought I'd written this before but, um, nope):

RCS alters files when you commit them, or at least usually does in default configurations (it deletes the file then recreates it). Modern version control systems do not. This difference catches me out every so often, such as here, because I've used RCS long enough that its behavior is just ingrained in my mind as 'just the way things are'.
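As a small illustration of why 'deletes the file then recreates it' matters for file attributes, here is a sketch that removes a file and writes the same content back, then compares the permissions before and after. To be clear, this is not RCS's actual code and makes no claim about exactly what RCS restores; recreate and demo are hypothetical names, the sketch assumes a POSIX system, and the umask is pinned to 022 only so that the outcome is predictable.

```python
import os
import stat
import tempfile


def recreate(path):
    """Delete a file and write it back with identical content."""
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    with open(path, "wb") as f:  # a brand new file, created under the current umask
        f.write(data)


def demo():
    """Return the permission bits (before, after) a delete-and-recreate cycle."""
    old_umask = os.umask(0o022)  # pinned so the recreated mode is predictable
    try:
        with tempfile.TemporaryDirectory() as root:
            path = os.path.join(root, "secret.conf")
            with open(path, "w") as f:
                f.write("password = example\n")
            os.chmod(path, 0o600)  # deliberately restrictive, as for a secrets file
            before = stat.S_IMODE(os.stat(path).st_mode)
            recreate(path)
            after = stat.S_IMODE(os.stat(path).st_mode)
            return before, after
    finally:
        os.umask(old_umask)
```

Under these assumptions the file starts out as mode 0600 and comes back as 0644: the content is identical but the restrictive permissions are gone. Ownership and timestamps can be lost in the same way if the process doing the recreating is not the original owner, which is why a tool that never touches the live file on commit sidesteps most of the issue.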

Written on 08 December 2009.

Last modified: Tue Dec 8 01:08:18 2009