Modern version control systems change your directory layouts

November 27, 2009

We're in the process of a slow conversion from maintaining files with RCS to mostly keeping them in Mercurial repositories, which has wound up making me think about how we want to structure them and which of our existing directory areas are easy to convert. As a result of this, I have an observation: modern whole-directory VCSes change how sysadmins want to lay out files and directories.

In the old days of single-file version control, it made perfect sense to group files together in directories by function or system, regardless of how they were generated or updated. For example, you might put all of your local MTA-related files in a single directory, with some of them automatically generated and some of them hand-edited. You might even mix in binaries used by scripts and so on.

(The automatically generated files might or might not be maintained under RCS; the hand-edited ones definitely would be.)

In a whole-directory VCS, this is a bad idea. You want all of the files that aren't under version control to live in a completely separate directory outside the repository; the repository should hold only 'source', not 'compiled things'. Otherwise, at a minimum you're going to wind up with a bunch of file exclusions (and it's my opinion that file exclusions are fragile).

(Unlike with RCS, you want to do this even if you keep the automatically generated files under version control, since an automatic checkin of new versions of the generated files could easily collide with other work you're also doing in the repository.)

There are two approaches for dealing with this that I can think of offhand. You can have your overall configuration know that some files are found in one directory and other files in another, or you can 'publish' files from the version control repository to the directory that has the automatically generated files. Of these, I prefer the two-directory approach on the grounds that it's more robust (you don't have to remember an extra magic step to make changes go live).
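As a rough illustration of the two-directory approach, here is a minimal Python sketch of a lookup that checks the repository directory first and the generated-files directory second. The directory paths and the `find_config` name are made up for this example; a real MTA or daemon would have its own way of being told about several directories.

```python
from pathlib import Path

# Hypothetical locations, invented for this sketch.
REPO_DIR = Path("/etc/mta/src")        # hand-edited files, inside the repository
GENERATED_DIR = Path("/var/lib/mta")   # generated files, outside the repository


def find_config(name, search_dirs=(REPO_DIR, GENERATED_DIR)):
    """Return the path of the first regular file called `name`.

    Directories are searched in order, so a hand-edited file shadows
    a generated file with the same name.
    """
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name} not found in {[str(d) for d in search_dirs]}")
```

Because the lookup reads the repository directory as it is on disk, an edit takes effect as soon as the file is saved, with no separate step to remember.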

(By the way, if you are building such a system you almost certainly don't want to require changes to be committed to the repository in order to be activated. Doing 'check out current version' or the like is tempting, but it's going to make testing much harder.)
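If you do go with a 'publish' step, one way to keep testing easy is to copy from the working directory as it currently stands instead of from a checkout of the committed version. The following is only a sketch under that assumption; the `publish` function, its arguments, and the list of metadata directories to skip are all invented for illustration.

```python
import shutil
from pathlib import Path

# VCS metadata directories that should never be copied to the live directory.
VCS_DIRS = {".hg", ".git", "RCS"}


def publish(working_dir, live_dir):
    """Copy every file under working_dir into live_dir.

    Files are read from the working directory as it is right now,
    committed or not. Files that exist only in live_dir (for example
    automatically generated ones) are left untouched.
    Returns the relative paths that were copied.
    """
    working_dir, live_dir = Path(working_dir), Path(live_dir)
    copied = []
    for src in sorted(working_dir.rglob("*")):
        rel = src.relative_to(working_dir)
        if VCS_DIRS.intersection(rel.parts) or not src.is_file():
            continue
        dest = live_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves mode and timestamps
        copied.append(rel.as_posix())
    return copied
```

This deliberately never deletes anything from the live directory, which is what lets generated files sit there safely; the cost is that a file removed from the repository stays live until someone removes it by hand.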

Comments on this page:

From at 2009-11-27 16:38:30:

In our environment, we have historically preferred the extra step to make changes live. So our configuration files (some hand-edited, some script-generated) all live in the SCM in a working directory which is not the deploy director[y|ies]. Then the changes are made live (historically with a Makefile which rsyncs or scps the files out of the working directory to the deploy directory, optionally with a remote kill -HUP to the daemons which need it) and validated before being committed to the SCM repository.

In general this state arrived organically, but it has a few nice qualities. Everything is in one place (in a tree structure in SCM) for easy greps, and we can push out the same config to multiple redundant systems from one point. We can also deploy from the working space to systems where we might not want SCM binaries or information, whether because of paranoia or because those systems can't execute the binaries or reach the repository.

We can also, and have needed to, revert the changes and immediately repush the last known good configuration. That doesn't suck.


From at 2009-11-29 00:40:43:

In my configuration, we actually have Git managing /, but .gitignore and some associated scripts are specially tuned to only pay attention to a certain subset of directories like /etc, /var/named, /var/ossec/etc, /usr/local/etc and so forth. This keeps everything in a single repository, while giving us the flexibility to do a lot of other stuff with it.

For a lot of reasons, this approach didn't work with Mercurial, so your mileage may vary.


Written on 27 November 2009.


Last modified: Fri Nov 27 01:03:33 2009