Keyword expansion: why RCS rewrites files when you commit them

September 12, 2010

Given my previous entry, you might wonder why RCS does such a crazy thing as deleting and recreating files when you merely make a commit (in RCS terms, a checkin), when modern version control systems don't do this. The answer turns out to be pretty simple: keyword expansion. Specifically, RCS supports keyword expansion and modern version control systems don't.

RCS has a feature where you can embed magic keywords in your source code that are automatically expanded to various pieces of information when RCS checks out your file. The classic example is identifying information about the file's version; the traditional use is to embed this in C source as a static string, so that it appears in the compiled binary and can later be extracted to determine which source version some random binary was built from.

(Assuming, that is, that the binary was built from unedited source files, with no uncommitted edits. You may be starting to see the problems here.)
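To make the mechanism concrete, here is a minimal Python sketch of what expansion does to a file's text. This is an illustration, not RCS's actual implementation: the `expand_keywords` function, the short keyword list, and the sample values (including the `someuser` author) are all made up for the example. It only assumes the usual marker forms, an unexpanded `$Keyword$` and an expanded `$Keyword: value $`.

```python
import re

# A few of the keywords RCS recognises; the real list is longer.
KEYWORDS = ("Id", "Revision", "Date", "Author")

# Matches both the unexpanded form "$Id$" and an already expanded "$Id: ... $".
KEYWORD_RE = re.compile(r"\$(" + "|".join(KEYWORDS) + r")(?::[^$\n]*)?\$")


def expand_keywords(text, values):
    """Rewrite each keyword marker as '$Keyword: value $' (hypothetical helper)."""
    def repl(match):
        name = match.group(1)
        if name not in values:
            # No value known for this keyword; leave the marker untouched.
            return match.group(0)
        return "$%s: %s $" % (name, values[name])
    return KEYWORD_RE.sub(repl, text)


# The traditional C idiom: a static string that ends up in the binary.
source = 'static char rcsid[] = "$Id$";\n'

# Hypothetical values for two successive checkins of the same file.
first = expand_keywords(
    source, {"Id": "demo.c,v 1.1 2010/09/11 20:00:00 someuser Exp"})
second = expand_keywords(
    first, {"Id": "demo.c,v 1.2 2010/09/12 01:25:44 someuser Exp"})

print(first, end="")
print(second, end="")
```

The point of the second call is that an already expanded marker gets rewritten again with the new revision's values, so the text of the file changes on every checkin even when you did not touch that line.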

This sounds like a very convenient feature, but it has a cost; it means that the file's proper contents change every time you commit it. Thus, every time you commit a file you (may) need to update the checked out version in order to make its keyword expansions have their proper value. Which means deleting and recreating files a lot, at least if you are RCS.

(In theory RCS could notice that the committed file does not have any keywords to expand and thus doesn't need to be recreated, which is the usual case these days. In practice it is not that smart.)
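The check that RCS could do in theory is cheap to sketch. The following Python is only an illustration of the idea, under the assumption that keywords always appear as `$Keyword$` or `$Keyword: value $`; the `has_keywords` helper is hypothetical and is not something RCS provides.

```python
import re

# Keyword names as documented for RCS; the pattern accepts both the
# unexpanded "$Id$" form and the expanded "$Id: ... $" form.
KEYWORD_RE = re.compile(
    r"\$(?:Author|Date|Header|Id|Locker|Log|Name|RCSfile|Revision|Source|State)"
    r"(?::[^$\n]*)?\$"
)


def has_keywords(text):
    """Return True if the text contains at least one keyword marker."""
    return KEYWORD_RE.search(text) is not None


with_keyword = 'static char rcsid[] = "$Id$";\n'
without_keyword = 'printf("that costs $5 or $10\\n");\n'

# Only the first file would need to be rewritten after a checkin.
print(has_keywords(with_keyword), has_keywords(without_keyword))
```

A file for which this returns false has nothing to expand, so its checked out copy is already correct after the checkin and could be left alone.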

This also complicates various other aspects of RCS; the obvious one is checking for differences between live files and the repository. Such a check needs to scan the live file for keywords and de-expand them in order to avoid spurious differences, which naturally slows down and complicates the process.
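As a rough sketch of that de-expansion step, the Python below collapses every expanded keyword back to its bare form before comparing two versions. It is an assumption about the general approach, not RCS's real code; `collapse_keywords` and `really_differs` are hypothetical names, and `$Log$` is deliberately left out because its expansion spans multiple lines and needs more care than a one-line pattern.

```python
import re

# Single-line keywords only; "$Log$" is intentionally not handled here.
KEYWORD_RE = re.compile(
    r"\$(Author|Date|Header|Id|Locker|Name|RCSfile|Revision|Source|State)"
    r"(?::[^$\n]*)?\$"
)


def collapse_keywords(text):
    """Turn '$Keyword: value $' back into '$Keyword$' (hypothetical helper)."""
    return KEYWORD_RE.sub(lambda m: "$" + m.group(1) + "$", text)


def really_differs(working_text, repo_text):
    """Compare two versions while ignoring differences in keyword values."""
    return collapse_keywords(working_text) != collapse_keywords(repo_text)


repo = '/* $Revision: 1.1 $ */\nint answer = 42;\n'
same_code = '/* $Revision: 1.2 $ */\nint answer = 42;\n'
edited = '/* $Revision: 1.2 $ */\nint answer = 43;\n'

print(really_differs(same_code, repo))  # only the keyword value changed
print(really_differs(edited, repo))     # the code itself changed
```

Every comparison pays for this extra pass over both texts, which is the slowdown and complication in question.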

These complications are a large part of why modern version control systems have by and large strongly rejected keyword expansion (cf. Mercurial or Git). Not having keyword expansion is occasionally inconvenient for users, but it makes modern VCSes significantly faster and more attractive.

Written on 12 September 2010.

Last modified: Sun Sep 12 01:25:44 2010