Wandering Thoughts archives

2009-11-23

Converting a directory from RCS to Mercurial

Suppose that you have a directory full of configuration files that have been there for so long that they're still being maintained with RCS. Further suppose that you would like to change to a modern version control system, say Mercurial, but that you would like to preserve all of your old version history.

Mercurial has no direct support for converting RCS files, but there's a magic trick: a CVS repository is nothing more than a bunch of RCS files in a directory hierarchy plus a thin layer of easily created metadata, and a lot of things (Mercurial included) can convert CVS repositories. So we first make a CVS repository version of our directory, and then convert that repository to Mercurial.

Before you start, you need to clean up your current data by making sure that everything you want to have included in the new repository is under RCS, and that you don't have any lingering RCS ,v files for files that you've taken out of service. If you do have old ,v files and want to preserve their history in the new repository, you'll need to remember to tell Mercurial (or your VCS of choice) that they're deleted after you finish the repository conversion.

(It's relatively common for us to remove the checked out version of a file but keep the ,v file both just in case and for historical purposes. You may be different.)
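A rough way to check both conditions is sketched below. This is only a sketch: audit_rcs is a name I made up for illustration, and it assumes the usual layout where every ,v file lives in an RCS subdirectory next to its working file.

```shell
#!/bin/sh
# Hypothetical audit helper (not part of RCS): report working files that
# have no RCS/<name>,v file, and RCS/<name>,v files that have no working file.
# Assumes ,v files live in RCS subdirectories beside their working files.
audit_rcs() {
    top=$1
    # Working files with no matching ,v file (RCS directories are skipped).
    find "$top" -type d -name RCS -prune -o -type f -print |
    while IFS= read -r f; do
        d=$(dirname "$f"); b=$(basename "$f")
        [ -f "$d/RCS/$b,v" ] || echo "not in RCS: $f"
    done
    # ,v files whose checked-out working file is gone.
    find "$top" -type f -name '*,v' -path '*/RCS/*' |
    while IFS= read -r v; do
        d=$(dirname "$(dirname "$v")"); b=$(basename "$v" ,v)
        [ -f "$d/$b" ] || echo "no working file: $v"
    done
}
# Usage: audit_rcs nsdata
```

Anything reported as 'not in RCS' needs to be checked in (or deliberately left out) before you start; anything reported as 'no working file' is a retired file that you either remove or remember to delete in the new repository afterwards.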

Using the example of a directory (or directory hierarchy) called nsdata, here are the steps, in two parts. We'll work in /tmp/, for convenience.

(As always, I must note appropriate disclaimers. You should always carefully test both procedures and end results, and while this has worked for us, I can't promise that it will work for you.)

Creating a CVS repository version of your RCS-controlled directory

  1. Create an empty CVS repository to get the CVS metadata:
    cvs -d /tmp/scratch-CVS init

  2. Put a copy of the nsdata directory into /tmp/scratch-CVS/nsdata with the tool of your choice (I used rsync, because I use rsync for everything like this). In CVS terminology, this creates a repository module called 'nsdata'.

  3. Turn it into a correctly laid out CVS repository. You've probably got all of your RCS ,v files in RCS subdirectories, but CVS puts them directly in the directory that the checked-out file goes in. So you need to move all of the ,v files up one directory level, out of their RCS subdirectories:
    cd /tmp/scratch-CVS/
    find nsdata -type d -name RCS -prune | while IFS= read -r r; do mv -i "$r"/* "$r/.."; rmdir "$r"; done

  4. Create a checked out version of the CVS repository:
    mkdir /tmp/scratch-CO; cd /tmp/scratch-CO
    cvs -d /tmp/scratch-CVS co nsdata

    This is where the CVS module terminology becomes important; you are checking out the 'nsdata' module from your CVS repository, which creates a /tmp/scratch-CO/nsdata directory hierarchy.

You should be able to diff -r this checked out CVS module against your current directory and not see any significant differences. (Your checked-out version will have CVS directories and not have RCS ones.)
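One way to do that comparison without drowning in version control noise is sketched here. compare_trees is a hypothetical helper name, /path/to/nsdata is a placeholder for wherever your live directory is, and the -x option is supported by GNU and BSD diff but is not part of POSIX.

```shell
#!/bin/sh
# Hypothetical helper: recursively compare two trees while skipping the
# RCS and CVS bookkeeping directories, so that only real differences show.
# Assumes a diff that supports -x (GNU and BSD diff do; POSIX diff does not).
compare_trees() {
    diff -r -x RCS -x CVS "$1" "$2"
}
# Usage; /path/to/nsdata stands in for your live directory:
#   compare_trees /path/to/nsdata /tmp/scratch-CO/nsdata
```

If you kept ,v files for retired files, expect those files to show up as 'Only in' the checked-out copy; those are the ones you will want to mark as deleted later.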

If you prefer something besides Mercurial, you can now use the CVS-to-whatever tool of your choice. The rest of this entry is specific to the CVS-to-Mercurial conversion process.

Converting a CVS repository into a Mercurial one

Unfortunately, you're going to want to do the conversion with the latest version of Mercurial (version 1.4 as of this writing), which may mean that you need to build it yourself. Old versions of Mercurial do a worse job of the conversion, and sufficiently old ones don't do it correctly at all. Once you've converted the repository, you can use the normal system version of Mercurial to work on it.

So, the steps:

  1. optionally, go through your RCS history to find out all of the Unix userids that have made RCS checkins, and create a file that maps from the Unix userid to something more conventional for Mercurial, such as an email address. See Mercurial's 'hg help convert' for information about the format of this file; let us assume that it is /tmp/authormap.

  2. create a Mercurial version of your CVS repository:
    hg convert --authors /tmp/authormap --datesort /tmp/scratch-CO/nsdata /tmp/nsdata-hg

    Some Mercurial documentation recommends avoiding --datesort. This is wrong for our particular case; here, your changesets really are in strictly chronological order, and you want the converted repository to reflect this.

    If you are doing the conversion with a self-built copy of the latest Mercurial on Ubuntu 8.04 LTS or any other system which has a pre-1.1 version of Mercurial, you will need to add an extra argument so that you can use the system version of Mercurial on the repository:

    hg --config 'format.usefncache=0' convert ...

    (See here for a discussion of this.)

    On Ubuntu 8.04 LTS you definitely want to use the latest Mercurial to do the conversion; Mercurial 0.9.5 has a bug that will give you incorrect file contents (reversing some changes) under some circumstances.

  3. clean up the repository and check out the current versions of all files:
    cd /tmp/nsdata-hg
    hg purge; hg update

(If you did the conversion with a sufficiently modern version of Mercurial, you don't need the 'hg purge'.)
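For step 1, a skeleton author map can be generated from rlog output instead of being written by hand. The following is a sketch under assumptions: authors_from_rlog is a made-up name, example.com is a placeholder domain, and the sed pattern assumes rlog's usual 'date: ...;  author: NAME;' revision lines. Each output line uses the 'source-author=replacement' form that the author map file expects.

```shell
#!/bin/sh
# Hypothetical helper: read rlog output on stdin and print one skeleton
# author map line per distinct RCS author, as 'userid=userid <userid@domain>'.
# example.com is a placeholder; edit the result by hand before using it.
authors_from_rlog() {
    # Pull the author field out of each 'date: ...;  author: NAME;' line.
    sed -n 's/^date: [^;]*;  *author: \([^;]*\);.*/\1/p' |
    sort -u |
    while IFS= read -r u; do
        printf '%s=%s <%s@example.com>\n' "$u" "$u" "$u"
    done
}
# Usage (needs the RCS tools installed), run before the ,v files are moved
# or against the scratch CVS repository:
#   find nsdata -name '*,v' -exec rlog {} + | authors_from_rlog > /tmp/authormap
```

The generated names and addresses are only stand-ins; the point is to get a complete list of userids so that none of them are missed in the map.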

The end result of this is a new Mercurial repository in /tmp/nsdata-hg with the full history and the current version of all files in the repository checked out. You should be able to diff -r this against the current directory of configuration files and see no important differences. (The Mercurial repository will have a .hg directory and not have RCS directories.)

My experience is that the history of the Mercurial repository will show at least some multi-file changesets, although it doesn't seem to capture all of them. I choose to view this as an improvement over having all changes be single-file changes, even if it's not perfect.

(Presumably the conversion process (or CVS) uses various heuristics to decide when changes to multiple files more or less at once actually are a single changeset.)

Sidebar: resources and credits

I didn't come up with this on my own; a number of web pages provided very valuable information and pointers.

sysadmin/RCStoMercurial written at 23:39:46

My current unhappy thoughts on Fedora 12

Right now, I have two machines (a 64-bit desktop and a 32-bit laptop) at Fedora 11 and one (another 64-bit desktop) that's still back at Fedora 8. Upgrading to Fedora 12 soon is the obvious thing to do, since there are drawbacks to waiting too long to upgrade (although this is not an issue if you use PreUpgrade or yum-based upgrades).

Except, well, I haven't been having the best of luck with Fedora 11. On my desktop, Flash has been broken for all of Fedora 11 (and the free alternatives don't work for me), and then sound stopped really working in the 2.6.30 kernels. On my laptop, wireless stopped working with the 2.6.30 kernels (I'm detecting a trend). It doesn't seem likely that upgrading to Fedora 12 will fix those problems, especially the kernel related ones.

(I have not bothered filing a bug for the sound issue, because my impression is that sound is a huge mess in Fedora right now and worse, it is partly a political mess. I've certainly seen Fedora bugs of 'my sound card stopped working' be answered with replies that boil down to 'well, that's what you get for buying a sound card from people who aren't open source friendly'.)

I would like to upgrade to Fedora 12; I generally like getting the new stuff (although not always), and it avoids various future issues. But upgrading doesn't seem like a wise decision right now, and I'm not convinced that it ever will be; I have no confidence that any of my issues will get solved over the lifetime of Fedora 12.

(My cynicism suggests that things that stay broken in kernels for more than a relatively short amount of time stay broken for good, because no one cares enough to try to fix them. I can't blame the kernel hackers; I certainly don't have enough energy to try to build stock kernels and git bisect my way to the changeset that broke wireless. Not on what is an old and slow system that, for now, works fine when I stick to an older Fedora 11 kernel.)

What this really leaves me nervous about is the further future. If these issues aren't fixed in Fedora 12, will they be fixed in Fedora 13, Fedora 14, and so on? The odds seem against this, and there's only so long I can run Fedora 10 and Fedora 11.

Sidebar: dealing with my Fedora 8 machine

This means that I should bite the bullet and do the odd thing of upgrading the Fedora 8 machine to Fedora 10. Yes, it's just about to go out of support, but I don't really have a choice; it's the most recent Fedora where Flash worked for me in a 64-bit environment.

linux/ConsideringFedora12 written at 00:47:08

