Another reason why version control systems should support history rewriting

July 28, 2011

In Wait, Not That Bit!, Greg Wilson writes about the problem of making a bunch of unrelated changes to a single file and then having to commit them all at once with a big bullet list of changes. The rest of the entry and its comments discuss how to split existing changes into multiple commits and how to test the resulting separate commits.

However, I'd like to note that there is a fundamental conflict inherent in this workflow. We want VCS commits to be very easy and lightweight so that developers will actually do them (and do them frequently), instead of developing and checkpointing things outside the VCS because it's more convenient. At the same time we want each VCS commit to be for a single separate change, and we want the change (and the commit) to pass tests. These goals are in conflict, and the discussion in Greg Wilson's entry is one sign of it; all of the proposed solutions involve a developer who has a finished chunk of code going through more work before they can capture it in their VCS. What happened to 'commit early, commit often'?

(Among other things, my strong opinion is that as a developer I want to be able to snapshot the code the moment that it actually works. Working code is precious and fragile. Changing the code without a snapshot after it works is an invitation to accidents, mistakes, damaged code and heartburn.)

The reality is that you want to do two sorts of commits here; as you develop you want to capture state (especially 'okay, this code works, make sure I don't lose it'), and once you're done you want to capture distinct changes. Or rather, once you're done you want to turn your captured state into a series of separate changes (and test them one by one). This sounds more or less exactly like 'rewriting history'; you start out with a history that is a series of state snapshots (the degenerate case is a single snapshot) and rewrite it into a history of separate changes. Then you publish only the second, proper history.
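(To make the shape of this rewriting concrete, here is a toy sketch in Python. It is not how any real VCS does it; the names `rewrite_history`, `plan` and `passes_tests` are all made up for illustration, and it works on whole files where real tools would let you split changes within a single file. The idea is simply that the state snapshots collapse down to their final state, which is then replayed as one commit per distinct change, with each intermediate result tested.)

```python
from typing import Callable, Dict, List, Tuple

Tree = Dict[str, str]       # path -> file contents (toy stand-in for a checkout)
Commit = Tuple[str, Tree]   # (message, full tree after the commit)


def rewrite_history(
    base: Tree,
    snapshots: List[Tree],
    plan: List[Tuple[str, List[str]]],
    passes_tests: Callable[[Tree], bool],
) -> List[Commit]:
    """Turn work-in-progress snapshots into one commit per planned change.

    Hypothetical helper: 'plan' is the developer's own list of
    (commit message, paths belonging to that change).
    """
    final = snapshots[-1]   # only the end state feeds the published history
    tree = dict(base)
    published: List[Commit] = []
    for message, paths in plan:
        for path in paths:
            if path in final:
                tree[path] = final[path]
            else:
                tree.pop(path, None)   # file was deleted during development
        # Each separate change has to stand on its own.
        if not passes_tests(tree):
            raise ValueError(f"change {message!r} does not pass tests on its own")
        published.append((message, dict(tree)))
    if tree != final:
        raise ValueError("plan does not cover every change in the final snapshot")
    return published


# Two state snapshots taken during development...
base = {"parser.py": "v1", "cli.py": "v1"}
snapshots = [
    {"parser.py": "v2-wip", "cli.py": "v1"},
    {"parser.py": "v2", "cli.py": "v2", "README": "docs"},
]
# ...rewritten into two separate changes for publication.
plan = [
    ("Fix parser bug", ["parser.py"]),
    ("Add CLI flag and document it", ["cli.py", "README"]),
]
history = rewrite_history(base, snapshots, plan, lambda tree: True)
```

(Here the `passes_tests` stand-in always succeeds; in real life it would run your test suite against each intermediate tree. Note that the 'v2-wip' snapshot never appears in the published history, which is the whole point.)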

Once you're going to be rewriting history, it should be supported in your VCS, for reasons that I've written about before.

(You can try to argue that your VCS should be used only for the final commits and the state snapshots should be handled through some other mechanism or program. I feel that this is a mistake; among other things, this other mechanism is effectively a version control system itself and that means that someone had to write it, duplicating much or all of the work of writing your main VCS.)


Last modified: Thu Jul 28 02:01:50 2011