2012-06-03
Another view of the merge versus rebase debate in version control
One of the persistent debates in modern version control is between merging changes and rebasing them, with the customary pro-merge argument being that rebasing destroys history. From the developer's view this is completely correct; rebasing destroys the history of the changes that are being rebased, causing them to spring into existence fully-formed.
But there's another view you can take on this: that of a user of the repository, someone (or something) who is pulling and tracking the tip/head of the mainline. From this user's view, the history that merges preserve effectively doesn't exist. Before the merge commit it wasn't there at all, and when the merge commit was made all of the changes appeared instantly from nowhere (and they appeared as an indivisible whole; as you step back and forth through the history of the mainline tip, you either have all of the merge or none of it).
The important thing to understand is that this 'user' perspective always sees a linear history of the repository (unless you're doing something very unusual with head/tip), in that every time they update, the head marches forward along a continuous unbranching line of development. This means that from the user perspective, a merge is functionally equivalent to a single giant rebase commit; both have the same net effect of causing a large indivisible block of changes to show up all at once.
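To make the 'user' view concrete, here is a minimal Python sketch. It assumes a toy model of history in which each commit just lists its parents and the first parent is always the previous mainline tip (roughly the line that `git log --first-parent` follows). The commit names are made up for illustration; this is not how any real VCS stores things.

```python
# Toy commit graphs: commit name -> list of parents (first parent = mainline).
# All commit names are hypothetical.
merged = {
    "M1": [],
    "M2": ["M1"],
    "F1": ["M1"],            # feature branch forks from M1
    "F2": ["F1"],
    "F3": ["F2"],
    "merge": ["M2", "F3"],   # merge commit: mainline first, branch second
}

rebased = {
    "M1": [],
    "M2": ["M1"],
    "F1'": ["M2"],           # the same three changes replayed on top of M2
    "F2'": ["F1'"],
    "F3'": ["F2'"],
}

def mainline(graph, tip):
    """Follow first parents from tip back to the root; return oldest first."""
    path = []
    commit = tip
    while commit is not None:
        path.append(commit)
        parents = graph[commit]
        commit = parents[0] if parents else None
    return list(reversed(path))

merged_view = mainline(merged, "merge")
rebased_view = mainline(rebased, "F3'")
print(merged_view)    # the branch commits never show up as separate steps
print(rebased_view)   # each change is its own step along the line
```

In this model the user tracking the merged repository steps from `M2` straight to `merge`, while the user tracking the rebased one can stop at each replayed change along the way.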
(A merge is theoretically superior to a single giant rebase in that you can perhaps bisect back through a merge to find a specific broken change. But you should never do single giant rebases; you should rebase with a sequence of single-change commits, which makes bisect even simpler and more likely to actually work in practice.)
Thus I think it's clear that from the user view the best thing is a series of rebased commits. Although they may appear all at once, they look as much as possible like separate changes, and because they are separate commits you can still look at the changes individually and step through them one by one if you need to (and similarly, bisect through them).
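As a rough illustration of why separate commits on the line help bisection, here is a small Python sketch of a binary search over a linear history. It assumes the usual bisect conditions (the first commit is good, the last is bad, and once things break they stay broken); the histories and the choice of which commit is 'bad' are hypothetical.

```python
def first_bad(commits, is_bad):
    """Binary search a linear history for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    Returns (first bad commit, number of commits tested).
    """
    lo, hi = 0, len(commits) - 1   # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests

def bad_from(history, name):
    """Build a test that reports 'bad' for name and everything after it."""
    first = history.index(name)
    def is_bad(commit):
        return history.index(commit) >= first
    return is_bad

# The mainline as a user sees it in each case (hypothetical commit names).
rebased_view = ["M1", "M2", "F1'", "F2'", "F3'"]
merged_view = ["M1", "M2", "merge"]

# Pretend the bug arrived with the second feature change.
rebased_result = first_bad(rebased_view, bad_from(rebased_view, "F2'"))
merged_result = first_bad(merged_view, bad_from(merged_view, "merge"))
print(rebased_result)   # points at the individual change
print(merged_result)    # can only point at the merge as a whole
```

Staying on the mainline, the rebased history lets the search land on the specific change, while the merged history can only tell you that the problem came in somewhere inside the merge.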
Sidebar: a general problem with merges and bisecting
There are a number of practical problems with bisecting through merges, but a general one is that once you start bisecting into a merge you are no longer working in the mainline code. From the user view this means that you are in completely unfamiliar territory where even if the code theoretically works it may lack changes, bugfixes, or features from the mainline that you need.
(The more long-lived and isolated the branch was, the more you may run into this.)
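The 'unfamiliar territory' problem can be sketched with the same sort of toy model. This Python fragment assumes a hypothetical history where `M2` is a mainline bugfix made after the branch forked; it simply computes what each commit contains by walking all of its parents.

```python
# Toy commit graph: commit name -> list of parents. Names are hypothetical.
graph = {
    "M1": [],
    "M2": ["M1"],            # pretend this is a mainline bugfix you rely on
    "F1": ["M1"],            # branch forked before M2 existed
    "F2": ["F1"],
    "merge": ["M2", "F2"],
}

def ancestors(graph, commit):
    """Return the set containing commit and everything reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

# What you would be missing if a bisect dropped you onto F1 inside the merge.
missing_on_branch = ancestors(graph, "M2") - ancestors(graph, "F1")
print(missing_on_branch)
```

Here a bisect that lands on `F1` gives you a tree without `M2`, even though `M2` was already on the mainline before the merge happened.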
My view on Mercurial versus Git
I've kind of alluded to my views in passing before but since I've already written a certain amount on these two systems (and a chunk of it sort of in favour of Mercurial) I feel like writing about this explicitly, just to be clear.
(You should insert implicit 'in my view' disclaimers in the following if desired.)
For my own use, Mercurial is easier to start with and use simply, more user friendly, and more 'humane' (in that in general it works more how people expect). However, Git is technically better, more powerful and complex, and is more willing to be pragmatic and useful. Mercurial people (at least in my perceptions) are still somewhat tied to the 'proper' VCS way to do things; Git people are much more flexible and willing to compromise in order to do the 'impure' but right thing (the primary exhibit of this is git rebase). Git's drawback is that it has far more exposed complexity than Mercurial does; you cannot really learn or use Git without understanding things like the index, the (abstracted) repository format, and so on. But once you do, the good news is that everything makes sense.
(Saying that Git is technically better may irritate people, but I do feel that it's true. Over and over I've found myself persuaded that Git makes the right fundamental choices on things like repository formats, whether or not to try to track file renames, and so on. I don't currently know enough to have an opinion in the great debate over Mercurial branches versus Git branches.)
The net result is that these days I like Git more and it's what I'm focusing on. Mercurial is okay and I know enough about it to get by, but I would rather use Git for future repos and spend my time learning more about it. This marks a change of my mind from how I was a few years ago (when I found Git intimidating and Mercurial nicely easy), but I figure I'm allowed to do that. I know that there will be a learning curve and some frustrations in using Git, but I'm okay with that; I think that it will be worth it in the end.
(Things for work will continue to be in Mercurial repos, because that's our standard for good reasons.)
As a side note, I would say that Git's flaw is that it has never been willing to compromise or hide its complexity in order to present people with an interface that feels simple and natural. There have been attempts to do so every so often, but as far as I can tell they've never really caught on (and I don't think they've had enabling support from the Git core). The result is a powerful but complex and deep interface that doesn't necessarily operate the way that people start out expecting. This is why I say that Mercurial is more humane than Git; Mercurial has made an effort to have its interface operate in a simple and natural way, even if that means hiding a certain amount of power from people or not offering it at all.
Sidebar: the pragmatic perspective
On a pragmatic basis Git has won. I say this for one simple reason: GitHub. If you work with open source projects (even just using them) you will sooner or later wind up dealing with GitHub. And if you want to share or show people your open source code, even trivial code, GitHub is again the platform that people will want you to use.
(Yes, there is a Mercurial equivalent, but GitHub is far bigger and far more dominant. And yes, I believe that you can use Mercurial with a Mercurial-to-Git bridge to interact with GitHub if you're stubborn enough and really want to, but let's be honest; you're making life harder for yourself.)
Honesty compels me to admit that this was one large reason I finally started putting things into Git repos; I wanted to put them up on GitHub, so it was time to get serious about using Git.