2011-05-18
Why open source projects should use 'git rebase' or the equivalent
One of those 'vigorous debates' in version control is whether you should make frequent use of rebasing changes in order to present a clean version history or whether you should preserve the original, real development history of changes, warts and merges and all. As it happens I am a somewhat involved bystander in this, so I have a grumpy sysadmin's answer: if you are an open source project of moderate size or larger, you should absolutely rebase patches. In fact it would be good if you went further than that.
The reason is simple: git bisect and its equivalents in other DVCSes. The more reliably bisection works on your history, the more outside people (like me) can send you problem reports about your new release or beta that include the magic phrase 'and this specific change is where it broke'.
(In many cases, this phrase drastically reduces how much time you have to spend debugging and fixing problems.)
When it works, bisection is marvelous. But, speaking from personal experience, when I am trying to bisect through strange code and I hit an unbuildable or untestable tree, I pretty much give up on the spot. I simply don't know enough about your project to deal with the issues of bisecting through and around a bad tree (especially given the paradox of too detailed bug reports).
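For what it's worth, git bisect does have a mechanical escape hatch for bad trees: a script driven by 'git bisect run' can exit with status 125 to say 'this commit cannot be tested, skip it'. What follows is only a minimal sketch of such a script. The function name and the build and test commands are my assumptions, not anything from a real project.

```python
import subprocess
from typing import Callable, Sequence

# Exit status that 'git bisect run' treats as "skip this commit".
SKIP = 125


def bisect_exit_code(
    build_cmd: Sequence[str],
    test_cmd: Sequence[str],
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Map build and test results onto 'git bisect run' exit statuses.

    build_cmd and test_cmd are hypothetical, project-specific commands.
    run is injectable so the logic can be exercised without a real build.
    """
    if run(build_cmd) != 0:
        return SKIP  # unbuildable tree: neither good nor bad
    if run(test_cmd) != 0:
        return 1     # builds, but the check fails: bad
    return 0         # builds and passes: good
```

Saved as a hypothetical bisect_step.py that ends with something like sys.exit(bisect_exit_code(["make"], ["make", "test"])), this could be driven with 'git bisect run python3 bisect_step.py'. It only helps so much. If the skipped commits sit next to the one that introduced the problem, git can only report a range of candidate commits, and the outside person is still stuck working out what a sensible build and test command is for every old tree.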
When I'm bisecting, having a project history that includes every partially done modification, misstep, checkpoint, and failed approach that a developer ever made is not a feature. Even if the tree builds and is testable, the presence of partially complete modifications may mean that my bisection confidently declares that the problem changeset is halfway through the development of a modification. This is technically correct but almost certainly not useful, because what you want me to tell you is which fully developed change created the problem.
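To make this concrete, here is a small self-contained simulation. It is a sketch rather than how git itself implements bisection. The commit messages are invented, and the model assumes a linear history in which every commit after the bug appears is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit by binary search.

    Assumes commits run oldest to newest, the newest one is bad, and
    every commit after the first bad one is bad too.
    """
    lo, hi = 0, len(commits) - 1  # hi always points at a known-bad commit
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid + 1
    return commits[lo]


# Hypothetical raw history: one change developed over three checkpoints.
raw_history = [
    "fix typo in README",
    "parser rewrite: WIP 1/3",
    "parser rewrite: WIP 2/3 (bug appears here)",
    "parser rewrite: WIP 3/3",
    "bump version",
]

# The same work after the three checkpoints are rebased into one commit.
rebased_history = [
    "fix typo in README",
    "parser rewrite",
    "bump version",
]

raw_bad = set(raw_history[2:])
rebased_bad = set(rebased_history[1:])

raw_result = first_bad(raw_history, lambda c: c in raw_bad)
rebased_result = first_bad(rebased_history, lambda c: c in rebased_bad)

print(raw_result)      # parser rewrite: WIP 2/3 (bug appears here)
print(rebased_result)  # parser rewrite
```

On the raw history the search lands on a checkpoint in the middle of the work. On the rebased history it names the complete change, which is the answer a maintainer can actually act on.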
(Now that I think about it, part of the problem with bisection here is that a binary search is not quite the right model for what you want to do, at least in the general case. But that's another entry, once I've thought about this a bit more.)
So: once your project is large enough that it's helpful to have outside people bisecting things to find where problems were introduced, you want a clean history to enable this as much as possible. Thus you want to rebase, and ideally you would have a rule that all changes committed to the master tree must leave it buildable and testable.