VCS history versus large open source development

October 19, 2021

I recently read Fossil's "Rebase Considered Harmful", which is another rerun of the great rebase versus everything else debate. This time around, one of the things that occurred to me is that rebasing and similar approaches allow maintainers of large, public open source repositories to draw a clean line between how people develop changes in private and what appears in the immutable public history of the project. Any open source project can benefit from a clean public history, partly because a clean history makes it easy to use bisection to locate bugs, but a large project benefits especially because it has so many contributors of varying skill levels and practices.

(In addition, consumers of public open source repositories often already see a linear view of the project's code history.)
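As a rough illustration of why bisection likes clean, linear history, here is a minimal sketch of the search it performs. This is not how any particular VCS implements it; the function name `first_bad_commit` and the `is_bad` callback are hypothetical, and the sketch assumes that history is a simple list ordered oldest to newest, that the first commit is known good and the last known bad, and that every commit in between can actually be built and tested.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit in a linear history.

    Assumptions (hypothetical, for illustration only):
    - commits is ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - once the bug appears, every later commit is also bad,
    - is_bad() can give an answer for every commit.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(10)]
    # Pretend the bug was introduced in c6.
    print(first_bad_commit(history, lambda c: int(c[1:]) >= 6))
```

The last assumption is the one that messy history breaks. If the public history is full of work-in-progress commits that don't build or don't pass tests for unrelated reasons, `is_bad` can't give a trustworthy answer for them and the search either stalls or points at the wrong commit.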

Another aspect of using rebasing and other things that erase history (such as emailed patch series) is that they free people to develop changes in whatever style of VCS usage they find most comfortable and useful. You can set your editor to make a commit every time you save a file, and no one else has to care in the way they very much would if you proposed to merge the entire sequence intact into a large, public open source repository. The more contributors you have (and the more disparate they are), the more potentially useful this is.

Of course, there's a continuum, both between projects and in general. It's undeniably sometimes useful to know how a change was developed over time, for various reasons. It can also be useful to know how a change has flowed through various public versions of the code. The Linux kernel famously has a whole collection of trees that changes can wind up in before they get pulled into the mainline, and when they are pulled in, the changes often still carry the history of the trees they passed through. Presumably this is useful to Linus Torvalds and other kernel developers.

One way to put this is that as an open source project grows larger and larger, I think that it makes less and less sense to try to represent almost everything that happens to the project in its VCS history. VCS history is only one way to capture and handle the entire history of the project; using it for everything has the same sort of broad problems that using any single thing for everything has. Perhaps the larger your project is, the more you should be explicitly asking what your VCS history is for and how you want it to be used (and to be useful).

Written on 19 October 2021.
Last modified: Tue Oct 19 23:24:48 2021