A DVCS advantage for open source development

April 3, 2010

Recently, an interesting advantage of DVCSes for open source development has occurred to me: by their very nature, they make it very hard for the original source of an open source release to take that release back.

Suppose that you are a company that might want to retract and de-release something that has been released as open source. With traditional non-distributed version control, you could simply shut down the public source server for the project; while people who already had copies of the source base could in theory put together another public source server, it would be a moderate hassle and they'd lose the project history. With a DVCS, everyone has what you have and setting up a new public source server takes about as long as pushing the source to one of the hosting services such as GitHub.

This much is the conventional view, but DVCSes give you even stronger protection than this. Put succinctly, since a DVCS does not let anyone else 'rewrite history' in your copy of the repository, using one means that the project lead cannot commit things to the tree that destroy the history that you already have copies of. With a non-DVCS, a clever company could commit things to the tree to wipe things out or otherwise destroy the source tree's usefulness and then wait for people to update; since you only have the current state of the tree, you could be stuck. With a DVCS, even if the project lead first commits total garbage over the tree state and then commits a removal of all files, you can roll back your own repository to before the damage. Even if the master repository is cleaned out by force, there is no way for the project lead to reach into your repository copy and destroy things.
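To make the rollback argument concrete, here is a minimal sketch in Python of a toy content-addressed history. It is only a model of the idea, not how Git or Mercurial is actually implemented; the `Repo` class, `commit_id` function, and file contents are all hypothetical names invented for illustration.

```python
import hashlib


def commit_id(parent, tree):
    """Derive an id from the parent id and the file contents (toy model)."""
    payload = repr((parent, sorted(tree.items()))).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


class Repo:
    """A toy repository: an append-only store of commits plus a head pointer."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent id, tree)
        self.head = None

    def commit(self, tree):
        cid = commit_id(self.head, tree)
        self.commits[cid] = (self.head, dict(tree))
        self.head = cid
        return cid

    def pull(self, other):
        # Pulling only adds commits; nothing already stored is replaced.
        for cid, (parent, tree) in other.commits.items():
            self.commits.setdefault(cid, (parent, dict(tree)))
        self.head = other.head

    def reset(self, cid):
        # Move our own head back to a commit we already hold.
        self.head = cid

    def checkout(self, cid):
        return dict(self.commits[cid][1])


upstream = Repo()
good = upstream.commit({"main.c": "int main(void) { return 0; }"})

mine = Repo()
mine.pull(upstream)

upstream.commit({"main.c": "garbage"})  # garbage committed over the tree
upstream.commit({})                     # then a removal of all files
mine.pull(upstream)                     # we update and receive the damage

mine.reset(good)                        # roll back to before the damage
restored = mine.checkout(mine.head)
```

In this model the damaging commits arrive as new entries on top of the old ones, so the good commit is still in the local store and can be checked out again.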

(Well, in theory. Depending on the specific DVCS, it might be possible to do things like rewrite tags and branch labels, so that while you had the raw data it'd be hard for you to find it.)

All of this is because the entire design of modern DVCSes is about never rewriting or removing things that already exist in other people's repositories. There is no way to retract or overwrite something once other people have pulled it, and while this is sometimes inconvenient and problematic, it does give you a fairly strong immunity from the project lead changing their mind or going crazy.

As an aside, this leads me to feel that the really important thing is thus not the project's source repository but the project's communications infrastructure: its website, its forums, its mailing lists. If all of those were shut down abruptly, sure, you could put up a new master source repository, but how would you tell people where to find it and then get back in touch with the developers? The mailing list you'd use to do so is also shut down, and you probably don't have the subscriber list to put together a new version.

(If you've kept a local archive of the project mailing lists, you can at least assemble the email addresses of recent or frequent posters to the mailing list; this would help to get the word out. Otherwise, well, you'd have to start spreading the word however you can, although I suppose it's common practice to have usable email addresses in DVCS commits so you could mine that to get developer email addresses.)
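As a rough illustration of mining commits for developer addresses, here is a small Python sketch. It assumes the project uses Git, that a `git` binary is on your PATH, and that authors committed with usable addresses; the function names and the sample addresses are made up for the example.

```python
import subprocess
from collections import Counter


def author_emails(log_text):
    """Count author email addresses, given one address per line of input."""
    counts = Counter()
    for line in log_text.splitlines():
        addr = line.strip().lower()
        if "@" in addr:
            counts[addr] += 1
    return counts


def git_author_log(repo_path):
    """Return one author email per commit (assumes a Git clone at repo_path)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%ae"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample standing in for real `git log` output.
sample = "alice@example.org\nbob@example.org\nAlice@example.org\n\n"
frequent = author_emails(sample).most_common()
```

Against a real clone you would pass `git_author_log("path/to/clone")` to `author_emails()` instead of the sample text; the most frequent committers come first in the result.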

(I doubt I'm the first person to notice this.)

Written on 03 April 2010.