Being reminded that Git commits are separate from Git trees

April 30, 2025

Firefox's official source repository has moved to Git, but to a completely new Git repository, not the Git mirror that I've used for the past few years. This led me to a lament on the Fediverse:

This is my sad face that Firefox's switch to using git of course has completely different commit IDs than the old not-official gecko-dev git repository, meaning that I get to re-clone everything from scratch (all ~8 GB of it). Oh well, so it goes in the land of commit hashes.

Then Tim Chase pointed out something that I should have thought of:

If you add the new repo as a secondary remote in your existing one and pull from it, would it mitigate pulling all the blobs (which likely remain the same), limiting your transfer to just the commit-objects (and possibly some treeish items and tags)?

Git is famously a form of content-addressed storage, or more specifically a tree of content-addressed storage, where as much as possible is kept the same over time. This includes all the portions of the actual source tree. A Git commit doesn't directly include a source tree; instead it just has the hash of the source tree (well, of its top-level tree object, which in turn has the hashes of the things under it).
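To make the 'a commit only has the hash of the tree' part concrete, here is a minimal Python sketch of how Git names objects in a SHA-1 repository. The helper names (git_hash, blob, tree, commit), the file name and the author strings are all made up for illustration, and the tree helper only handles a flat directory of regular files with ASCII names; real Git trees also have subdirectories, other file modes, and their own sorting rules.

```python
import hashlib

def git_hash(kind: str, body: bytes) -> str:
    """Name an object the way SHA-1 Git does: hash of '<kind> <size>', NUL, body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def blob(content: bytes) -> str:
    return git_hash("blob", content)

def tree(entries: dict[str, str]) -> str:
    # entries maps file name -> blob hash. Simplification: regular files
    # (mode 100644) with ASCII names only, no subdirectories.
    body = b"".join(
        b"100644 " + name.encode() + b"\0" + bytes.fromhex(blob_hash)
        for name, blob_hash in sorted(entries.items())
    )
    return git_hash("tree", body)

def commit(tree_hash: str, author: str, message: str) -> str:
    # The commit body mentions the tree only by its hash.
    body = (
        f"tree {tree_hash}\n"
        f"author {author} 0 +0000\n"
        f"committer {author} 0 +0000\n"
        f"\n{message}\n"
    ).encode()
    return git_hash("commit", body)

readme = blob(b"hello\n")
top = tree({"README": readme})

# Hypothetical: the same source tree committed in two unrelated histories.
old_commit = commit(top, "Old Mirror <old@example.org>", "import")
new_commit = commit(top, "New Repo <new@example.org>", "import")

print("tree:      ", top)
print("old commit:", old_commit)
print("new commit:", new_commit)
```

The two commit hashes differ because the commit metadata differs, but both commits name the same tree hash, and that tree names the same blob hash.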

What this means is that if you completely change the commits so that all of them have new hashes, for example by rebuilding your history from scratch in a new version of the repository, but you keep the actual tree contents the same in most or all of the commits, the only thing that actually changes is the commits. If you add this new repository (with its new commit history) as a Git remote to your existing repository and pull from it, most or all of the tree contents are the same across the two sets of commits and won't have to be fetched. So you don't fetch gigabytes of tree contents, you only fetch megabytes (one hopes) of commits.
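As a rough illustration of this, the following Python sketch builds two tiny linear histories with identical file contents but different commit metadata, then works out which objects in the new history the old object store doesn't already have. Everything in it (obj_id, build_history, the single file.txt, the committer strings) is hypothetical. It also only models the object store; it doesn't model how 'git fetch' negotiates with the remote, so it says nothing certain about how many bytes a real fetch would transfer.

```python
import hashlib

def obj_id(kind: str, body: bytes) -> str:
    """Git-style object name for a SHA-1 repository."""
    return hashlib.sha1(f"{kind} {len(body)}".encode() + b"\0" + body).hexdigest()

def build_history(snapshots, committer):
    """Return {object id: kind} for a linear history of one-file snapshots."""
    objects, parent = {}, None
    for n, content in enumerate(snapshots):
        blob_id = obj_id("blob", content)
        tree_id = obj_id("tree", b"100644 file.txt\0" + bytes.fromhex(blob_id))
        lines = [f"tree {tree_id}"]
        if parent:
            lines.append(f"parent {parent}")
        lines += [
            f"author {committer} {n} +0000",
            f"committer {committer} {n} +0000",
            "",
            f"change {n}",
            "",
        ]
        commit_id = obj_id("commit", "\n".join(lines).encode())
        objects.update({blob_id: "blob", tree_id: "tree", commit_id: "commit"})
        parent = commit_id
    return objects

snapshots = [b"v1\n", b"v2\n", b"v3\n"]
old_mirror = build_history(snapshots, "Mirror Bot <mirror@example.org>")
new_official = build_history(snapshots, "Official <official@example.org>")

# Objects in the new history that the old object store lacks.
missing = {oid: kind for oid, kind in new_official.items() if oid not in old_mirror}
print(sorted(set(missing.values())))
print(len(missing), "of", len(new_official), "objects are new")
```

In this toy setup the only missing objects are the three commits; the blobs and trees already exist under the same hashes in the old object store.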

As I mentioned on the Fediverse, I was told this too late to save me from re-fetching the entire new Firefox repository from scratch on my office desktop (which has lots of bandwidth). I may yet try this on my home desktop, or alternatively use it on my office desktop to easily move my local changes on top of the new official Git history.

(I think this is effectively rebasing my own changes on top of something that's been rebased, which I've done before, although not recently. I'll also want to refresh my understanding of what 'git rebase' does.)

Written on 30 April 2025.

Last modified: Wed Apr 30 22:28:20 2025