Using a single git repo to compare things between two upstreams

February 6, 2019

The other day I wrote about hand-building an updated upstream kernel module. One of the things I wanted to do there was to compare the code of the nct6775 module I wanted to build between the 4.20.x branch in the stable tree and the hwmon-next branch in Guenter Roeck's tree. In my entry, I did this by cloning each Git repo separately and then running diff by hand, but this is a little awkward and I said that there was probably a way to do this in a single Git repo. Today I worked out how to do that, so I'm going to write it down.

To do this we need a single Git repo with both trees present in it, which means that both upstream repos need to be remotes. We can set up one as a remote simply by cloning from it:

git clone [...]/groeck/linux-staging.git

(I've chosen to start with the repo I'm theoretically going to be building from, instead of the repo I'm only using to diff against.)

Then we need to add the second repo as a remote, and fetch it:

cd linux-staging
git remote add stable [...]/stable/linux.git
git fetch stable

At this point 'git branch -r' will show us that we have all of the branches from both sides. With the data from both upstreams in our local repo and a full set of branches, we can do the full form of the diff:

git diff stable/linux-4.20.y..origin/hwmon-next drivers/hwmon/nct6775.c
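
For anyone who wants to try the sequence so far without network access, here is a minimal sketch. The two throwaway local repos (staging-upstream and stable-upstream), their one-line file contents, and the make_upstream helper are all made up for illustration. They only stand in for the real upstream URLs, which are elided above.

```shell
#!/bin/sh
# Sketch only: two throwaway local repos stand in for the real upstreams.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make a repo whose only branch holds one version
# of the driver file.  Usage: make_upstream DIR BRANCH CONTENT
make_upstream() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD "refs/heads/$2"
    mkdir -p "$1/drivers/hwmon"
    printf '%s\n' "$3" > "$1/drivers/hwmon/nct6775.c"
    git -C "$1" add drivers/hwmon/nct6775.c
    git -C "$1" commit -q -m "nct6775 as of $2"
}

make_upstream staging-upstream hwmon-next 'new driver code'
make_upstream stable-upstream linux-4.20.y 'old driver code'

# Clone the first upstream, then add and fetch the second as a remote.
git clone -q "$work/staging-upstream" linux-staging
cd linux-staging
git remote add stable "$work/stable-upstream"
git fetch -q stable

# Remote-tracking branches from both upstreams are now visible ...
git branch -r
# ... and can be diffed against each other, limited to one path.
git diff stable/linux-4.20.y..origin/hwmon-next -- drivers/hwmon/nct6775.c
```

The '--' before the path is optional here, but it keeps Git from ever mistaking the file name for a revision.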

We can make this more convenient by shortening one or both names, like so:

git checkout linux-4.20.y
git checkout hwmon-next

git diff linux-4.20.y.. drivers/hwmon/nct6775.c

I'm using 'git checkout' here partly as a convenient way to run 'git branch' with the right magic set of options:

git branch --track linux-4.20.y stable/linux-4.20.y

Actually checking out hwmon-next means we don't have to name it explicitly.
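
One way to confirm that 'git checkout' really did set up a tracking branch is to ask Git for the new branch's upstream. The following is a sketch under the same assumptions as any offline demo: the local stand-in repos and the make_upstream helper are hypothetical, and only the branch and file names come from the entry.

```shell
#!/bin/sh
# Sketch only: local stand-in repos replace the real upstreams.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Hypothetical helper.  Usage: make_upstream DIR BRANCH CONTENT
make_upstream() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD "refs/heads/$2"
    mkdir -p "$1/drivers/hwmon"
    printf '%s\n' "$3" > "$1/drivers/hwmon/nct6775.c"
    git -C "$1" add drivers/hwmon/nct6775.c
    git -C "$1" commit -q -m "nct6775 as of $2"
}

make_upstream staging-upstream hwmon-next 'new driver code'
make_upstream stable-upstream linux-4.20.y 'old driver code'

git clone -q "$work/staging-upstream" linux-staging
cd linux-staging
git remote add stable "$work/stable-upstream"
git fetch -q stable

# The branch name exists on exactly one remote, so checkout creates a
# local branch of that name that tracks stable/linux-4.20.y.
git checkout -q linux-4.20.y
# Go back to the branch we would actually build from.
git checkout -q hwmon-next

# Show which remote-tracking branch the new local branch follows.
git rev-parse --abbrev-ref 'linux-4.20.y@{upstream}'   # → stable/linux-4.20.y

# Short form: an empty right-hand side of '..' means HEAD.
git diff linux-4.20.y.. -- drivers/hwmon/nct6775.c
```

This shortcut only works when the branch name is unambiguous across remotes. If both upstreams had a branch with the same name, the explicit 'git branch --track' form would be needed.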

We can also diff against tags from the stable repo, and we get to do it without needing to say which upstream the tags are from:

git diff v4.20.6.. drivers/hwmon/nct6775.c
git diff v4.19.15.. drivers/hwmon/nct6775.c

The one drawback I know of to a multi-headed repo like this is that I'm not sure how you get rid of an upstream that you don't want any more. At one level you can just delete the remote, but that leaves various things cluttering up your repo, including both branches and tags. Presumably there is a way in Git to clean those up and then let Git's garbage collection eventually delete the actual Git objects involved and reclaim the storage.
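
For what it's worth, a cleanup along these lines appears to cover it. This is a hedged sketch and not a tested recipe for a real kernel tree: the stand-in repos, the make_upstream helper, and the single v4.20.6 tag are assumptions for the demo. As documented, 'git remote remove' drops the remote's configuration and its remote-tracking branches, so what is left to clean up by hand is any local branch made from it plus its tags.

```shell
#!/bin/sh
# Sketch only: local stand-in repos replace the real upstreams.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Hypothetical helper.  Usage: make_upstream DIR BRANCH CONTENT
make_upstream() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD "refs/heads/$2"
    mkdir -p "$1/drivers/hwmon"
    printf '%s\n' "$3" > "$1/drivers/hwmon/nct6775.c"
    git -C "$1" add drivers/hwmon/nct6775.c
    git -C "$1" commit -q -m "nct6775 as of $2"
}

make_upstream staging-upstream hwmon-next 'new driver code'
make_upstream stable-upstream linux-4.20.y 'old driver code'
git -C stable-upstream tag v4.20.6

git clone -q "$work/staging-upstream" linux-staging
cd linux-staging
git remote add stable "$work/stable-upstream"
git fetch -q stable
git branch -q --track linux-4.20.y stable/linux-4.20.y

# --- cleanup starts here ---

# Delete local branches that were created from the unwanted remote.
git branch -q -D linux-4.20.y

# Delete the tags that 'stable' advertises.  Tags share one namespace,
# so this also removes any that origin carries too; those can be
# restored afterwards with 'git fetch --tags origin'.
git ls-remote --tags --refs stable | while read -r sha ref; do
    git tag -d "${ref#refs/tags/}" >/dev/null || true
done

# Removing the remote also removes refs/remotes/stable/* and its config.
git remote remove stable

# Expire reflog entries that may still mention the old commits, then
# let garbage collection prune whatever has become unreachable.
git reflog expire --expire=now --all
git gc -q --prune=now
```

Expiring reflogs and pruning immediately is the aggressive option. Leaving those two commands out and letting Git's normal garbage collection catch up later is gentler and should reach the same end state eventually.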

(One can do more involved magic by not configuring the second repo as a remote and using 'git fetch' directly with its URL, but I'm not sure how to make the branch handling work nicely and so on. Setting it up as a full remote makes all of that work, although it also pulls in all tags unless you use '--no-tags' and understand what you're doing here, which I don't.)
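
A middle ground between a full remote and a bare 'git fetch URL' is a remote that is restricted when it is added. The sketch below uses the '-t' and '--no-tags' options of 'git remote add'; as before, the local stand-in repos, the extra linux-4.19.y branch, the tag, and the make_upstream helper are assumptions made up for the demo.

```shell
#!/bin/sh
# Sketch only: local stand-in repos replace the real upstreams.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Hypothetical helper.  Usage: make_upstream DIR BRANCH CONTENT
make_upstream() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD "refs/heads/$2"
    mkdir -p "$1/drivers/hwmon"
    printf '%s\n' "$3" > "$1/drivers/hwmon/nct6775.c"
    git -C "$1" add drivers/hwmon/nct6775.c
    git -C "$1" commit -q -m "nct6775 as of $2"
}

make_upstream staging-upstream hwmon-next 'new driver code'
make_upstream stable-upstream linux-4.20.y 'old driver code'
# Give the stand-in stable repo a second branch and a tag to ignore.
git -C stable-upstream branch linux-4.19.y
git -C stable-upstream tag v4.20.6

git clone -q "$work/staging-upstream" linux-staging
cd linux-staging

# Track only one branch of the second upstream and never fetch its tags.
git remote add -t linux-4.20.y --no-tags stable "$work/stable-upstream"
git fetch -q stable

git branch -r                            # no stable/linux-4.19.y here
git config --get remote.stable.tagOpt    # → --no-tags
git config --get remote.stable.fetch     # refspec limited to one branch
```

The trade-off is that tags which exist only in the restricted upstream are not available, so a diff against something like v4.20.6 would work only if that tag also arrives from the other remote.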

Looking back, all of this is relatively basic and straightforward and I think I knew most of the individual bits and pieces involved. But I'm not yet familiar and experienced enough with Git to confidently put them all together on the fly when my real focus is doing something else.

(Git is one of those things that I feel I should be more familiar with than I actually am, so every so often I take a run at learning how to do another thing in it.)


Comments on this page:

By dozzie at 2019-02-07 05:42:44:

The one drawback I know of to a multi-headed repo like this is that I'm not sure how you get rid of an upstream that you don't want any more.

I would delete the remote, delete its remote tracking branches (refs/remotes/<name>/*), and if you're worried about excessive tags, you can delete them all and run git fetch --tags on your remaining remotes again. At least I would do it this way.

There's also remote.<name>.tagOpt if you expect the remote to go away at some point.

You mentioned doing it directly with fetch; here's how you can do that:

      # The shell variables here are placeholders for your own URLs,
      # branch names, and file path.
      git ls-remote "$some_git_url"
      git fetch "$some_git_url" "$remote_branch"
      git branch branch-a FETCH_HEAD

      git ls-remote "$other_git_url"
      git fetch "$other_git_url" "$remote_branch2"
      git branch branch-b FETCH_HEAD

      git diff branch-a..branch-b -- "$whatever_file"

I like this flow myself because the repo we have at work has thousands of branches, so I have my remote configured to only ever pull master, and I use the above commands to find and fetch specific branches I need to look at.

Last modified: Wed Feb 6 23:10:50 2019