Some git repository manipulations that I don't know how to do well

September 30, 2016

For a while now, I've been both tracking upstream git repositories and carrying and rebasing local changes on top of some of them in a straightforward way. The rebasing has been pretty easy and a clear win. But recently I've run into a number of more complicated cases where I'm not sure that I'm using git in the best way. So here is a collection of problems that I'm semi-solving.

I keep local tracking copies of a few upstream repos that are rebased periodically, such as this one. I have no local changes in such tracking repositories (ie my local master is always the same as origin/master). When the upstream rebases, my plain git pull fails, either telling me that it can't fast-forward or demanding that I merge things. In the past I've just deleted the entire repo and re-cloned as the simple way out. Recently I've constructed this manual fix:

git pull --ff-only
[fails, but it's fetched stuff]
git checkout -f -B master origin/master

It would be nice to have the repo set up so that a plain 'git pull' would do this, ideally only if it's safe. I could script this but there are only some repos that I want to do this for; for others, either I have local changes or this should never happen.

(The git pull manpage has some interesting wording that makes it sound like asking for a rebase here will maybe do the right thing. But I just found that and contriving a test case is not trivial. Or maybe I want 'git pull -f'. And now that I've done some web searches, apparently I want 'git reset --hard origin/master'. Git is like Perl; sometimes it has so many different paths to the same result.)
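A minimal sketch of what such a script could look like, assuming the branch has an upstream configured (as a plain clone does). The function name git_sync_tracking is made up for illustration. It only moves the branch when there are no uncommitted changes to tracked files and no local commits that the upstream (as of the last fetch) lacks:

```shell
# Hypothetical helper (the name is an assumption, not a git command):
# bring a tracking-only branch in line with its upstream, even after the
# upstream has been rebased, but only when nothing local would be lost.
git_sync_tracking() {
    # Refuse if tracked files have uncommitted changes.
    if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
        echo "working tree has local changes; leaving it alone" >&2
        return 1
    fi
    # Refuse if HEAD has commits that the upstream, as of the last fetch,
    # does not contain. Those would be local work on this branch.
    if [ -n "$(git rev-list '@{u}..HEAD')" ]; then
        echo "branch has local commits; leaving it alone" >&2
        return 1
    fi
    git fetch || return 1
    # HEAD had nothing of its own, so jumping to the new upstream is safe.
    git reset --hard '@{u}'
}
```

Run it in place of 'git pull' in the repos where this behaviour is wanted. One limitation of this sketch: if an earlier fetch (for example from a failed 'git pull --ff-only') has already picked up the rebased upstream, the old commits look like local work and the helper refuses, so the manual reset is still needed in that case.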

Next, I'm tracking the repo for my favorite RAW development software and have some little local fixes added on top. Normally I build from the latest git tip, but that's sort of in flux right now so I want to switch to the release branch but still add my changes on top. I'm doing this like so:

git checkout darktable-2.0.x
git cherry-pick origin/master..master

I think that using git cherry-pick here instead of some form of git rebase is probably the correct approach and this is the one case where I'm doing things right.
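The same two steps can be wrapped up with a preview of what will be copied. This is only a sketch; copy_commits and its argument names are hypothetical, and it assumes the range is non-empty and the picks apply without conflicts:

```shell
# Hypothetical helper: copy the commits in BASE..SRC onto TARGET.
# SRC itself is left exactly where it was.
copy_commits() {
    base=$1 src=$2 target=$3
    # Show, oldest first, the commits that are about to be copied.
    git log --oneline --reverse "$base..$src"
    git checkout "$target" || return 1
    git cherry-pick "$base..$src"
}
```

For the case above this would be invoked as 'copy_commits origin/master master darktable-2.0.x'.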

I'm tracking the main repository for my shell and applying my changes on top of it. However, there is a repository of interesting changes that I want to try out; of course I still want my local changes on top of this. When I did this I think what I did was 'git pull --rebase /the/other/repo', but I believe that probably wasn't the right approach. I suspect that what I really wanted to do was add the second repo as an alternate upstream, switch to it, and either cherry-pick or rebase my own changes on top.

Except, well, looking back at my notes about working with Github PRs it strikes me that this is basically the same situation (except this repo isn't an explicit PR). I should probably follow the same process instead of hacking my way around the way I did this time.

Finally, we combine the two situations: I'm building on top of the repo of interesting changes and it rebases itself. Now I want to replace the old version with the new version but reapply my changes on top. I'm not going to try to write down the process I used this time, because I'm convinced it's not optimal; I basically combined the reset origin plus cherry-pick process, using explicit commit hashes for the cherry-picking and recording them beforehand. Based on Aristotle Pagaltzis's comment on my Github PR entry, I think I want something from this Stackoverflow Q&A but I need to read the answers carefully and try it out before I'm sure I understand it all. It does look relatively simple.

(This writeup of git rebase --onto is helpful too, but again I need to digest it. Probably I need to redo this whole procedure the right way the next time this comes up, starting from a properly constructed repo.)

Sidebar: How I test this stuff out

I always, always test anything like this on a copy of my repo, not on the main version. I make these copies with rsync, because making repo copies with 'git clone' changes how git sees things like upstreams (for obvious reasons). I suspect that this is the correct approach and there is no magic 'git clone' option that does what I want here.
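As a rough sketch of the copying step (the function name and the directory arguments are placeholders, not anything from my actual setup):

```shell
# Hypothetical helper: make a file-level copy of a working repository.
# Because .git/config is copied verbatim, the copy keeps the same remotes
# and upstream settings as the original.
copy_repo_for_testing() {
    # The trailing slashes make rsync copy the contents of the first
    # directory into the second, rather than nesting one inside the other.
    rsync -a "$1/" "$2/"
}
```

By contrast, the origin of a 'git clone' of the same directory points at that directory instead of at the real upstream.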


Comments on this page:

For the first issue, you can find in git-fetch(1):

The remote ref that matches src is fetched, and if dst is not empty string, the local ref that matches it is fast-forwarded using src. If the optional plus + is used, the local ref is updated even if it does not result in a fast-forward update.

So something like

  fetch = +refs/heads/*:refs/remotes/munnich/*

should just overwrite the master branch. (And yes, the git reset is how I'd do it manually...)

Finally, git clone --mirror should make a 1:1 copy, but it also forces a bare repo which probably is not what you want...

By cks at 2016-10-01 19:37:44:

All my repos seem to already be set up with the optional + on their 'fetch =' bit. I think the issue is that this just updates the local ref, ie origin/master, but doesn't update my local branch (plain master) or check out the updated branch.
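One way to see the gap between the two refs after a fetch is to count the commits on each side. A small sketch, assuming the current branch has an upstream configured; the function name is made up:

```shell
# Hypothetical helper: print two numbers, first how many commits HEAD has
# that its upstream lacks, then how many the upstream has that HEAD lacks.
upstream_divergence() {
    git rev-list --left-right --count 'HEAD...@{u}'
}
```

In a tracking-only repo, a nonzero first number right after a fetch means the upstream was rewritten, since the local branch never had commits of its own.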

Christian:

Well… that fetch pattern is already the default configuration for any remotes you add. Did I misunderstand what you’re talking about, or did you misunderstand Chris? My understanding of what Chris said is that he wants his local master to always be the same as the remote master. You seem to be talking about remote tracking branches. The default for those is already to always reflect the remote, regardless of whether an update was a fast-forward or not.

Chris:

Yes, git reset --hard origin/master would be the “obvious” way of going about that. Though your git checkout -f -B master origin/master has almost exactly the same effect, now that I look at it. (I never thought of doing it that way. Probably because the -B switch didn’t exist for a long time and git reset was the only way of doing that.) The only difference is that it will (or might) re-setup the branch on every reset, though most of the time that is a no-op.

Note that if origin/master is set as the upstream branch for master, you can refer to it as @{u}, i.e. git reset --hard @{u}, which is shorter and generalises to any branch.

Regarding case 2 – hey, you taught me something. 😊 I never thought of using git cherry-pick for anything. My approach would have been something along the lines of

git branch tmp master
git rebase --onto darktable-2.0.x origin/master tmp
git checkout -B darktable-2.0.x
git branch -D tmp

which is way more cumbersome, owing to the difference between rebasing and cherry-picking – that rebasing tries to manage the head of your branch for you. This is a case where you don’t want that.

For the case of two upstream repositories for rc – I’m not sure I understand exactly what you want to achieve. Did you just want to temporarily switch over to @muennich’s fork completely…? You didn’t add a remote for it, so I assume the answer is no… but then it seems to me the same cherry-picking approach would also work? What is your goal here, and what your particular confusion/problem?

As for the last case… yes, my StackOverflow answer covers exactly that scenario. Basically, before you fetch (because fetching force-updates branches by default, as mentioned), make a throw-away branch to record the state of the remote branch you care about, so you can use the three-argument form of git rebase --onto without scanning logs and copy-pasting commit hashes… and then do that.
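Spelled out, that procedure might look something like the following sketch. The function name and the throw-away branch name old-upstream are placeholders, and it assumes the local commits replay without conflicts:

```shell
# Hypothetical helper. Arguments: the remote-tracking branch that is about
# to be rewritten (e.g. muennich/master) and the local branch carrying
# your own commits on top of it (e.g. master).
rebase_after_upstream_rewrite() {
    tracking=$1 mine=$2
    remote=${tracking%%/*}            # "muennich/master" -> "muennich"
    # 1. Before fetching, record where the upstream currently is.
    git branch -f old-upstream "$tracking" || return 1
    # 2. Fetch; this may force-update the remote-tracking branch.
    git fetch "$remote" || return 1
    # 3. Replay only the commits in old-upstream..mine onto the new upstream.
    git rebase --onto "$tracking" old-upstream "$mine" || return 1
    # 4. The marker branch has done its job.
    git branch -D old-upstream
}
```

For example: 'rebase_after_upstream_rewrite muennich/master master'.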

By cks at 2016-10-02 17:18:49:

Aristotle: for the two upstream repositories for rc case, what I wanted to do was wind up with a repo that at least temporarily switched over to @muennich's fork. My actual approach was a hack, but I think the correct way is to add it as a remote and a branch, something like:

git remote add muennich https://github.com/muennich/rc
git fetch muennich
git checkout -b muennich muennich/master

Then I'd either cherry-pick or rebase my local commits from master to the muennich branch. A cherry-pick would be just the same as the darktable case; a rebase would be following your example for the darktable case:

git rebase --onto muennich origin/master master

(I tested it and this appears to work. I'm not clear why you created a tmp branch then deleted it, so maybe I'm missing something.)

I may have misunderstood, but my understanding of your situation was that you have some local commits on top of origin/master, and now you wanted the same commits copied to darktable-2.0.x on top of origin/darktable-2.0.x – without removing them from your master.

In that case, your git cherry-pick approach precisely expresses the outcome you want.

I would not have thought to do it that way myself, and would have reached for git rebase instead. In that case the tmp branch is necessary because git rebase is the wrong tool for the job. Namely if you just say

git rebase --onto darktable-2.0.x origin/master master

then that will not just replay your local commits onto darktable-2.0.x, it will also update master to the tip of the newly created commit series, making master branch off of darktable-2.0.x instead of origin/master. That’s not what you wanted. But since git rebase insists on updating a branch, you need a sacrificial branch to offer it. And since it updates that branch, you have to fast-forward your darktable-2.0.x branch to it yourself.

So my way of achieving what you wanted would have been rather roundabout, whereas yours was just right.

Now as for muennich/rc – you say you wanted to switch over, in which case that git rebase command exactly expresses that. (For the same reason – it re-points your master to the newly created commits.)

That tmp branch is only necessary if you don’t want to switch over, but rather want to have a copy of your local commits on both branches, and you’re me and don’t realise that git cherry-pick will give you that directly.

So we’ve gone over all your scenarios, and you should be all covered now.

Some minor notes on that last one: you don’t necessarily need a local muennich branch at all, you can just say

git rebase --onto muennich/master origin/master master

And you should probably follow up the rebase with

git branch -u muennich/master master

which makes muennich/master your master’s new upstream. Then git pull --rebase will Just Work for example. (You can always change it back in the same way if you decide to switch back.)

By cks at 2016-10-03 09:38:28:

Ah! I somehow missed that the git rebase updated where master pointed, not my muennich branch. So either I should change the upstream of master afterwards or I should use a temporary branch. For what I want to do, probably using a temporary branch would be the right answer with git rebase. But probably the really right answer here is to cherry-pick instead, since that's simpler.

I guess this makes sense for one view of what rebase is doing, namely moving commits instead of just copying them. What I usually want to do here is probably more copying commits instead of moving them, since I wouldn't mind having my old 'based on the master' version around too.

(Copying commits is definitely what I want in the darktable case, since at some point I'll go back to the development version when it settles down.)

Written on 30 September 2016.

Last modified: Fri Sep 30 22:09:24 2016