Wandering Thoughts archives

2017-05-31

Why one git fetch default configuration bit is probably okay

I've recently been reading the git fetch manpage reasonably carefully as part of trying to understand what I'm doing with limited fetches. If you do this, you'll run across an interesting piece of information about the <refspec> argument, including in its form as the fetch = setting for remotes. The basic syntax is '<src>:<dst>', and the standard version that is created by any git clone gives you:

fetch = +refs/heads/*:refs/remotes/origin/*

You might wonder about that + at the start, and I certainly did. Well, it's special magic. To quote the documentation:

The remote ref that matches <src> is fetched, and if <dst> is not empty string, the local ref that matches it is fast-forwarded using <src>. If the optional plus + is used, the local ref is updated even if it does not result in a fast-forward update.

(Emphasis mine.)

When I read this my eyebrows went up, because it sounded dangerous. There's certainly lots of complicated processes around 'git pull' if it detects that it can't fast-forward what it's just fetched, so allowing non-fast-forward fetches (and by default) certainly sounded like maybe it was something I wanted to turn off. So I tried to think carefully about what's going on here, and as a result I now believe that this configuration is mostly harmless and probably what you want.

The big thing is that this is not about what happens with your local branch, eg master or rel-1.8. This is about your repo's copy of the remote branch, for example origin/master or origin/rel-1.8. And it is not even about the branch, because branches are really 'refs', symbolic references to specific commits. git fetch maintains refs (here under refs/remotes/origin) for every branch that you're copying from the remote, and one of the things that it does when you fetch updates is update these refs. This lets the rest of Git use them and do things like merge or fast-forward remote updates into your local remote-tracking branch.

So git fetch's documentation is talking about what it does to these remote-branch refs if the branch on the remote has been rebased or rewound so that it is no longer a more recent version of what you have from your last update of the remote. With the + included in the <refspec>, git fetch always updates your repo's ref for the remote branch to match whatever the remote has; basically it overwrites whatever ref you used to have with the new ref from the remote. After a fetch, your origin/master or origin/rel-1.8 will always be the same as the remote's, even if the remote rebased, rewound, or did other weird things. You can then go on to fix up your local branch in a variety of ways.

(To be technical your origin/master will be the same as origin's master, but you get the idea here.)

This makes the + a reasonable default, because it means that 'git fetch' will reliably mirror even a remote that is rebasing and otherwise having its history rewritten and its branches changed around. Without the +, 'git fetch' might transfer the new and revised commits and trees from your remote but it wouldn't give you any convenient reference for them for you to look at them, integrate them, or just reset your local remote-tracking branch to their new state.

(Without the '+', 'git fetch' won't update your repo's remote-branch refs. I don't know if it writes the new ref information anywhere, perhaps to .git/FETCH_HEAD, or if it just throws it away, possibly after printing out commit hashes.)

Sidebar: When I can imagine not using a '+'

The one thing that using a '+' does is that it sort of allows a remote to effectively delete past history out of your local repo, something that's not normally possible in a DVCS and potentially not desirable. It doesn't do this directly, but it starts an indirect process of it and it certainly makes the old history somewhat annoying to get at.

Git doesn't let a remote directly delete commits, trees, and objects. But unreferenced items in your repo are slowly garbage-collected after a while and when you update your remote-branch refs after a non-ff fetch, the old commits that the pre-fetch refs pointed to start becoming more and more unreachable. I believe they live on in the reflog for a while, but you have to know that they're missing and to look.

If you want to be absolutely sure that you notice any funny business going on in an upstream remote that is not supposed to modify its public history this way, not using '+' will probably help. I'm not sure if it's the easiest way to do this, though, because I don't know what 'git fetch' does when it detects a non-ff fetch like this.

(Hopefully git fetch complains loudly instead of failing silently.)

GitFetchMagicPlus written at 00:44:13; Add Comment

By day for May 2017: 10 12 28 29 31; before May; after May.

Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.