Wandering Thoughts archives

2017-05-28

My thoughts on git worktrees for me (and some notes on things I tried)

I recently discovered git worktrees and did some experimentation with using them for stuff that I do. The short summary of my experience so far is that while I can see the appeal for certain sorts of usage cases, I don't think git worktrees are a good fit for my situation and I'm probably going to use completely independent repositories in the future.

My usage case was building my own copies of multiple versions of some project, starting with Go. Especially in the case of a language compiler and its standard library, it's reasonably useful to have the latest development version plus a stable version or two; for example, it gives me an easy way to test if something I'm working on will build on older released versions or if I've let a dependency on some recent bit of the standard library creep in. The initial process of creating a worktree for, say, Go 1.8 is reasonably straightforward:

cd /some/where/go
git worktree add -b release-branch.go1.8 ../v1.8 origin/release-branch.go1.8

What proved tricky for me is updating this v1.8 tree when the Go people update Go 1.8, as they do periodically. My normal way of staying up to date on what changes are happening in the main line of Go is to do 'git pull' in my master repo directory, note the lines that get printed out about fetched updates, eg:

remote: Finding sources: 100% (64/64)
remote: Total 64 (delta 23), reused 64 (delta 23)
Unpacking objects: 100% (64/64), done.
From https://go.googlesource.com/go
   ffab6ab877..d64c49098c  master     -> origin/master

And then I use 'git log ffab6ab877..d64c49098c' to see what changed. The problem with worktrees is that this information is printed by 'git fetch', and normally 'git fetch' updates all branches, both the mainline and, say, a release branch you're following. So I actively don't want to run 'git pull' or 'git fetch' in the worktree directory, because otherwise I will have to remember to stop and look at the mainline updates it's just fetched and reported to me.

What I wound up doing was running 'git pull' in my main go tree and if there was an update to origin/release-branch.go1.8 reported, I'd go to my 'v1.8' directory and do 'git merge --ff-only'. This mostly worked (it blew up on me once for reasons I don't understand), but it means that dealing with a worktree is different than dealing with a normal Git repo directory (including an independently cloned repo). Since 'git pull' and other Git commands work 'normally' in a worktree, I have to explicitly remember that I created something as a worktree (or check to see if .git is a directory to know, since 'git status' doesn't helpfully tell you one way or the other).
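The two-step update above can be sketched as a small shell function. This is only a sketch under assumptions: the function name and its arguments are hypothetical, it names the remote-tracking branch explicitly instead of relying on the worktree branch's configured upstream, and it does nothing to work out whether the release branch actually changed.

```shell
# Hypothetical helper: update the main tree, then fast-forward a worktree.
#   $1 = main repo directory, $2 = worktree directory, $3 = branch name
update_worktree() {
    main_dir=$1
    worktree_dir=$2
    branch=$3

    # Fetch everything in the main tree, so the lines about fetched
    # updates are printed there and not in the worktree.
    git -C "$main_dir" pull || return 1

    # The worktree shares the main repo's refs, so origin/<branch> is
    # already up to date; only a fast-forward merge is needed here.
    git -C "$worktree_dir" merge --ff-only "origin/$branch"
}
```

With the directories from earlier this might be run as 'update_worktree /some/where/go /some/where/v1.8 release-branch.go1.8'.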

(At my current moderate level of Git knowledge and experience, I'm going to avoid writing about the good usage cases I think I see for worktrees. Anyway, one of them is documented in the git-worktree manpage; I note that their scenario uses a worktree for a one-shot branch that's never updated from upstream.)

As mentioned, if I want to see if a particular Git repo is a worktree or not I need to do 'ls -ld .git'. If it's a file, I have a worktree. If I have a directory, with how I currently use Git, it's a full repo. 'git worktree list' will list the main repo and worktrees, but it doesn't annotate things with a 'you are here' marker. Obviously if I used worktrees enough I could write a status command to tell me, but then if I was doing that I could probably write a bunch of commands to do what I want in general.
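A minimal sketch of the sort of status command this would take, based only on the '.git' check described above. The function name and the strings it prints are made up for illustration, and it assumes that a '.git' file always means a worktree (Git submodules also use a '.git' file, so this only holds for how the trees here are set up).

```shell
# Hypothetical helper: report what kind of Git tree a directory is,
# looking only at its .git entry. Defaults to the current directory.
git_tree_kind() {
    dir=${1:-.}
    if [ -f "$dir/.git" ]; then
        # A plain file that points at the real Git directory elsewhere.
        echo "worktree"
    elif [ -d "$dir/.git" ]; then
        # A real .git directory: an ordinary, independent repo.
        echo "full repo"
    else
        echo "no .git here"
    fi
}
```

Running 'git_tree_kind' with no arguments in the v1.8 directory should report a worktree, while in the main go tree it should report a full repo.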

Sidebar: Excessively clever Git configuration hacking (maybe)

Bearing in mind that I may not understand Git as well as I think I do, as far as I can see which branches 'git fetch' fetches is determined by the configuration of the branch's remote, not by the branch's own configuration. There appear to be two options for fiddling things here.

The 'obvious' option is to create a second remote (call it, say, 'v1.8-origin') with the same url as origin but a fetch setting that only fetches the particular branch:

fetch = refs/heads/release-branch.go1.8:refs/remotes/origin/release-branch.go1.8

Then I'd switch the remote for the release-branch.go1.8 branch to this new remote.
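As a sketch, this first option might be set up with ordinary 'git remote' and 'git config' commands. This is untested against a real Go tree; the function name is hypothetical, and it assumes the existing remote is called 'origin' and copies its URL.

```shell
# Hypothetical helper: add a second remote that fetches only one branch,
# then point the local branch at it.
#   $1 = repo directory, $2 = new remote name, $3 = branch name
setup_branch_remote() {
    repo=$1
    new_remote=$2
    branch=$3

    # Reuse origin's URL for the new remote.
    url=$(git -C "$repo" config --get remote.origin.url) || return 1
    git -C "$repo" remote add "$new_remote" "$url" || return 1

    # Replace the default fetch-everything refspec with the single-branch
    # one from the text, which still stores under refs/remotes/origin/.
    git -C "$repo" config "remote.$new_remote.fetch" \
        "refs/heads/$branch:refs/remotes/origin/$branch" || return 1

    # Switch the branch's remote to the new, narrower one.
    git -C "$repo" config "branch.$branch.remote" "$new_remote"
}
```

Here this would be something like 'setup_branch_remote /some/where/go v1.8-origin release-branch.go1.8'.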

Git-fetch also has a feature where you can have a per-branch configuration in $GIT_DIR/branches/<branch>; this can be used to name the upstream 'head' (branch) that will be fetched into the local branch. It appears that creating such a file should do the trick, but I can't find people writing about this on the Internet (just many copies of the git-fetch manpage), so I'm wary of assuming that I understand what's going to happen here. Plus, it's apparently a deprecated legacy approach.

(If I understand all of this correctly, either approach would preserve 'git pull' in the main repo (which is on the master branch) always fetching all branches from upstream.)

programming/GitWorktreeThoughts written at 23:08:19

Specifications are ultimately defined by their implementations

In theory, the literal text of a specification is the final authority on defining what the specification means and requires. In practice, it generally doesn't work out this way; once a specification gets adopted, it ultimately becomes defined by its implementations. Regardless of what the actual text says, if everyone, or most people, or just dominant implementations do something or have some (mis-)interpretation of the specification, those things become the specification in practice. If your implementation doesn't conform to the wrong things that other implementations do, you can expect to have problems interoperating with those other implementations, and they almost always have more practical power than you do. You can appeal to the specification all you want, but it's not going to get you anywhere. People actually using the implementations generally care most that they interoperate, and they don't really care about why they do or don't. A new implementation that refuses to interoperate may or may not be 'correct' by the specification (many people are not well placed to know for sure), but it certainly isn't very useful to most people and it's not likely to get many users in the absence of other factors.

(Of course there can always be other factors. It's sometimes possible to give people no choice about using a particular (new) implementation or very strongly tilt them towards it, and if you do this with a big enough pool of people, your new implementation can rapidly become a dominant one. The browser wars in the late 90s are one example of this effect in action, as are browser engines on mobile platforms today.)

One corollary of this is that it's quite important to write a clear and good specification. Such a specification maximizes the chances that all implementations will do the same thing and that what they do will match what you wrote. Conversely, the more confusing and awkward the specification, the more initial chaos there will be in implementations and the more random and divergent from your intentions the eventual actual in-practice 'standard' is likely to be.

(If your specification is successful, enough of the various people involved will wind up implementing some common behavior so they can interoperate. This behavior does not necessarily have much relationship to what you intended; instead it's likely to be based on some combination of common misunderstandings, early implementations that set the stage for everyone else to copy, and what people settled on as the most useful way to behave.)

(I've sort of written about this before in the context of programming language specifications.)

tech/SpecsEndUpDefinedByImplementations written at 00:18:02

