My workflow for testing Github pull requests

July 29, 2015

Every so often a Github-based project I'm following has a pending pull request that might solve a bug or otherwise deal with something I care about, and it needs some testing by people like me. The simple case is when I am not carrying any local changes; it is adequately covered by part of Github's Checking out pull requests locally (skip to the bit where they talk about 'git fetch'). A more elaborate version is:

git fetch origin pull/<ID>/head:refs/remotes/origin/pr/<ID>
git checkout pr/<ID>

That creates a proper remote branch and then a local branch that tracks it, so I can add any local changes to the PR that I turn out to need and then keep track of them relative to the upstream pull request. If the upstream PR is rebased, well, I assume I get to delete my remote branch and then re-fetch it and probably do other magic. I'll cross that bridge when I reach it.
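As a rough sketch, the simple case can be wrapped in a small shell function. The name fetch_pr and its arguments are made up for illustration (they are not part of git), and it assumes the remote exposes pull requests under Github's pull/<ID>/head refs.

```shell
# fetch_pr ID [REMOTE]: hypothetical helper, not a git command.
# Fetches one pull request into a remote-tracking branch and then
# checks out a local branch for it. REMOTE defaults to 'origin'.
fetch_pr() (
    id=${1:?usage: fetch_pr ID [REMOTE]}
    remote=${2:-origin}
    # Store the PR head as refs/remotes/REMOTE/pr/ID.
    git fetch "$remote" "pull/$id/head:refs/remotes/$remote/pr/$id" &&
    # With no local pr/ID branch yet, git should create one that
    # tracks REMOTE/pr/ID (assuming only one remote has that name).
    git checkout "pr/$id"
)

# Usage, inside a clone of the project:
#   fetch_pr 1234
#   fetch_pr 1234 github
```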

The not so simple case is when I am carrying local changes on top of the upstream master. In the fully elaborate case I actually have two repos, the first being a pure upstream tracker and the second being a 'build' repo that pulls from the first repo and carries my local changes. I need to apply some of my local changes on top of the pull request while skipping others (in this case, because some of them are workarounds for the problem the pull request is supposed to solve), and I want to do all of this work on a branch so that I can cleanly revert back to 'all of my changes on top of the real upstream master'.

The workflow I've cobbled together for this is:

  • Add the Github master repo if I haven't already done so:
    git remote add github https://github.com/zfsonlinux/zfs.git

  • Edit .git/config to add a new 'fetch =' line so that we can also fetch pull requests from the github remote, where they will get mapped to the remote branches github/pr/NNN. This will look like:
    [remote "github"]
       fetch = +refs/pull/*/head:refs/remotes/github/pr/*
       [...]

    (This comes from here.)

  • Pull down all of the pull requests with 'git fetch github'.

    I think an alternative to configuring and fetching all pull requests is the limited version I did in the simple case (changing origin to github in both occurrences), but I haven't tested this. At the point that I have to do this complicated dance I'm in a 'swatting things with a hammer' mode, so pulling down all PRs seems perfectly fine. I may regret this later.

  • Create a branch from master that will be where I build and test the pull request (plus my local changes):
    git checkout -b pr-NNN master

    It's vitally important that this branch start from master and thus already contain my local changes.

  • Do an interactive rebase relative to the upstream pull request:
    git rebase -i github/pr/NNN

    This incorporates the pull request's changes 'below' my local changes to master, and with -i I can drop conflicting or unneeded local changes. Effectively it is much like what happens when you do a regular 'git pull --rebase' on master; the changes in github/pr/NNN are being treated as upstream changes and we're rebasing my local changes on top of them.

  • Set the upstream of the pr-NNN branch to the actual Github pull request branch:
    git branch -u github/pr/NNN

    This makes 'git status' report things like 'Your branch is ahead of ... by X commits', where X is the number of local commits I've added.
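The steps above can be strung together roughly like this. The function names are invented for illustration, and 'git config --add' is used here as a stand-in for editing .git/config by hand; it should leave the same 'fetch =' line in the [remote "github"] section.

```shell
# One-time setup (hypothetical helper): add the remote plus the extra
# fetch refspec that maps pull requests to github/pr/NNN.
setup_github_remote() {
    git remote add github https://github.com/zfsonlinux/zfs.git &&
    git config --add remote.github.fetch \
        '+refs/pull/*/head:refs/remotes/github/pr/*'
}

# start_pr_branch NNN (hypothetical helper): build a pr-NNN branch that
# has my local changes from master sitting on top of pull request NNN.
start_pr_branch() (
    nnn=${1:?usage: start_pr_branch NNN}
    git fetch github &&
    # Branch from master so the branch already contains the local changes.
    git checkout -b "pr-$nnn" master &&
    # Interactive: delete the 'pick' lines of local commits to leave out.
    # If the rebase stops on a conflict, finish it by hand and then run
    # the final 'git branch -u' command yourself.
    git rebase -i "github/pr/$nnn" &&
    git branch -u "github/pr/$nnn"
)

# Usage, inside the build repo:
#   setup_github_remote     # once
#   start_pr_branch NNN     # with NNN replaced by the PR number
```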

If the pull request is refreshed, my current guess is that I will have to fully discard my local pr-NNN branch and restart from fetching the new PR and branching off master. I'll undoubtedly find out at some point.
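A sketch of that guess, untested against a real refreshed PR: the helper name is hypothetical, and it assumes the github remote carries the '+refs/pull/...' fetch line from earlier, since the leading '+' is what lets a rebased pull request overwrite github/pr/NNN.

```shell
# discard_pr_branch NNN (hypothetical helper): refresh the PR refs and
# throw away the old pr-NNN branch so it can be rebuilt from master.
discard_pr_branch() (
    nnn=${1:?usage: discard_pr_branch NNN}
    git fetch github &&
    # Get off pr-NNN first; git will not delete the checked-out branch.
    git checkout master &&
    # -D forces the delete even though pr-NNN is not merged into master.
    git branch -D "pr-$nnn"
)

# Afterwards, repeat the workflow from the 'create a branch from
# master' step onward.
```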

Initially I thought I should be able to use a sufficiently clever invocation of 'git rebase' to copy some of my local commits from master on to a new branch that was based on the Github pull request. With work I could get the rebasing to work right; however, it always wound up with me on (and changing) the master branch, which is not what I wanted. Based on this very helpful page on what 'git rebase' is really doing, what I want is apparently impossible without explicitly making a new branch first (and that new branch must already include my local changes so they're what gets rebased, which is why we have to branch from master).
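For comparison, one way to copy only selected commits without involving 'git rebase' at all is 'git cherry-pick'. This is a different technique from the workflow above and I have not tried it on this setup; the helper name and the commit arguments are purely illustrative.

```shell
# pick_onto_pr NNN COMMIT... (hypothetical helper): start pr-NNN from
# the pull request itself and copy the named local commits onto it.
pick_onto_pr() (
    nnn=${1:?usage: pick_onto_pr NNN COMMIT...}
    shift
    [ "$#" -gt 0 ] || { echo "pick_onto_pr: no commits given" >&2; exit 1; }
    # Branch from the PR rather than from master; master is not changed.
    git checkout -b "pr-$nnn" "github/pr/$nnn" &&
    # Apply the chosen commits in the order given (oldest first).
    git cherry-pick "$@"
)

# Usage, assuming origin/master is the pure upstream in the build repo:
#   git log --oneline origin/master..master   # list my local commits
#   pick_onto_pr NNN <commit> <commit>
```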

This is probably not the optimal way to do this, but having hacked my way through today's git adventure game I'm going to stop now. Feel free to tell me how to improve this in comments.

(This is the kind of thing I write down partly to understand it and partly because I would hate to have to derive it again, and I'm sure I'll need it in the future.)

Sidebar: Why I use two repos in the elaborate case

In the complex case I want to both monitor changes in the Github master repo and have strong control over what I incorporate into my builds. My approach is to routinely do 'git pull' in the pure tracking repo and read 'git log' for new changes. When it's time to actually build, I 'git pull' (with rebasing) from the tracking repo into the build repo and then proceed. Since I'm pulling from the tracking repo, not the upstream, I know exactly what changes I'm going to get in my build repo and I'll never be surprised by a just-added upstream change.

In theory I'm sure I could do this in a single repo with various tricks, but doing it in two repos is much easier for me to keep straight and reliable.
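The two-repo routine described in this sidebar might look something like the following. The directory paths and function names are placeholders I made up, and the build repo is assumed to have been cloned from the tracking repo so that a plain 'git pull' there pulls from it.

```shell
# Placeholder locations; adjust to wherever the two repos really live.
TRACK=${TRACK:-$HOME/src/zfs-track}   # pure upstream tracker
BUILD=${BUILD:-$HOME/src/zfs-build}   # cloned from $TRACK, has local changes

# Routine check: pull upstream into the tracker and read what arrived.
review_upstream() (
    cd "$TRACK" &&
    git pull &&
    # ORIG_HEAD is the tip from before the pull. It is only meaningful
    # right after a pull that actually brought in new commits.
    git log ORIG_HEAD..HEAD
)

# Build time: rebase the local changes onto whatever the tracker has now.
update_build() (
    cd "$BUILD" &&
    git pull --rebase
)
```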

Written on 29 July 2015.

Last modified: Wed Jul 29 23:08:58 2015