Some notes to myself on 'git log -G' (and sort of on -S)

December 29, 2022

Today I found myself nerd-sniped by a bit in 'Golang is evil on shitty networks', and wanted to know where a particular behavior was added in Go's network code. The article conveniently identified the code involved, so once I found the source file, all I theoretically needed to do was trace it back in history. Until recently, my normal tool for this was Git's 'blame' view and mode, often on Github because Github has a convenient 'view git-blame just before this commit' option, which makes it easy to step back in history. Unfortunately in this case, the source code had been reorganized and moved around repeatedly, so this wasn't easy.

Instead, I turned to 'git log -G', which I'd recently used to answer a similar question in the ZFS code base. 'git log -G <thing>' and the somewhat similar 'git log -S <thing>' search for '<thing>' in commits in different ways: -G looks for commits whose diffs have added or removed lines matching the regular expression '<thing>', while -S looks for commits that change the number of occurrences of the string '<thing>' in a file. This time around, I used plain 'git log -G <thing>' and then got the full details of likely commits with 'git show <id>'. A generally better option is 'git log -G <thing> -p', which includes the diff for a commit but by default shows you only the file (or files) where the thing shows up in the changes (per gitdiffcore's pickaxe section).
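As a minimal sketch of this workflow, the following builds a throwaway repository and runs the search with and without -p. The file names, commit messages, and the 'setNoDelay' identifier are all made up for illustration; they are not the real names from the Go source.

```shell
#!/bin/sh
# Sketch only: a scratch repository with hypothetical files and identifiers.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Helper so the script does not depend on a configured Git identity.
commit() {
    git -c user.name=Example -c user.email=example@example.invalid \
        commit -q -m "$1"
}

printf 'func dial() {\n}\n' > net.go
printf 'notes\n' > README
git add .
commit "initial"

# This commit touches both files, but only net.go mentions the identifier.
printf 'func dial() {\n\tsetNoDelay(true)\n}\n' > net.go
printf 'notes, updated\n' > README
git add .
commit "turn on setNoDelay"

printf 'more notes\n' >> README
git add .
commit "docs only"

# Commits whose diff has an added or removed line matching the regex.
hits=$(git log --format=%s -G 'setNoDelay')
echo "$hits"

# The same search with -p: the diff is included, limited by default to
# the files whose changes match (README is left out).
patch=$(git log -p -G 'setNoDelay')
echo "$patch"
```

If I want to see everything a matching commit changed, the git-log manual page documents '--pickaxe-all' for that.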

(In this case I had to iterate git-log a few times, because the implementation changed. The answer turns out to be that it's been there since the beginning, which I could have found out by reading the other comments.)

'Git log -G' is not exactly the fastest thing in the world, which isn't surprising since it has to generate diffs for all changesets in order to look at things. It's single-threaded, unsurprisingly, and generally CPU bound in my testing. This implies that if I'm probing for multiple things at once on an SMP machine (which is the usual case), it's to my benefit to run multiple git-logs at once in different windows. On sufficiently large repositories it's probably also disk IO bound, although that will depend a lot on the storage involved. Because of this, it seems that it would be useful to trim down what file paths Git considers, if you have good confidence about where relevant files both are now and were in the past.
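Both ideas can be sketched in a scratch repository: limiting the search to paths after '--', and running two searches at once as background jobs instead of in separate windows. The directory layout and the 'setNoDelay' and 'keepAlive' identifiers are hypothetical, and on a repository this small there is no speed difference to see; this only shows the shape of the commands.

```shell
#!/bin/sh
# Sketch only: hypothetical layout and identifiers in a scratch repository.
set -e
repo=$(mktemp -d)
out=$(mktemp -d)
cd "$repo"
git init -q

commit() {
    git -c user.name=Example -c user.email=example@example.invalid \
        commit -q -m "$1"
}

mkdir -p src/net docs
printf 'setNoDelay(true)\n' > src/net/sock.go
printf 'keepAlive(true)\n' > src/net/keep.go
git add .
commit "net code"

printf 'setNoDelay is mentioned here\n' > docs/notes.txt
git add .
commit "docs"

# Unrestricted: Git has to diff every path in every commit.
unrestricted=$(git log --format=%s -G 'setNoDelay')

# Path-limited and in parallel: each git-log is single-threaded, so two
# background jobs can use two CPUs.
git log --format=%s -G 'setNoDelay' -- src/net > "$out/nodelay.txt" &
git log --format=%s -G 'keepAlive' -- src/net > "$out/keepalive.txt" &
wait

echo "unrestricted:"; echo "$unrestricted"
echo "src/net, setNoDelay:"; cat "$out/nodelay.txt"
echo "src/net, keepAlive:"; cat "$out/keepalive.txt"
```

The path limit only helps if the listed paths cover everywhere the code has lived; a file that used to be outside 'src/net' in this sketch would be silently skipped for that part of its history.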

'Git log -S' is subtly different from 'git log -G' in a way that may make it less useful than you expect, depending on the repository. As covered in the git-log manual page, the -S option specifically includes binary files as well, while -G implicitly excludes them because they don't normally create patch text (binary files can be included with --text if you really want). In this case the identifier name I was looking for also appeared in some binary files of test data, so some of the commits reported by 'git log -S' puzzled me until I realized that they were being included because they added new versions of the test data, which meant that the number of instances of the identifier name had gone up.

(There may be some way to make -S not search binary files, but if so I couldn't find it when looking through the git-log manual page.)
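Here is a small sketch of the difference, assuming a reasonably recent Git where -G skips binary files by default. The file names and the 'setNoDelay' identifier are made up; the NUL byte in the second file is what makes Git treat it as binary.

```shell
#!/bin/sh
# Sketch only: hypothetical files in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

commit() {
    git -c user.name=Example -c user.email=example@example.invalid \
        commit -q -m "$1"
}

printf 'setNoDelay(true)\n' > sock.go
git add .
commit "add code"

# A NUL byte near the start makes Git classify this file as binary.
printf 'setNoDelay\000testdata\n' > capture.bin
git add .
commit "add test data"

# -S counts occurrences in file contents, binary files included.
s_hits=$(git log --format=%s -S 'setNoDelay')

# -G looks at patch text, which binary files do not normally produce.
g_hits=$(git log --format=%s -G 'setNoDelay')

# --text asks Git to treat every file as text, which per the manual page
# brings binary files back into the -G search.
g_text_hits=$(git log --format=%s --text -G 'setNoDelay')

echo "-S:"; echo "$s_hits"
echo "-G:"; echo "$g_hits"
echo "-G --text:"; echo "$g_text_hits"
```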

If I'm hunting for when something was introduced or removed and I'm sure that the repository has no binary files to confuse me, using 'git log -p -S' is probably safe. If there are binary files around to annoy me, I'm probably pragmatically better off using 'git log -p -G'. Using -G also means that I'll spot changes in how something is used, for example making a function call conditional or not conditional (which I believe won't normally show up in -S). Probably my life is better if I standardize on first using 'git log -G' and then switching to -S if I'm getting too many code motion commits.
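The 'making a function call conditional' case can be sketched like this. The file and the 'flush' function are hypothetical; the point is that the second commit changes the line the call is on without changing how many times the name appears.

```shell
#!/bin/sh
# Sketch only: a hypothetical file in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

commit() {
    git -c user.name=Example -c user.email=example@example.invalid \
        commit -q -m "$1"
}

printf 'def run():\n    flush()\n' > job.py
git add .
commit "add flush call"

# Same single call, now behind a condition.
printf 'def run(force):\n    if force:\n        flush()\n' > job.py
git add .
commit "make flush conditional"

# -S only reports commits where the number of occurrences changes (0 to 1 here).
s_hits=$(git log --format=%s -S 'flush')

# -G reports any commit whose diff adds or removes a matching line.
g_hits=$(git log --format=%s -G 'flush')

echo "-S:"; echo "$s_hits"
echo "-G:"; echo "$g_hits"
```

The same property is why -S is quieter about code motion within a file: moving a line shows up in the diff, but leaves the count alone.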


Comments on this page:

From 193.219.181.219 at 2022-12-30 08:17:48:

Until recently, my normal tool for this is Git's 'blame' view and mode, often on Github because Github has a convenient 'view git-blame just before this commit', which makes it easy to step back in history

For this I prefer tig blame [commit] path.c, which has the <,> key to step backwards to the selected line's parent. (Unfortunately doesn't keep a history so you cannot travel forward again, like you could in a web browser.)


Last modified: Thu Dec 29 23:14:11 2022