Yes, git add makes a difference (no matter what people think)

May 28, 2012

One of the things said about git is that it's less user friendly and takes longer to learn than Mercurial; the first exhibit for this difference is usually git add and, by extension, git's index. Unfortunately, a common reaction among git fans to both the general issue and git add in particular is a kind of defensive denial, where they hold forth that it's not that difficult and people learn it fine and really, git is user friendly.

You may already have gotten an idea of my views on this. I'm here to tell you, from a mostly outsider perspective, that git add really does make a difference in initial user friendliness, one that makes Mercurial easier to pick up and use for straightforward work.

(I've used git to a certain extent, for example for my Github stuff, but I am not up to the experienced user level. I'm not really at that level with Mercurial either, partly because I haven't needed to be and partly because I'd rather learn git; Mercurial is easier but I like git more.)

Before people freak out too much, let me be explicit: all of this is about initial user friendliness, the ease of doing straightforward things and picking up the system. In the long run I think that the git index is the right design decision (for a programmer focused VCS) because it creates an explicit model for doing a number of important but tricky things, a model that can be manipulated and inspected and reasoned about, and once you learn git and use it regularly dealing with the index becomes second nature. But people generally do not defend the index in these terms; instead, they try to maintain with a straight face that it's no real problem for people even at the start.

(If you think that the index does not cause problems for git beginners, I would gently suggest that you trawl through some places where they ask questions.)

The usability problem with git add is not just the need for git add itself as an extra step; it is that the existence of the index has additional consequences that ripple through to using other bits of git. For example, let us take the case of the disappearing diff:

; git diff a
[...]
-hi there
+hi there, jim
; git add a
; git diff
;

If you already know git you know what's going on here (and you're going to reach for 'git diff --cached'). If you're learning git, well, your change just disappeared. Of course this happens the other way around too; 'git diff' shows you nice diffs, then you do 'git commit' and it tells you there's nothing to commit. Wait, what? The diffs are right there.
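For anyone who wants to poke at this directly, here is a minimal sketch that replays the disappearing diff in a throwaway repository. The temporary directory, the user identity and the commit message are placeholders of mine; only the file name 'a' and its contents come from the transcript above.

```shell
#!/bin/sh
# Sketch: replay the "disappearing diff" in a scratch repository.
set -e

repo=$(mktemp -d)            # scratch directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

echo "hi there" > a
git add a
git commit -q -m "initial version of a"

echo "hi there, jim" > a

# Plain 'git diff' compares the working tree against the index.
unstaged_before=$(git diff -- a)

git add a

# After 'git add' the working tree and the index agree, so this is empty.
unstaged_after=$(git diff -- a)

# The change is now in the index; --cached compares the index against HEAD.
staged_after=$(git diff --cached -- a)

describe() {
    if [ -n "$1" ]; then echo "shows the change"; else echo "empty"; fi
}

echo "git diff, before git add:         $(describe "$unstaged_before")"
echo "git diff, after git add:          $(describe "$unstaged_after")"
echo "git diff --cached, after git add: $(describe "$staged_after")"
```

The change never went anywhere; it moved from one comparison (working tree against index) to another (index against HEAD), which is exactly the thing a beginner has no reason to know about yet.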

(There are worse bear traps in the woods for beginners, too, like doing a 'git add' and then further editing the file. Here 'git diff' will show you a diff, but it is not what will be committed.)
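The add-then-edit trap can be sketched the same way. Again the scratch repository, identity, commit messages and the extra "and bob" edit are my own illustrative assumptions, not anything from a real project.

```shell
#!/bin/sh
# Sketch: edit a file again after 'git add' and see what a plain commit records.
set -e

repo=$(mktemp -d)            # scratch directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

echo "hi there" > a
git add a
git commit -q -m "initial version of a"

echo "hi there, jim" > a
git add a                          # stage the first edit
echo "hi there, jim and bob" > a   # keep editing after the add

# Working tree vs index: only the second edit shows up here.
worktree_vs_index=$(git diff -- a)

# Index vs HEAD: the first edit, which is what a plain commit will record.
index_vs_head=$(git diff --cached -- a)

git commit -q -m "second version of a"

committed=$(git show HEAD:a)
on_disk=$(cat a)

echo "committed: $committed"
echo "on disk:   $on_disk"
```

What 'git diff' showed just before the commit was the step from "jim" to "jim and bob", which is precisely the part that did not get committed.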

All of this is a cognitive burden. When you use git, you have to learn and remember the existence of the index and how this affects what you do, and you probably need to take extra steps or pay extra attention to what 'git commit' and so on tell you. This cognitive burden is real, although it can be (and will be) overcome with familiarity, and what it enables has important benefits. It is a mistake and a lie to try to pretend otherwise. Honesty in git advocacy is to say straightforwardly that the index is worth it in the end (possibly unless you have simple work patterns).

(A system where the index or its equivalent is an advanced feature, one not exposed by default, really does have a simpler initial workflow. If it's designed competently (and Mercurial is), everything 'just works' the way you expect; hg commit commits what hg diff shows you and so on. In real life this makes a difference to people's initial acceptance of a new VCS, especially if the simple workflow is adequate for almost everything you'll ever do with the system. This is not true of the sort of advanced VCS use that programmers can practice routinely, but it can be true of other VCS uses.)

Sidebar: the problem with 'git commit -a'

At this point some people may come out of the woodwork to tell me about git commit -a, or even about creating an alias like 'git checkin' that always forces -a. There are two pragmatic problems with this.

First, the index still exists even if you're trying to pretend otherwise. This means that you can accidentally use the index; you can run git add because something said to, or you can run straight git commit, and so on. All of these will create confusion and cause git to do what (to you) looks like the wrong thing.

(In fact you have to run git add every so often, to add new files.)
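As a small sketch of why git add does not go away: 'git commit -a' picks up changes to files git already tracks, but leaves brand new files alone. The file names 'tracked' and 'untracked', the identity and the commit messages here are made up for illustration.

```shell
#!/bin/sh
# Sketch: 'git commit -a' commits modified tracked files but not new files.
set -e

repo=$(mktemp -d)            # scratch directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

echo "one" > tracked
git add tracked
git commit -q -m "initial commit"

echo "two" >> tracked        # modify a file git already knows about
echo "new" > untracked       # create a file git has never been told about

git commit -q -a -m "commit everything, or so we hope"

# Top-level files recorded in the new commit.
files_in_head=$(git ls-tree --name-only HEAD)

# Files in the working tree that git is still not tracking.
still_untracked=$(git ls-files --others --exclude-standard)

echo "in HEAD:         $files_in_head"
echo "still untracked: $still_untracked"
```

So even someone living entirely on 'git commit -a' has to meet git add (and with it the index) the first time they create a file.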

Second, it is not at all obvious from simply reading documentation that using git commit -a is a fully reliable way of transmuting git into Mercurial. Maybe it is, maybe it isn't, but as a beginner you don't know (not without doing more research than I myself have done). Because many git operations are fundamentally built around the existence of the index, the safest assumption to make is that the index really does matter and git commit -a is probably an incomplete workaround.

(For example, at the point where you do git add to add a new file you'll become familiar with git diff HEAD in order to get the true diffs for what will be committed when you run git commit -a, which I hope illustrates my point adequately. And maybe there's a better command for doing that, which also illustrates my point because git diff HEAD is what I came up with as a relative git novice.)
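To make that concrete, here is a sketch of the situation with one modified tracked file and one freshly added file, comparing the three flavours of diff against what 'git commit -a' then records. The file names 'old' and 'fresh' and everything else about the scratch repository are my assumptions, and this only checks which files are involved, not that 'git diff HEAD' is the best tool for the job.

```shell
#!/bin/sh
# Sketch: which diff matches what 'git commit -a' will commit?
set -e

repo=$(mktemp -d)            # scratch directory (placeholder location)
cd "$repo"
git init -q
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.com"

echo "one" > old
git add old
git commit -q -m "initial commit"

echo "two" >> old            # modified, tracked, not staged
echo "new" > fresh
git add fresh                # new file, staged

plain=$(git diff --name-only)              # working tree vs index
cached=$(git diff --cached --name-only)    # index vs HEAD
against_head=$(git diff --name-only HEAD)  # working tree vs HEAD

git commit -q -a -m "commit with -a"

# Files that the commit actually changed.
committed=$(git diff --name-only HEAD~1 HEAD)

echo "git diff:          $plain"
echo "git diff --cached: $cached"
echo "git diff HEAD:"
echo "$against_head"
echo "committed by -a:"
echo "$committed"
```

In this sketch neither plain 'git diff' nor 'git diff --cached' lists both files on its own; only the comparison against HEAD lines up with what the -a commit contains.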

Written on 28 May 2012.

Last modified: Mon May 28 23:34:04 2012