Wandering Thoughts archives

2017-08-31

With git, it's useful to pick the right approach to your problem

One of the things about Git is that once you go past basic committing, it generally has any number of ways for you to do what you want done. As with programming languages, part of getting better at Git is learning to pick the right idiom to attack your problem with. I can't claim that I'm good at this, but I am getting more experience, and recently I had an interesting experience here.

I use Byron Rakitzis' version of rc as my shell, which these days can be found on GitHub. Well, as has come up before, I don't actually use (just) that official version. I have my own change (to add a built-in read command) and because it was there, I also use some completion improvements, which come from Bert Münnich's collection of interesting rc modifications.

(I originally tried out Bert Münnich's version to get an important improvement before it became an official change. This change also shows the awesome power of raising an issue even if you expect that it's hopeless, as well as exploring GitHub forks of a project you're interested in.)

Recently some improvements have landed in the main repo that Bert Münnich has not yet rebased his modifications on top of. The other day I decided that I should update my own version to pick up these changes, because it turned out that I wanted to rebuild it on Fedora 26 anyway (that's its own story). The obvious way to do this was a straight rebase on top of the main repo, so that's what I did first.

The end result worked, but getting there took a bunch of effort. Bert Münnich's modifications include things like changing the build system from GNU Autoconf to a simple Makefile-based one, so there were a bunch of clashes and situations that I had to resolve by hand. I wasn't entirely confident of my own changes, since I was modifying Münnich's modifications without fully understanding either them or the upstream changes they clashed with. It felt like I was both working too hard and creating a fragile edifice for myself, so at the end I took a step back and asked what I really wanted and if there was a simpler, better way to get it.

When I did this I realized that what I really wanted was the upstream with my addition plus only a few of Bert Münnich's modifications (I've become addicted to command completion). While I could have created this with more rebasing, there was a much simpler approach (partly enabled by a better understanding of Git remotes and so on):

  1. Create a new clone of the main repo.
  2. Add Münnich's repo and my previous everything-together local repo as additional remotes.
  3. git fetch from both new remotes in order to make all of their commits available locally.
  4. git cherry-pick my addition by its commit hash.
  5. git cherry-pick the modifications I wanted from Münnich's repo, which only amounted to a few of them. Again I did this commit by commit using the commit hash, rather than trying to do anything more sophisticated. One or two cherry-picks required minor adjustment; since I'd already had to deal with them during the rebase, it was easy to fix things up.
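
(A minimal sketch of these steps, not the actual commands or repos involved. To make it runnable on its own, it builds three throwaway repositories in a scratch directory as stand-ins for the main repo, the fork, and the old everything-together repo. All of the names here, including the 'fork' and 'old' remotes, the file names, and the commit messages, are invented for illustration, and it looks up commit hashes by message only because a demo has no fixed hashes to paste in.)

```shell
#!/bin/sh
# Sketch only: every repo, file, and commit message below is a made-up stand-in.
set -eu

work=$(mktemp -d)    # scratch directory, left in place for inspection
cd "$work"

# Throwaway identity so commits succeed in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the main repo.
git init -q upstream
( cd upstream
  echo base > core.txt
  git add core.txt
  git commit -q -m "upstream: base" )

# Stand-in for a fork that carries several modifications.
git clone -q upstream fork
( cd fork
  echo completion > completion.txt
  git add completion.txt
  git commit -q -m "fork: completion"
  echo build > Makefile
  git add Makefile
  git commit -q -m "fork: new build system" )

# Stand-in for the old everything-together repo with one local addition.
git clone -q fork old
( cd old
  echo read > read.txt
  git add read.txt
  git commit -q -m "mine: read builtin" )

# The main repo moves on after the fork was made.
( cd upstream
  echo improvement > new.txt
  git add new.txt
  git commit -q -m "upstream: improvement" )

# Step 1: a new clone of the main repo.
git clone -q upstream fresh
cd fresh

# Steps 2 and 3: add the other two repos as remotes and fetch their commits.
git remote add fork "$work/fork"
git remote add old "$work/old"
git fetch -q fork
git fetch -q old

# Step 4: cherry-pick the local addition by its commit hash.
mine=$(git log --all --format=%H --grep='mine: read builtin')
git cherry-pick "$mine" >/dev/null

# Step 5: cherry-pick only the wanted modification from the fork.
wanted=$(git log --all --format=%H --grep='fork: completion')
git cherry-pick "$wanted" >/dev/null

# Show the resulting history, newest first.
git log --format=%s
```

(In this sketch the fork's build system commit is simply never picked, so the new clone ends up with the upstream history plus two cherry-picked commits and no Makefile. With real repos you would normally just copy the hashes out of git log, and some picks may stop for conflict resolution.)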

While having done the rebase helped me deal with the conflicts during cherry-picking, the cherry-picking still felt much easier. I could have arrived at the same place with an interactive rebase (which would have let me drop modifications I'd decided I either didn't want or didn't care about), but I think it would have felt more indirect and chancy. Cherry-picking more directly expressed my intentions; I wanted my change and then this, this, and this from another tree. Done.

(In both cases, the Git repo I wind up with probably can't be used for further rebases against Münnich's repo, just for rebases against the main repo.)

Stepping back, thinking about what I wanted, and then finding the right mechanism to express this in Git worked out very well. When I switched from rebasing to cherry-picking, I went from feeling that I was fighting Git to get what I wanted to feeling that I was directly and naturally expressing myself and Git was doing just what I wanted. Of course the real trick here is having the Git knowledge and experience to realize what the good way is; had I not had some experience, I might not have been familiar enough with cherry-picking to reach for it here. And there are undoubtedly Git manipulations that I don't even know exist, so I'll never pick them as the right option.

(As a side note, this isn't really the copy versus move situation that I ran into before. Instead it's much more that I'm gluing together a completely new branch that happens to be made with bits and pieces from some other branches (and after I'm done the other branches aren't of direct interest to me).)

programming/GitPickingRightApproach written at 22:25:52

People probably aren't going to tell you when your anti-spam systems are working

As part of yesterday's entry about how we're now using the spam scores generated by the university's central email system, I was going to say that our users seem happy with the results. However, I have to admit that this is not quite true. We don't explicitly know that they're happy; instead, what we know is that they've stopped reporting that they're getting too much spam and asking whether we can do something about it.

What I've come to expect is pretty straightforward, namely that users generally aren't going to give us feedback when our anti-spam systems are working well. And why should they? Really, a spam-free email system where all the email they want gets delivered with no false positives is just how things should be, and you don't generally tell people 'good job, the systems are working just how they're supposed to work'. Naturally, people are generally only going to tell you when something goes wrong, either what they think of as an excess of spam or when email they're expecting doesn't get through or gets bounced.

On the one hand, this can be a bit frustrating when (or if) we want to know if some theoretically clever trick we've added to the mail system is making people's email more pleasant. On the other hand, this means that no news is good news; if we're not getting complaints about spam or missing email, we're most likely (still) doing things right. If we change something and nobody says anything, at a minimum our change did no harm.

As a side note, it's probably not reliable to count on users to start complaining if (or when) the amount of spam they see goes up. By now, some number of users have been trained to expect a certain amount of spam in their inboxes, and they won't start complaining out loud until the spam getting through really gets excessive.

(They may click on 'this is spam' or 'mark as spam' buttons if those buttons are conveniently available to them in their mail environment, especially if the buttons appear to do something. If you have such buttons, monitoring how often users click them can likely give you an early warning indicator of increased spam getting through your filters. Or at least email that your users don't want.)

spam/UsersAndAntiSpamFeedback written at 00:52:00

