Wandering Thoughts archives

2023-01-28

I should assume contexts aren't retained in Go APIs

Over on the Fediverse, I said something about some Go APIs I'd run into:

Dear everyone making Go APIs that take a context argument and return something you use to make further API calls (as methods): please, I beg you, document the scope of the context. Does it apply to only the initial setup API call? Does it apply to all further API calls through the object your setup function returned? I don't want to have to read your code to find this out. (Or to have it change.)

I was kind of wrong here. While I still feel that your documentation might want to say something about this, I've come around to realizing that I should assume the default is that contexts are not retained. In fact this is what the context package and the Go blog's article on contexts and structs say you should do.

I'll start with the potential confusion as a consumer of an API. Suppose that the API looks like this:

handle := pkg.ConnectWithContext(ctx, ...)
r, e := handle.Operation(...)
[...]
r, e := handle.AQuery(...)

Your (my) question is how much the context passed to ConnectWithContext() covers. It could cover only the work to set up the initial connection, or it could cover everything done with the handle in the future. The first allows fine-grained control, while the second lets people easily configure a large-scale timeout or cancellation. What the Go documentation and blog post tell you to do is the first option. If people using the API want a global timeout on all of their API operations, they should set up a context for that and pass it to each operation done through the handle, which means every handle method should take a context argument.

Because you can build the 'one context to cover everything' usage out of the 'operations don't retain contexts' API but not the other way around, the latter is more flexible (as well as being what Go recommends). So this should be my default assumption when I run into an API that uses contexts, especially if every operation on the handle also takes a context.
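To make the recommended shape concrete, here's a minimal sketch with entirely hypothetical names (pkg, Connect(), Operation() aren't any real API); the point is that the handle stores no context and every call takes its own:

package pkg

import (
	"context"
	"net"
	"time"
)

// A Handle deliberately stores connection state, not a context.
type Handle struct {
	conn net.Conn
}

// Connect uses ctx only for the initial dial; it is not retained.
func Connect(ctx context.Context, addr string) (*Handle, error) {
	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", addr)
	if err != nil {
		return nil, err
	}
	return &Handle{conn: conn}, nil
}

// Operation honors only the context it is given for this one call.
func (h *Handle) Operation(ctx context.Context, query string) error {
	if deadline, ok := ctx.Deadline(); ok {
		h.conn.SetWriteDeadline(deadline)
		defer h.conn.SetWriteDeadline(time.Time{})
	}
	_, err := h.conn.Write([]byte(query + "\n"))
	return err
}

The 'one context to cover everything' usage then falls out naturally on the caller's side:

ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
defer cancel()

handle, err := pkg.Connect(ctx, "somehost:1234")
[...]
err = handle.Operation(ctx, "a query")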

As far as documentation goes, maybe it would be nice if the documentation mentioned this in passing (even with 'contexts are used in the standard Go way, so they only cover each operation' as part of the general package documentation), or maybe this is now simply what people working regularly in Go (and reading Go APIs) assume. For what it's worth, the net package's DialContext() does mention that the context isn't retained, but then it was added very early in the life of contexts, before they were as well established and as well known as they are now.
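(Concretely, that's net.Dialer.DialContext, whose signature is:

func (d *Dialer) DialContext(ctx context.Context, network, address string) (Conn, error)

and whose documentation says that once a connection is successfully established, expiration of the context will not affect it.)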

I feel the ice is thinner for documentation if the methods on the handle don't take a context (and aren't guaranteed to be instant and always succeed). Then people might at least wonder if the handle retains the original context used to establish it, because otherwise you have no way to cancel or time out those operations. But I suspect such APIs are uncommon unless the API predates contexts and is strongly limited by backward compatibility.

(These days Go modules allow you to readily escape that backward compatibility if you want to; you can just bump your major version to v2 and add context arguments to all the handle methods.)

(Now that I've written this down for myself, hopefully I'll remember it in the future when I'm reading Go APIs.)

GoContextsAssumeNotRetained written at 22:27:31

2023-01-16

Backporting changes is clearly hard, which is a good reason to avoid it

Recently, the Linux 6.0 kernel series introduced a significant bug in 6.0.16. The bug was introduced when a later kernel change was backported to 6.0.16 with an accidental omission (cf). There are a number of things you can draw from this issue, but the big thing I take away from it is that backporting changes is hard. The corollary of this is that the more changes you ask people to backport (and to more targets), the more likely you are to wind up with bugs, simply through the law of large numbers. The corollary to the corollary is that if you want to keep bugs down, you want to limit the amount of backporting you do or ask for.

(The further corollary is that the more old versions you ask people to support (and the more support you want for those old versions), the more backports you're asking them to do and the more bugs you're going to get.)

I can come up with various theories for why backporting changes is harder than making changes in the first place. For example, when you backport a change you generally need to understand the context of more code; in addition to understanding the current code before the change and the change itself, you now need to understand the old code that you're backporting to. Current tools may not make it particularly easy to verify that you've gotten all of the parts of a change and have not, as seems to have happened here, dropped a line. And if there's been any significant code reorganization, you may be rewriting the change from scratch instead of porting it, working from the intention of the change (if you fully understand it).

(Here, there is still an inet_csk_get_port() function in 6.0.16, but it doesn't quite look like the version the change was made to, so the textual patch doesn't apply. See the code in 6.0.19 and compare it to the diff in the 6.1 patch or the original mainline commit.)

Some people will say that backports should be done with more care, or that there should be more tests, or some other magical fix. But the practical reality is that they won't be. What we see today is what we're going to continue getting in the future, and that's some amount of errors in backported changes, with the absolute number of errors rising as the number of changes rises. We can't wish this away with theoretical process improvements or by telling people to try harder.

(I don't know if there are more errors in backported changes than there are in changes in general. But generally speaking the changes that are being backported are supposed to be the ones that don't introduce errors, so we're theoretically starting from a baseline of 'no errors before we mangle something in the backport'.)

PS: While I don't particularly like its practical effects, this may make me a bit more sympathetic toward OpenBSD's support policy. OpenBSD has certainly set things up so they make minimal changes to old versions and thus have minimal need to backport changes.

BackportsAreHard written at 22:47:53

2023-01-10

My Git settings for carrying local changes on top of upstream development

For years now I've been handling my local changes to upstream projects by committing them and rebasing on (Git) pulls, and it's been a positive experience. However, over the years the exact Git configuration settings I wanted to make this work smoothly have changed (due to things such as Git 2.34's change in fast-forward pull settings), and I've never written down all of the settings in one place. Since I recently switched to Git for a big repository where I carry local changes, this is a good time to write them down for my future reference.

My basic settings as of Fedora's Git 2.39.0 are:

git config pull.rebase true
git config pull.ff true

The second is there to deal with the Git 2.34 changes, since I have a global pull.ff setting of 'only'.
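(For reference, that global setting is:

git config --global pull.ff only

My understanding of the Git 2.34 change is that a global 'pull.ff only' now makes 'git pull' refuse anything that isn't a fast-forward even when it would rebase, so the per-repository 'pull.ff true' is what lets rebasing pulls go ahead.)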

If you're working with a big repository where you don't really care which files changed when you pull an upstream update (and that output is far too verbose), the normal rebase behavior gives you this, because a rebasing pull doesn't print a diffstat by default. You may also want the following:

git config merge.stat off

This is mostly a safety measure, because normally you aren't merging, you're rebasing. If you still want to see 'git pull' style diffstat output, set 'git config rebase.stat true' (as I started doing years ago).

If you have a repository with a lot of active branches, you may also want:

git config fetch.output compact

This mostly helps if branch names are very long, long enough to overflow a terminal line; otherwise you get the same number of lines printed, they're just shorter. Unfortunately, you can't easily make tracking an upstream repository significantly quieter.

If your upstream goes through a churn of branches, you (I) will want to prune now-deleted remote branches when you fetch updates. The condensed version is:

git config remote.origin.prune true

With more Git work, it's possible to pull only the main branch. If I'm carrying local changes on top of the main branch and other branches are actively doing things, this may be what I want. It's relatively unlikely that I'll switch to another branch, since it won't have my changes.
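The minimal version of this, assuming the upstream actually calls its primary branch 'main', is to narrow the remote's fetch refspec to just that branch:

git config remote.origin.fetch '+refs/heads/main:refs/remotes/origin/main'

After that, 'git fetch' and 'git pull' will quietly ignore all of the upstream's other branches.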

(This may turn out to be what I want with the Mozilla gecko-dev repository, but it does have Firefox Beta and Release as well, and someday I may want convenient Git access to them. I'm not sure how to dig out specific Firefox releases, though.)

GitRebaseLocalChangesSetup written at 23:04:41

