Wandering Thoughts archives

2024-05-25

Reasons to not expose Go's choice of default TLS ciphers

When I wrote about the long-overdue problem people are going to have with go:linkname in Go 1.23, the specific case that caused me to notice this was something trying to access crypto/tls's 'defaultCipherSuitesTLS13' variable. As its name suggests, this variable holds the default cipher suites used by Go for TLS 1.3. One reaction to this specific problem is to ask why Go doesn't expose this information as part of crypto/tls's API.

One reason why not is contained in the documentation for crypto/tls.CipherSuites():

[...] Note that the default cipher suites selected by this package might depend on logic that can't be captured by a static list, and might not match those returned by this function.

In fact the TLS 1.3 cipher suites that Go uses may not match the ones in defaultCipherSuitesTLS13, because there is actually a second set of them, defaultCipherSuitesTLS13NoAES. As its name suggests, this set of cipher suites applies when the current machine doesn't have hardware support for AES GCM, or at least hardware support that Go recognizes. Well, even that is too simple a description; if Go is being used as a TLS server, whether the 'no AES GCM' version is used also depends on whether the client connecting to the Go server appears to prefer AES GCM (likely signaling that the client has hardware support for it).
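To make the distinction concrete, here is a minimal sketch of what the public API does offer. It filters the static list from crypto/tls.CipherSuites() down to the suites marked as usable with TLS 1.3. The helper name tls13SuiteNames is made up for illustration, and the result is only the set of known suites, not the order Go will prefer on a given machine or connection.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// tls13SuiteNames returns the names of the cipher suites that crypto/tls
// reports as usable with TLS 1.3. This is the static list exposed by
// CipherSuites(); it says nothing about Go's runtime preference order.
func tls13SuiteNames() []string {
	var names []string
	for _, cs := range tls.CipherSuites() {
		for _, v := range cs.SupportedVersions {
			if v == tls.VersionTLS13 {
				names = append(names, cs.Name)
				break
			}
		}
	}
	return names
}

func main() {
	for _, name := range tls13SuiteNames() {
		fmt.Println(name)
	}
}
```

Nothing in this output tells you whether the AES GCM suites or the ChaCha20 suite will be tried first, which is exactly the information the unexported variables hold.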

Today, Go can't expose a useful API for 'the default TLS 1.3 cipher suites' because there is no such straightforward thing; the actual default cipher suites used depend on multiple factors, some of which can't be used by even a top level function like CipherSuites(). If Go had exported such a variable or API in the past, Go's general attitude on backward compatibility might have forced it to freeze the logic of TLS 1.3 cipher suite choice so that it did respect this default list no matter what, much like the random number generation algorithm became frozen because people depended on it.

The Go 1 compatibility promise is a powerful enabler for Go. But because it is so strong and the Go developers interpret it broadly, it means that Go has to be really careful and certain about what APIs it exposes. As we've seen with math/rand, early decisions to expose certain things, even implicitly, can later constrain Go's ability to make important changes.
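As a rough illustration of the kind of implicit behavior people came to depend on in math/rand, the sketch below seeds two generators with the same value and gets the same sequence from both. The helper name firstN and the choice of Intn(1000) are arbitrary; the point is only that code written this way bakes in an expectation of reproducible output.

```go
package main

import (
	"fmt"
	"math/rand"
)

// firstN returns the first n values drawn from a generator that was
// explicitly seeded with seed.
func firstN(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(1000)
	}
	return out
}

func main() {
	fmt.Println(firstN(1, 5))
	fmt.Println(firstN(1, 5)) // same seed, so the same sequence
}
```

Once programs and tests rely on a particular seed producing a particular sequence, changing the generator breaks them even though no documented API changed.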

PS: Another reason to not expose this information from crypto/tls is that Go has explicitly decided to not make TLS 1.3 cipher suites something that you can control. As covered in the documentation for crypto/tls.Config, the 'CipherSuites' struct element is ignored for TLS 1.3; TLS 1.3 cipher suites are not configurable.
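A small local experiment can show this behavior. The sketch below assumes a loopback test server from net/http/httptest is acceptable; both sides set Config.CipherSuites to a single TLS 1.2 suite but require TLS 1.3, and the handshake should still succeed with a TLS 1.3 suite. The function name negotiate is hypothetical.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// negotiate connects to a local test server where both ends list only a
// TLS 1.2 cipher suite in Config.CipherSuites but insist on TLS 1.3.
// It returns the TLS version and cipher suite that were actually used.
func negotiate() (version, suite uint16, err error) {
	only12 := []uint16{tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256}

	srv := httptest.NewUnstartedServer(http.NotFoundHandler())
	srv.TLS = &tls.Config{MinVersion: tls.VersionTLS13, CipherSuites: only12}
	srv.StartTLS() // supplies httptest's built-in test certificate
	defer srv.Close()

	conn, err := tls.Dial("tcp", srv.Listener.Addr().String(), &tls.Config{
		InsecureSkipVerify: true, // test certificate only; not for real use
		MinVersion:         tls.VersionTLS13,
		CipherSuites:       only12,
	})
	if err != nil {
		return 0, 0, err
	}
	defer conn.Close()

	state := conn.ConnectionState()
	return state.Version, state.CipherSuite, nil
}

func main() {
	version, suite, err := negotiate()
	if err != nil {
		fmt.Println("handshake failed:", err)
		return
	}
	fmt.Printf("version %#04x, suite %s\n", version, tls.CipherSuiteName(suite))
}
```

Which TLS 1.3 suite gets picked can differ between machines, for the hardware-dependent reasons described above.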

GoWhyNotExposeDefaultCiphers written at 22:13:09;

2024-05-24

The long-overdue problem coming for some people in Go 1.23

Right now, if you try to build anything using the currently released version of github.com/quic-go/quic-go with the current development version of Go (such as this DNS query program), you will probably encounter the following error:

link: github.com/quic-go/quic-go/internal/qtls: invalid reference to crypto/tls.defaultCipherSuitesTLS13

Experienced Go developers may now be scratching their heads about how quic-go/internal/qtls is referring to crypto/tls.defaultCipherSuitesTLS13, since the latter isn't an exported identifier (in Go, all exported identifiers start with a capital letter). The simple answer is that the qtls package is cheating (in cipher_suite.go).

The official Go compiler has a number of special compiler directives. A few of them are widely known and used, for example '//go:noinline' is common in some benchmarking to stop the compiler from optimizing your test functions too much. One of the less well known ones is '//go:linkname', and for this I'll just quote from its documentation:

//go:linkname localname [importpath.name]

[...] This directive determines the object-file symbol used for a Go var or func declaration, allowing two Go symbols to alias the same object-file symbol, thereby enabling one package to access a symbol in another package even when this would violate the usual encapsulation of unexported declarations, or even type safety. For that reason, it is only enabled in files that have imported "unsafe".

Let me translate that: go:linkname allows you to access unexported variables and functions of other packages. In particular, it allows you to access unexported variables (and functions) from the Go standard library, such as crypto/tls.defaultCipherSuitesTLS13.
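As a minimal sketch of what such cheating looks like, the program below declares a body-less local function and uses go:linkname to alias it to the unexported runtime.nanotime. This deliberately doesn't use the crypto/tls variable. Whether it builds depends on your Go version and on the runtime continuing to permit linkname access to this particular symbol, which is an assumption here and not something to rely on.

```go
package main

import (
	"fmt"
	_ "unsafe" // go:linkname is only enabled in files that import "unsafe"
)

// nanotime is a local declaration with no body; the directive tells the
// toolchain to resolve it to the unexported runtime.nanotime.
// Assumption: the runtime still allows this symbol to be linknamed.
//
//go:linkname nanotime runtime.nanotime
func nanotime() int64

func main() {
	start := nanotime()
	elapsed := nanotime() - start
	fmt.Println("elapsed nanoseconds:", elapsed)
}
```

Nothing about this is covered by any compatibility guarantee; the runtime is free to rename or remove the function, at which point the build fails at link time.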

The Go standard library uses go:linkname internally to access various unexported things from other packages and from the core runtime, which is perfectly fair; the entire standard library is developed by the same people, and they have to be very careful and conservative with the public API. However, go:linkname has also been used by a wide assortment of third party packages to access unexported pieces of the standard library that those packages found convenient or useful (such as Go's default cipher suites for TLS 1.3). Accessing unexported things from the Go standard library isn't covered by the Go 1 compatibility guarantee, for obvious reasons, but in practice the Go developers find themselves not wanting to break too much of the Go package ecosystem even if said ecosystem is doing unsupported things.

Last week, the Go developers noticed this (I believe not for the first time) and Russ Cox filed issue #67401: cmd/link: lock down future uses of linkname, where you can find a thorough discussion of the issue. The end result is that the current development version of Go, which will become Go 1.23, is now much more restrictive about go:linkname, requiring that the target symbol opt in to this usage. Starting from Go 1.23, you will not be able to 'go:linkname' to things in the standard library that have not specifically allowed this (and the rules are probably going to get stricter in future Go versions; in a few versions I wouldn't be surprised if you couldn't go:linkname into the standard library at all from outside packages).

So this is what is happening with github.com/quic-go/quic-go. It is internally using a go:linkname to get access to crypto/tls's defaultCipherSuitesTLS13, but in Go 1.23, defaultCipherSuitesTLS13 is not one of the symbols that has opted in to this use, so the build is now failing. The quic-go package is probably far from the only package that is going to get caught out by this, now and in the future.

(The Go developers have been adding specific opt-ins for sufficiently used internal identifiers, in files generally called 'badlinkname.go' in the packages. You can see the current state for crypto/tls in its badlinkname.go file.)

Go123LinknameComingProblem written at 22:06:10;

2024-05-21

Go's old $GOPATH story for development and dependencies

As people generally tell the story today, Go was originally developed without support for dependency management. Various community efforts evolved over time and then were swept away in 2019 by Go Modules, which finally added core support for dependency management. I happen to feel that this story is a little bit incomplete and sells the original Go developers short, because I think they did originally have a story for how Go development and dependency management was supposed to work. To me, one of the fascinating bits in Go's evolution to modules is how that original story didn't work out. Today I'm going to outline how I see that original story.

In Go 1.0, the idea was that you would have one or more of what are today called multi-module workspaces. Each workspace contained one (or several) of your projects and all of its dependencies, in the form of cloned and checked-out repositories. With separate repositories, each workspace could have different (and independent) versions of the same packages if you needed that, and updating the version of one dependency in one workspace wouldn't update any other workspace. Your current workspace would be chosen by setting and changing $GOPATH, and the workspace would contain not just the source code but also precompiled build artifacts, built binaries, and so on, all hermetically confined under its $GOPATH.

This story of multiple $GOPATH workspaces allows each separate package or package set of yours to be wrapped up in a directory hierarchy that effectively has all of its dependencies 'vendored' into it. If you want to preserve this for posterity or give someone else a copy of it, you can archive or send the whole directory tree, or at least the src/ portion of it. The whole thing is fairly similar to a materialized Python virtual environment.
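To make the layout concrete, here is a small sketch of how paths inside one such workspace are laid out, using the conventional src/, pkg/, and bin/ directories under $GOPATH. The type and function names, the workspace path, and the import path are all made up for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// workspace holds the three top-level directories of a classic $GOPATH
// workspace: checked-out source, compiled packages, and built binaries.
type workspace struct{ Src, Pkg, Bin string }

// layout returns the directory layout for the workspace rooted at gopath.
func layout(gopath string) workspace {
	return workspace{
		Src: filepath.Join(gopath, "src"),
		Pkg: filepath.Join(gopath, "pkg"),
		Bin: filepath.Join(gopath, "bin"),
	}
}

// sourceDir returns where the checked-out repository for importPath
// lives inside the workspace rooted at gopath.
func sourceDir(gopath, importPath string) string {
	return filepath.Join(gopath, "src", filepath.FromSlash(importPath))
}

func main() {
	// Hypothetical workspace and dependency, for illustration only.
	ws := layout("/home/me/work/projA")
	fmt.Println(ws.Src, ws.Pkg, ws.Bin)
	fmt.Println(sourceDir("/home/me/work/projA", "example.org/some/dep"))
}
```

Two workspaces with different roots would each have their own copy of example.org/some/dep under src/, which is what lets them sit at different versions.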

(The original version of Go did not default $GOPATH to $HOME/go, per for example the Go 1.1 release notes. It would take until Go 1.8 for this default to be added.)

This story broadly assumes that updates to dependencies will normally be compatible, because otherwise you really want to track the working dependency versions even in a workspace. While you can try to update a dependency and then roll it back (since you normally have its checked out repository with full history), Go won't help you by remembering the identity of the old, working version. It's up to you to dig this out with tools like the git reflog or your own memory that you were at version 'x.y.z' of the package before you updated it. And 'go get -u' to update all your dependencies at once only makes sense if their new versions will normally all work.

This story also leaves copying workspaces to give them to someone else (or to preserve them in their current state) as a problem for you, not Go. However, Go did add 'experimental' support for vendoring dependencies in Go 1.5, which allowed people to create self-contained objects that could be used with 'go get' or other simple repository copying and cloning. A package that had its dependencies fully vendored was effectively a miniature workspace, but this approach had some drawbacks of its own.

I feel this original story, while limited, is broadly not unreasonable. It could have worked, at least in theory, in a world where preserving API compatibility (in a broad sense) is much more common than it evidently is in this one.

GoTheGopathDevelopmentStory written at 23:37:24;

2024-05-19

My GNU Emacs MH mail folder completion in MH-E

When I wrote about understanding the orderless package, I mentioned that orderless doesn't work well with hierarchical completions such as file names, which are completed one component at a time. I also said this mattered to me because MH-E completed the names of mail folders in this part by part manner, but I didn't feel like rewriting MH-E's folder completion system to fix it. Well, you can probably guess what happened next.

In the GNU Emacs way, I didn't so much rewrite MH-E's mail folder completion as add a second folder completion system alongside it, and then rebound some keys to use my system. Writing my system was possible because it turned out MH-E had already done most of the work for me, by being able to collect a complete list of all folder names (which it used to support its use of the GNU Emacs Speedbar).

To put the summary up front, I was pleasantly surprised by how easy it was to add my own completion stuff and make use of it within my MH-E environment. At the same time, reverse engineering some of MH-E's internal data structures was a bit annoying and it definitely feels like a bit of a hack (although one that's unlikely to bite me; MH-E is not exactly undergoing rapid and dramatic evolution these days, so those data structures are unlikely to change).

There are many sophisticated ways to do minibuffer completion in GNU Emacs, but if your purpose is to work well with orderless, the simplest approach is to generate a list of all of your completion candidates up front and then provide this list to completing-read. This results in code that looks like this:

(defvar cks/mh-folder-history '() "History of MH folder targets.")
(defun cks/mh-get-folder (msg)
  (let ((cks/completion-category 'mh-e-folder-full))
    ;; The history argument must be the history variable's symbol, so it is quoted.
    (completing-read msg (cks/mh-all-folders) nil t "+" 'cks/mh-folder-history)))

Here I've made the decision that this completion interface should require that I select an existing MH mail folder, to avoid problems. If I want to create a new mail folder I fall back to the standard MH-E functions, with their less convenient completion but greater freedom. I've also decided to give this completion a history, so I can easily re-use my recent folder destinations.

(The cks/completion-category stuff is for forcing the minibuffer completion category so that I can customize how vertico presents it, including listing those recent folder destinations first.)

This 'get MH folder' function is then used in a straightforward way:

(defun mh-refile-msg-full (range folder)
  (interactive (list (mh-interactive-range "Refile")
                     (intern (cks/mh-get-folder "Refile to folder? "))))
  (mh-refile-msg range folder))

This defers all of the hard work to the underlying MH-E command for refiling messages. This is one of the great neat tricks in GNU Emacs with the (interactive ...) form; when you make a function a command with (interactive ...), it's natural to end up with it callable from other ELisp code with the arguments you'd normally be prompted for interactively. So I can reuse the mh-refile-msg command non-interactively, sticking my own interactive frontend on it.

All of the hard work happens in cks/mh-all-folders. Naturally, MH-E maintains its own data structures in a way that it finds convenient, so its 'mh-sub-folders-cache' hash table is not structured as a list of all MH folder names but instead has hash entries storing all of the immediate child folders of a parent plus some information on each (at the root, the 'parent' is nil). So we start with a function to transform various combinations of a hash key and a hash value into an MH folder name:

(defun cks/mh-hash-folder-name (key elem)
  (cond
   ((and key elem) (concat key "/" (car elem)))
   (key key)
   (elem (concat "+" (car elem)))))

And then we go over mh-sub-folders-cache using our mapping function with:

(cl-loop for key being the hash-keys of mh-sub-folders-cache
  using (hash-values v)
  ;; the parent folder itself (nil for the root entry)
  collect (cks/mh-hash-folder-name key nil)
  ;; plus each of its immediate child folders
  append (cl-loop for sub in v
                  collect (cks/mh-hash-folder-name key sub)))

After getting this list we need to sort it alphabetically, and also remove duplicate entries just in case (and also a surplus nil entry), using the following:

(sort (remq nil (seq-uniq flist)) 'string-lessp)

(Here, 'flist' is the let variable I have stuck the cl-loop result into. My actual code then removes some folder names I don't want to be there cluttering up the completion list for various reasons.)

There are some additional complications because MH-E will invalidate bits of its sub-folders cache every so often, so we may need to force the entire cache to be rebuilt from scratch (which requires some hackery, but turns out to be very fast these days). I'm not putting those relatively terrible hacks down here (also, the whole thing is somewhat long).

(If I was a clever person I would split this into two functions, one of which generated the full MH mail folder list and the second of which filtered out the stuff I don't want in it. Then I could publish the first function for people's convenience, assuming that anyone was interested. However, my ELisp often evolves organically as I realize what I want.)

EmacsMyMHFolderCompletion written at 23:38:41;

