Wandering Thoughts

2024-07-13

That software forges are often better than email is unfortunate

Over on the Fediverse, there was a discussion of doing software development things using email and I said something:

My heretical opinion is that I would rather file a Github issue against your project than send you or your bug tracker email, because I do not trust you to safeguard my email against spammers, so I have to make up an entire new email address for you and carefully manage it. I don't trust Github either, but I have already done all of this email address handling for them.

(I also make up an email address for my Git commits. And yes, spammers have scraped it and spammed me.)

Github is merely a convenient example (and the most common one I deal with). What matters is that the forge is a point of centralization (so it covers a lot of projects) and that it does not require me to expose my email to lots of people. Any widely used forge-style environment has the same appeal (and conversely, small scale forges do not; if I am going to report issues to only one project per forge, it is not much different than a per-project bug tracker or bug mailing list).

That email is so much of a hassle today is a bad and sad thing. Email is a widely implemented open standard with a huge suite of tools that allows for a wide range of ways of working with it. It should be a great light-weight way of sending in issues, bug reports, patches, etc etc, and any centralized, non-email place to do this (like Github) has a collection of potential problems that should make open source/free software people nervous.

Unfortunately email has been overrun by spammers in a way that forges have not (yet) been, and in the case of email the problem is essentially intractable. Even my relatively hard to obtain Github-specific email address gets spam email, and my Git commit email address gets more. And demonstrating the problem with not using forges, the email address I used briefly to make some GNU Emacs bug reports about MH-E got spam almost immediately, which shows why I really don't want to have to send my issues by email to an exposed mailing list with public archives.

While there are things that might make the email situation somewhat better (primarily by hiding your email address from as many parties as possible), I don't think there's any general fix for the situation. Thanks to spam and abuse, we're stuck with a situation where setting yourself up on a few central development sites with good practices about handling your contact methods is generally more convenient than an open protocol, especially for people who don't do this all the time.

EmailVsForgesUnfortunate written at 23:09:48

2024-07-07

I think (GNU) Emacs bankruptcy is inevitable in the longer term

Recently I read Avoiding Emacs bankruptcy, with good financial habits (via). To badly summarize the article, it suggests avoiding third party packages and minimizing the amount of customization you do. As it happens, I have experience with more or less this approach, and in the end it didn't help. Because I built my old Emacs environment in the days before third party Emacs package management, it didn't include third party packages (although it may have had a few functions I'd gotten from other people). And by modern standards it wasn't all that customized, because I didn't go wildly rebinding keys or the like. Instead, I mostly did basic things like set indentation styles. But over the time from Emacs 18 to 2012, even that stuff stopped working. The whole experience has left me feeling that Emacs bankruptcy is inevitable over the longer term.

The elements pushing towards Emacs bankruptcy are relatively straightforward. First, Emacs wants personal customization in practice, so you will build up a .emacs for your current version of Emacs even if you don't use third party packages. Second, Emacs itself changes over time, or if you prefer the standard, built-in packages change over time to do things like handle indentation and mail reading better. This means that your customizations of them will need updating periodically. Third, the Emacs community changes over time in terms of what people support, talk about, recommend, and so on. If you use the community at all for help, guidance, and the like, what it will be able to help you with and what it will suggest will change over time, and thus so will what you want in your Emacs environment to go with it. Finally, both your options for third party packages and the third party packages themselves will change over time, again forcing you to make changes in your Emacs environment to compensate.

In addition, as the article implicitly admits, that a package is in the Emacs standard library doesn't mean that it can't have problems or effectively be abandoned with little or no changes and updates (for example, the state of Flymake for a long time). Sticking to the packages that come with Emacs can be limiting and restrictive, much like not customizing Emacs at all and accepting all of its defaults. You can work with Emacs that way (and people used to, back in the days before there was a vibrant ecology of third party packages), but you're being potentially hard on yourself in order to reduce the magnitude of something that's probably going to happen to you anyway.

(For instance, until recently not using third party packages would have meant that you did not have support for language servers.)

My view is that in practice, there's no way to leave your Emacs setup alone for a long time. You can go 'bankrupt' in small pieces of work every so often, or in big bangs like the one I went through (although the small pieces approach is more likely if you keep using Emacs regularly).

I don't think this is a bad thing. It's ultimately a choice on the spectrum between evolution and backward compatibility, where both GNU Emacs and the third party ecosystem would rather move forward (hopefully for the better) instead of freezing things once something is implemented.

EmacsBankruptcyInevitable written at 22:30:27

2024-06-24

(GNU) Emacs wants personal customization in practice

Recently I read Avoiding Emacs bankruptcy, with good financial habits (via), which sparked some thoughts. One of them is that I feel that GNU Emacs is an editor that winds up with personal customizations from people who use it, even if you don't opt to install any third party packages and stick purely with what comes with Emacs.

There are editors that you can happily use in their stock or almost stock configuration; this is most of how I use vim. In theory you can use Emacs this way too. In practice I think that GNU Emacs is not such an editor. You can use GNU Emacs without any customization and it will edit text and do a variety of useful things for you, but I believe you're going to run into a variety of limitations in the result that will push you towards at least basic customization of built in settings.

I believe that there are multiple issues, at least:

  • The outside world can have multiple options where you have to configure the choice (such as what C indentation style to use) that matches your local environment.

  • Emacs (and its built in packages) are opinionated and those opinions are not necessarily yours. If opinions clash enough, you'll very much want to change some settings to your opinions.

    (This drove a lot of my customization of GNU Emacs' MH-E mode, although some of that was that I was already a user of (N)MH.)

  • You want to (automatically) enable certain things that aren't on by default, such as specific minor modes or specific completion styles. Sure, you can turn on appealing minor modes by hand, but this gets old pretty fast.

  • Some things may need configuration and have no defaults that Emacs can provide, so either you put in your specific information or you don't get that particular (built in) package working.

Avoiding all of these means using GNU Emacs in a constrained way, settling for basic Emacs style text editing instead of the intelligent environment that GNU Emacs can be. Or to put it another way, Emacs makes it appealing to tap into its power with only a few minor settings through the built in customization system (at least initially).

I believe that most people who pick GNU Emacs and stick with it want to use something like its full power and capability; they aren't picking it up as a basic text editor. Even without third party packages, this leads them to non-trivial customizations to their specific environment, opinions, and necessary choices.

(Perhaps this is unsurprising and is widely accepted within the GNU Emacs community. Or perhaps there is a significant sub-community that does use GNU Emacs only in its role as a basic text editor, without the various superintelligence that it's capable of.)

EmacsCallsForCustomization written at 23:41:11

2024-06-17

Go's 'range over functions' iterators and avoiding iteration errors

Go is working on allowing people to range-over function iterators, and currently this is scheduled to be in Go 1.23, due out this summer (see issue 61405 and issue 61897). The actual implementation is somewhat baroque and some people have been unhappy about that (for example). My view is that this is about bringing user-written container types closer to parity with the special language container types, but recently another view of this occurred to me.

As people have noted, what is most special about this proposal is not that it creates an officially supported iteration protocol in Go, but that this protocol gets direct language support. The compiler itself will transform 'for ... = range afunction' into different code that actively implements the iteration protocol that the Go developers have chosen. This direct language support is critical to making ranging over functions like 'for k,v := range map', but it also does another thing, which is that it implements all of the details of the iteration protocol for the person writing the 'for' loop.

(People seem to generally envision that the actual usage will be 'for ... = range generator(...)', where 'generator()' is a function that returns the actual function that is used for iteration. But I think you could use method values in some situations.)

Iteration protocols are generally fairly complicated. They have to deal with setup, finalization, early exits from the process of iteration, finalization in the face of early exits, and so on. The actual implementations of these protocols tend to be gnarly and somewhat subtle, with various potential mistakes and omissions that can be made, and some of these will not manifest in clear bugs until some special situation arises. Go could make everyone who wanted to use 'iterate over a function or special data structure' write out the explicit code needed to do this using the protocol, but if it did we know what the result would be; some of that code would be buggy and incomplete.

By embedding its chosen iteration protocol into the language itself, Go ensures that most of that code won't have to be written by you (or by any of the many people who might use user-written types and want to iterate over them). The compiler itself will take a straightforward 'for ... range' block and transform it to correctly and completely implement the protocol. In fact, the protocol is not even particularly accessible to you within the 'for' block you're writing.

People writing the iterator functions for their user-written types will have to care about the protocol, of course (although the Go protocol seems relatively simple in that regard too). But there are likely to be many fewer such iterator creators than there will be iterator users, much as Go assumes that there will be many more people using generic types than people creating them.

GoIteratorsAndAvoidingMistakes written at 23:10:08

2024-05-25

Reasons to not expose Go's choice of default TLS ciphers

When I wrote about the long-overdue problem people are going to have with go:linkname in Go 1.23, the specific case that caused me to notice this was something trying to access crypto/tls's 'defaultCipherSuitesTLS13' variable. As its name suggests, this variable holds the default cipher suites used by Go for TLS 1.3. One reaction to this specific problem is to ask why Go doesn't expose this information as part of crypto/tls's API.

One reason why not is contained in the documentation for crypto/tls.CipherSuites():

[...] Note that the default cipher suites selected by this package might depend on logic that can't be captured by a static list, and might not match those returned by this function.

In fact the TLS 1.3 cipher suites that Go uses may not match the ones in defaultCipherSuitesTLS13, because there is actually a second set of them, defaultCipherSuitesTLS13NoAES. As its name suggests, this set of cipher suites applies when the current machine doesn't have hardware support for AES GCM, or at least hardware support that Go recognizes. Well, even that is too simple a description; if Go is being used as a TLS server, whether the 'no AES GCM' version is used also depends on if the client connecting to the Go server appears to prefer AES GCM (likely signaling that the client has hardware support for it).

Today, Go can't expose a useful API for 'the default TLS 1.3 cipher suites' because there is no such straightforward thing; the actual default cipher suites used depend on multiple factors, some of which can't be used by even a top level function like CipherSuites(). If Go had exported such a variable or API in the past, Go's general attitude on backward compatibility might have forced it to freeze the logic of TLS 1.3 cipher suite choice so that it did respect this default list no matter what, much like the random number generation algorithm became frozen because people depended on it.

The Go 1 compatibility promise is a powerful enabler for Go. But because it is so strong and the Go developers interpret it broadly, it means that Go has to be really careful and certain about what APIs it exposes. As we've seen with math/rand, early decisions to expose certain things, even implicitly, can later constrain Go's ability to make important changes.

PS: Another reason to not expose this information from crypto/tls is that Go has explicitly decided to not make TLS 1.3 cipher suites something that you can control. As covered in the documentation for crypto/tls.Config, the 'CipherSuites' struct element is ignored for TLS 1.3; TLS 1.3 cipher suites are not configurable.

GoWhyNotExposeDefaultCiphers written at 22:13:09

2024-05-24

The long-overdue problem coming for some people in Go 1.23

Right now, if you try to build anything using the currently released version of github.com/quic-go/quic-go with the current development version of Go (such as this DNS query program), you will probably encounter the following error:

link: github.com/quic-go/quic-go/internal/qtls: invalid reference to crypto/tls.defaultCipherSuitesTLS13

Experienced Go developers may now be scratching their heads about how quic-go/internal/qtls is referring to crypto/tls.defaultCipherSuitesTLS13, since the latter isn't an exported identifier (in Go, all exported identifiers start with a capital letter). The simple answer is that the qtls package is cheating (in cipher_suite.go).

The official Go compiler has a number of special compiler directives. A few of them are widely known and used, for example '//go:noinline' is common in some benchmarking to stop the compiler from optimizing your test functions too much. One of the not well known ones is '//go:linkname', and for this I'll just quote from its documentation:

//go:linkname localname [importpath.name]

[...] This directive determines the object-file symbol used for a Go var or func declaration, allowing two Go symbols to alias the same object-file symbol, thereby enabling one package to access a symbol in another package even when this would violate the usual encapsulation of unexported declarations, or even type safety. For that reason, it is only enabled in files that have imported "unsafe".

Let me translate that: go:linkname allows you to access unexported variables and functions of other packages. In particular, it allows you to access unexported variables (and functions) from the Go standard library, such as crypto/tls.defaultCipherSuitesTLS13.

The Go standard library uses go:linkname internally to access various unexported things from other packages and from the core runtime, which is perfectly fair; the entire standard library is developed by the same people, and they have to be very careful and conservative with the public API. However, go:linkname has also been used by a wide assortment of third party packages to access unexported pieces of the standard library that those packages found convenient or useful (such as Go's default cipher suites for TLS 1.3). Accessing unexported things from the Go standard library isn't covered by the Go 1 compatibility guarantee, for obvious reasons, but in practice the Go developers find themselves not wanting to break too much of the Go package ecosystem even if said ecosystem is doing unsupported things.

Last week, the Go developers noticed this (I believe not for the first time) and Russ Cox filed issue #67401: cmd/link: lock down future uses of linkname, where you can find a thorough discussion of the issue. The end result is that the current development version of Go, which will become Go 1.23, is now much more restrictive about go:linkname, requiring that the target symbol opt in to this usage. Starting from Go 1.23, you will not be able to 'go:linkname' to things in the standard library that have not specifically allowed this (and the rules are probably going to get stricter in future Go versions; in a few versions I wouldn't be surprised if you couldn't go:linkname into the standard library at all from outside packages).

So this is what is happening with github.com/quic-go/quic-go. It is internally using a go:linkname to get access to crypto/tls's defaultCipherSuitesTLS13, but in Go 1.23, defaultCipherSuitesTLS13 is not one of the symbols that has opted in to this use, so the build is now failing. The quic-go package is probably far from the only package that is going to get caught out by this, now and in the future.

(The Go developers have been adding specific opt-ins for sufficiently used internal identifiers, in files generally called 'badlinkname.go' in the packages. You can see the current state for crypto/tls in its badlinkname.go file.)

Go123LinknameComingProblem written at 22:06:10

2024-05-21

Go's old $GOPATH story for development and dependencies

As people generally tell the story today, Go was originally developed without support for dependency management. Various community efforts evolved over time and then were swept away in 2019 by Go Modules, which finally added core support for dependency management. I happen to feel that this story is a little bit incomplete and sells the original Go developers short, because I think they did originally have a story for how Go development and dependency management was supposed to work. To me, one of the fascinating bits in Go's evolution to modules is how that original story didn't work out. Today I'm going to outline how I see that original story.

In Go 1.0, the idea was that you would have one or more of what are today called multi-module workspaces. Each workspace contained one (or several) of your projects and all of its dependencies, in the form of cloned and checked-out repositories. With separate repositories, each workspace could have different (and independent) versions of the same packages if you needed that, and updating the version of one dependency in one workspace wouldn't update any other workspace. Your current workspace would be chosen by setting and changing $GOPATH, and the workspace would contain not just the source code but also precompiled build artifacts, built binaries, and so on, all hermetically confined under its $GOPATH.

This story of multiple $GOPATH workspaces allows each separate package or package set of yours to be wrapped up in a directory hierarchy that effectively has all of its dependencies 'vendored' into it. If you want to preserve this for posterity or give someone else a copy of it, you can archive or send the whole directory tree, or at least the src/ portion of it. The whole thing is fairly similar to a materialized Python virtual environment.
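As a rough sketch of the layout this implies, here is a tiny model of a workspace with the conventional src/, pkg/ and bin/ directories under its $GOPATH. The 'Workspace' type, the directory names under /home/me, and the example import path are all made up for illustration. The point is that the same import path maps to a different checkout in each workspace, so the two can sit at different versions.

```go
package main

import (
	"fmt"
	"path"
)

// Workspace is a hypothetical model of one $GOPATH-style workspace.
type Workspace struct {
	Root string // the directory that $GOPATH points at
}

// SrcDir is where the checked-out repository for an import path lives.
func (w Workspace) SrcDir(importPath string) string {
	return path.Join(w.Root, "src", importPath)
}

// PkgDir holds precompiled build artifacts for the workspace.
func (w Workspace) PkgDir() string { return path.Join(w.Root, "pkg") }

// BinDir holds the binaries built from the workspace.
func (w Workspace) BinDir() string { return path.Join(w.Root, "bin") }

func main() {
	a := Workspace{Root: "/home/me/work/project-a"}
	b := Workspace{Root: "/home/me/work/project-b"}
	dep := "github.com/example/dep" // hypothetical dependency

	// Two independent checkouts of the same dependency.
	fmt.Println(a.SrcDir(dep))
	fmt.Println(b.SrcDir(dep))
	fmt.Println(a.PkgDir(), a.BinDir())
}
```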

(The original version of Go did not default $GOPATH to $HOME/go, per for example the Go 1.1 release notes. It would take until Go 1.8 for this default to be added.)

This story broadly assumes that updates to dependencies will normally be compatible, because otherwise you really want to track the working dependency versions even in a workspace. While you can try to update a dependency and then roll it back (since you normally have its checked out repository with full history), Go won't help you by remembering the identity of the old, working version. It's up to you to dig this out with tools like the git reflog or your own memory that you were at version 'x.y.z' of the package before you updated it. And 'go get -u' to update all your dependencies at once only makes sense if their new versions will normally all work.

This story also leaves copying workspaces to give them to someone else (or to preserve them in their current state) as a problem for you, not Go. However, Go did add 'experimental' support for vendoring dependencies in Go 1.5, which allowed people to create self-contained objects that could be used with 'go get' or other simple repository copying and cloning. A package that had its dependencies fully vendored was effectively a miniature workspace, but this approach had some drawbacks of its own.

I feel this original story, while limited, is broadly not unreasonable. It could have worked, at least in theory, in a world where preserving API compatibility (in a broad sense) is much more common than it clearly is (or isn't) in this one.

GoTheGopathDevelopmentStory written at 23:37:24

2024-05-19

My GNU Emacs MH mail folder completion in MH-E

When I wrote about understanding the orderless package, I mentioned that orderless doesn't work well with hierarchical completions such as file names, which are completed one component at a time. I also said this mattered to me because MH-E completed the names of mail folders in this part by part manner, but I didn't feel like rewriting MH-E's folder completion system to fix it. Well, you can probably guess what happened next.

In the GNU Emacs way, I didn't so much rewrite MH-E's mail folder completion as add a second folder completion system alongside it, and then rebound some keys to use my system. Writing my system was possible because it turned out MH-E had already done most of the work for me, by being able to collect a complete list of all folder names (which it used to support its use of the GNU Emacs Speedbar).

To put the summary up front, I was pleasantly surprised by how easy it was to add my own completion stuff and make use of it within my MH-E environment. At the same time, reverse engineering some of MH-E's internal data structures was a bit annoying and it definitely feels like a bit of a hack (although one that's unlikely to bite me; MH-E is not exactly undergoing rapid and dramatic evolution these days, so those data structures are unlikely to change).

There are many sophisticated ways to do minibuffer completion in GNU Emacs, but if your purpose is to work well with orderless, the simplest approach is to generate a list of all of your completion candidates up front and then provide this list to completing-read. This results in code that looks like this:

(defvar cks/mh-folder-history '() "History of MH folder targets.")
(defun cks/mh-get-folder (msg)
  (let ((cks/completion-category 'mh-e-folder-full))
    (completing-read msg (cks/mh-all-folders) nil t "+" cks/mh-folder-history)))

Here I've made the decision that this completion interface should require that I select an existing MH mail folder, to avoid problems. If I want to create a new mail folder I fall back to the standard MH-E functions, with their less convenient completion but greater freedom. I've also decided to give this completion a history, so I can easily re-use my recent folder destinations.

(The cks/completion-category stuff is for forcing the minibuffer completion category so that I can customize how vertico presents it, including listing those recent folder destinations first.)

This 'get MH folder' function is then used in a straightforward way:

(defun mh-refile-msg-full (range folder)
  (interactive (list (mh-interactive-range "Refile")
                     (intern (cks/mh-get-folder "Refile to folder? "))))
  (mh-refile-msg range folder))

This defers all of the hard work to the underlying MH-E command for refiling messages. This is one of the great neat tricks in GNU Emacs with the (interactive ...) form; when you make a function a command with (interactive ...), it's natural to wind up with it callable from other ELisp code with the arguments you'd normally be prompted for interactively. So I can reuse the mh-refile-msg command non-interactively, sticking my own interactive frontend on it.

All of the hard work happens in cks/mh-all-folders. Naturally, MH-E maintains its own data structures in a way that it finds convenient, so its 'mh-sub-folders-cache' hash table is not structured as a list of all MH folder names but instead has hash entries storing all of the immediate child folders of a parent plus some information on each (at the root, the 'parent' is nil). So we start with a function to transform various combinations of a hash key and a hash value into an MH folder name:

(defun cks/mh-hash-folder-name (key elem)
  (cond
   ((and key elem) (concat key "/" (car elem)))
   (key key)
   (elem (concat "+" (car elem)))))

And then we go over mh-sub-folders-cache using our mapping function with:

(cl-loop for key being the hash-keys of mh-sub-folders-cache
         using (hash-values v)
         collect (cks/mh-hash-folder-name key nil)
         append (cl-loop for sub in v
                         collect (cks/mh-hash-folder-name key sub)))

After getting this list we need to sort it alphabetically, and also remove duplicate entries just in case (and also a surplus nil entry), using the following:

(sort (remq nil (seq-uniq flist)) 'string-lessp)

(Here, 'flist' is the let variable I have stuck the cl-loop result into. My actual code then removes some folder names I don't want to be there cluttering up the completion list for various reasons.)

There are some additional complications because MH-E will invalidate bits of its sub-folders cache every so often, so we may need to force the entire cache to be rebuilt from scratch (which requires some hackery, but turns out to be very fast these days). I'm not putting those relatively terrible hacks down here (also, the whole thing is somewhat long).

(If I was a clever person I would split this into two functions, one of which generated the full MH mail folder list and the second of which filtered out the stuff I don't want in it. Then I could publish the first function for people's convenience, assuming that anyone was interested. However, my ELisp often evolves organically as I realize what I want.)

EmacsMyMHFolderCompletion written at 23:38:41

2024-04-14

(Probably) forcing Git to never prompt for authentication

My major use of Git is to keep copies of the public repositories of various projects from various people. Every so often, one of the people involved gets sufficiently irritated with the life of being an open source maintainer and takes their project's repository private (or their current Git 'forge' host does it for other reasons). When this happens, on my next 'git pull', Git skids to a halt with:

; git pull
Username for 'https://gitlab.com':

This is not a useful authentication prompt for me. I have no special access to these repositories; if anonymous access doesn't work, there is nothing I can enter for a username and password that will improve the situation. What I want is for Git to fail with a pull error, the same way it would if the repository URL returned a 404 or the connection to the host timed out.

(Git prompts you here because sometimes people do deal with private repositories which they have special credentials for.)

As far as I know, Git unfortunately has no configuration option or command line option that is equivalent to OpenSSH's 'batch mode' for ssh, where it will never prompt you for password challenges and will instead just fail. The closest you can come is setting core.askPass to something that generates output (such as 'echo'), in which case Git will try to authenticate with that bogus information, fail, and complain much more verbosely, which is not the same thing (among other issues, it causes the Git host to see you as trying invalid login credentials, which may have consequences).

If you're running your 'git pull' invocations from a script, as I often am, you can have the script set 'GIT_TERMINAL_PROMPT=0' (and export it into the environment). According to the documentation, this causes Git to fail rather than prompting you for anything, including authentication. It seems somewhat dangerous to set this generally in my environment, since I have no idea what else Git might someday want to prompt me about (and obviously if you need to sometimes get prompted you can't set this). Apparently this is incomplete if you fetch Git repositories over SSH, but I don't do that for public repositories that I track.

(I found this environment variable along with a lot of other discussion in this serverfault question and its answers.)

Some environments that run git behind the scenes, such as the historical 'go get' behavior, default to disabling git prompts. If you use such an environment it may have already handled this for you.

GitNeverAuthPrompts written at 23:11:21

2024-04-08

Don't require people to change 'source code' to configure your programs

Often, programs have build time configuration settings for features they include, paths they use, and so on. Some of the time, people suggest that the way to handle these is not through systems like 'configure' scripts (whether produced by Autoconf or some other means) but instead by having people edit their settings into things such as your Makefiles or header files ('source code' in a broad sense). As someone who has spent a bunch of time and effort building other people's software over the years, my strong opinion is that you should not do this.

The core problem of this approach is not that you require people to know the syntax of Makefiles or config.h or whatever in order to configure your software, although that's a problem too. The core problem is you're having people modify files that you will also change, for example when you release a new version of your software that has new options that you want people to be able to change or configure. When that happens, you're making every person who upgrades your software deal with merging their settings into your changes. And merging changes is hard and prone to error, especially if people haven't kept good records of what they changed (which they often won't if your configuration instructions are 'edit these files').

One of the painful lessons about maintaining systems that we've learned over the years is that you really don't want to have two people changing the same file, including the software provider and you. This is the core insight behind extremely valuable modern runtime configuration features such as 'drop-in files' (where you add or change things by putting your own files into some directory, instead of everything trying to update a common file). When you tell people to configure your program by editing a header file or a Makefile or indeed any file that you provide, you're shoving them back into this painful past. Every new release, every update they pull from your VCS, it's all going to be a source of pain for them.

A system where people maintain (or can maintain) their build time configurations entirely outside of anything you ship is far easier for people to manage. It doesn't matter exactly how this is implemented and there are many options for relatively simple systems; you certainly don't need GNU Autoconf or even CMake.

The corollary to this is that if you absolutely insist on having people configure your software by editing files you ship, those files should be immutable by you. You should ship them in some empty state and promise never to change that, so that people building your software can copy their old versions from their old build of your software into your new release (or never get a merge conflict when they pull from your version control system repository). If your build system can't handle even this restriction, then you need to rethink it.

ConfigureNoSourceCodeChanges written at 22:16:54


This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.