Wandering Thoughts


That people produce HTML with string templates is telling us something

A while back I had a hot take on the Fediverse:

Another day, another 'producing HTML with string templates/interpolation is wrong' article. People have been writing these articles for a decade or more and people doing web development have kept voting with their feet, which is why we have string templates everywhere. At this point, maybe people should consider writing about why things have worked out this way.

(I don't think it's going to change, either. No one has structured HTML creation that's as easy as string templates.)

One of my fundamental rules of system design is that when people keep doing it wrong, the people are right and your system or idea is wrong. A corollary to this is that when you notice this happening, a productive reaction is to start asking questions about why people do it the 'wrong' way. Despite what you might expect from its title, Hugo Landau's Producing HTML using string templates has always been the wrong solution (via) actually has some ideas and pointers to ideas, for instance this quote from Using type inference to make web templates robust against XSS:

Strict structural containment is a sound, principled approach to building safe templates that is a great approach for anyone planning a new template language, but it cannot be bolted onto existing languages because it requires that every element and attribute start and end in the same template. This assumption is violated by several very common idioms, such as the header-footer idiom, in ways that often require drastic changes to repair.

Another thing to note here is that pretty much every programming language has a way to format strings, and many of them have ways to have multi-line strings. This makes producing HTML via string formatting something that scales up (and down) very easily; you can use the same idiom to format a small snippet as you would a large block. Even Go's html/template package doesn't scale down quite that far, although it comes close. String templating is often very close to string formatting and so probably fits naturally into how programmers are accustomed to approaching things.

(Hugo Landau classifies Go's html/template as 'context aware autoescaping HTML string templating', and considers it not as good as what the quote above calls 'strict structural containment' that works on the full syntax tree of the HTML document.)

I don't have any particular answers to why string templating has been enduringly popular so far (although I can come up with theories, including that string templating is naturally reusable in other contexts, such as plain text). But its endurance suggests that people see string templating as having real advantages over the alternatives, and that those advantages keep being compelling, including in new languages such as Go (where the Go authors created html/template instead of trying to define a 'strict structural containment' system). If people want to displace string templating, figuring out what those current advantages are and how to duplicate them in alternatives seems likely to be important.

(I'll pass on the question of how important it is to replace the most powerful context aware autoescaping HTML string templating with something else.)

OnHTMLViaStringTemplates written at 22:28:59


Some notes on the cost of Go finalizers (in Go 1.20)

I recently read Daniel Lemire's The absurd cost of finalizers in Go (via), which reports on a remarkably high cost of using a finalizer to ensure that C memory is freed. Lemire's numbers aren't atypical; in my own testing in a different environment I found a rough factor of ten difference between directly calling C malloc() and free() and using a finalizer to call free().

The first reason for this increased overhead in Lemire's test case is perhaps somewhat surprising: using a finalizer forces heap allocation, while Lemire's non-finalizer version allocates nothing on the Go heap. Suppose that you have:

func Allocate() *C.char {
  return C.allocate()
}

func Free(c *C.char) {
  C.free(unsafe.Pointer(c)) // or whatever frees the C-side allocation
}

// in a _test.go file
func BenchmarkAllocate(b *testing.B) {
  for j := 0; j < b.N; j++ {
    p := Allocate()
    Free(p)
  }
}

Go 1.20 is smart enough to allocate 'p' on the Go stack, so while the C code is calling malloc() and free(), Go is not doing anything with its own memory system. The moment you call runtime.SetFinalizer() this changes; Go considers the object you're trying to finalize to escape, so it allocates it in the heap. Probably this often won't matter in real situations, because what you're finalizing is already going to be heap allocated.

(In Lemire's test code, you can see this if you use 'go test -benchmem -bench=Benchmark -run -'; some of the benchmarks will allocate nothing per invocation, and others will allocate one thing.)

Lemire tested with garbage collection (GC) turned off in the Go runtime and got similar results, so theorized that SetFinalizer() was the expensive portion. I constructed a synthetic test function that only set a finalizer without making any cgo calls, and this does seem to be the case. With Go's GC on in its normal state, over 50% of the runtime of benchmarking this function is in SetFinalizer(), mostly in an internal runtime function called runtime.addspecial(). There are some other surprises, though. In total, GC activity seems to be about 27% of the runtime, with about half of that being directly triggered by allocations and half happening in the background. Much of the GC time seems to be spent processing and running finalizers, even though the test's finalizer does nothing (21% of the total time). A surprisingly high percentage of the time is spent locking and unlocking things, with the Go profiler attributing 10% to 'runtime.lock2()' and 10% to 'runtime.unlock2()'.

What I take from this is that SetFinalizer() is probably not considered something that you should use heavily, and as a result it hasn't been heavily optimized. You can get a sense of this from the extensive documentation around its limitations and issues in the runtime.SetFinalizer() documentation; using it correctly is tricky, and correctly using anything with a finalizer attached is also tricky (see the discussion of the example with file descriptors).

PS: One of the effects of putting finalizers on objects is that the objects will take longer to be garbage collected (an unused object with a finalizer takes two GC cycles to collect, instead of one). This may affect how you structure objects and where you attach finalizers; you probably don't want to put a finalizer on a big object or on an object that will be directly embedded in one (since Go doesn't free sub-objects by themselves).

Sidebar: My finalizer-only test code

In case people want to run their own tests:

// Used by Lemire's other benchmarks
type Cstr struct {
  cpointer *C.char
}

// No C malloc, and the finalizer does nothing
func EmptyFinalizer() *Cstr {
  answer := &Cstr{}
  runtime.SetFinalizer(answer, func(c *Cstr) {})
  return answer
}

// in _test file
func BenchmarkEmptyFinalizer(b *testing.B) {
  for j := 0; j < b.N; j++ {
    _ = EmptyFinalizer()
  }
}
I deliberately structured this to be as close to Lemire's other benchmark test functions as possible, hence its use of the Cstr type.

GoFinalizerCostsNotes written at 22:45:40


Why I use separate lexers in my recursive descent parsers

I recently read Laurence Tratt's Why Split Lexing and Parsing Into Two Separate Phases?, which answers the question its title asked. In the introduction, Tratt offhandedly remarks:

[...] Since some parsing approaches such as recursive descent parsing unify these phases, why do lex and yacc split them apart?

As someone who's written a number of recursive descent (RD) parsers over the years, I had a reaction on the Fediverse. You see, when I write recursive descent parsers, I always write a separate lexer.

The reason why is pretty straightforward; I find it simpler to have a separate lexer. In the kind of little languages that I wind up writing (RD) parsers for, the lexer is generally simple and straightforward, often implemented with either a few regular expressions or some state machines (sometimes nested), and operates by breaking the input text into tokens. The parser then gets the simplification of dealing only with tokens without having to concern itself with exactly how they're represented in text.
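As a sketch of what I mean (a made-up miniature expression language, not any real parser of mine), a lexer like this is short, easy to test in isolation, and hands the parser a clean stream of tokens:

```go
package main

import "fmt"

// Token kinds for a tiny expression language.
type Kind int

const (
	NUM Kind = iota // a run of digits
	OP              // one of + - * /
	EOF             // end of input
)

type Token struct {
	Kind Kind
	Text string
}

// Lex breaks the input into tokens with a simple state machine.
// The parser that consumes these never looks at the raw text again.
func Lex(s string) []Token {
	var toks []Token
	for i := 0; i < len(s); {
		c := s[i]
		switch {
		case c == ' ' || c == '\t':
			i++
		case c >= '0' && c <= '9':
			j := i
			for j < len(s) && s[j] >= '0' && s[j] <= '9' {
				j++
			}
			toks = append(toks, Token{NUM, s[i:j]})
			i = j
		case c == '+' || c == '-' || c == '*' || c == '/':
			toks = append(toks, Token{OP, string(c)})
			i++
		default:
			panic(fmt.Sprintf("unexpected character %q", c))
		}
	}
	return append(toks, Token{EOF, ""})
}

func main() {
	fmt.Println(Lex("12 + 34*5"))
}
```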

(I have a general process for writing a recursive descent parser that goes well with this separation, but that in no way fits into the margins of this entry; here's the short version.)

One slightly subtle advantage of the split is that it's easier to test a lexer on its own, and in turn this tends to make it somewhat easier to test the parser. A fairly simple set of tests (for example) can give me pretty good lexer coverage and protect me against making stupid mistakes. Then testing and debugging the parser becomes much simpler because I don't have to worry that what looks like a parsing mistake is actually a lexing mistake.

(Typically I cheat by writing parser tests with the assumption that the lexer is working correctly, so I can get the token input for the parser by lexing (test) text. This is impure but much simpler than hand-crafting token sequences.)

Although this feels like the obvious, simplest approach to me, it turns out to be far from universal. Obviously we have Tratt's experience (and Tratt has much more of it than I do), plus my Fediverse poll said that over half the people who write RD parsers use integrated lexing at least some of the time, and some of them always do it.

RecursiveDescentParsingAndLexing written at 22:29:13


The two types of C programmers (a provocative thesis)

Here is a provocative thesis:

There are two types of C programmers: people who chose C because they liked various of its properties, and people who used C because it was their best or only option at the time.

Back in the day, C was somewhere between your best option and your only real option for doing certain sorts of programming. If you were writing a Unix program, for example, for quite a while C was your only real choice (later you could also consider C++). The people who came to C this way often found some of its virtues attractive, but they weren't necessarily strongly committed to it; they'd picked C as an expedient choice.

Meanwhile, there are people who looked at C and felt (and often still feel) that it was very much the language for them (out of those available at the time). They feel strongly drawn to C's virtues, often explicitly in contrast to other languages, and today they may still program in C out of choice. If and when they switch languages they often pick languages that are as close to the virtues of C (as each person sees them) as possible.

I am the first sort of C programmer. I like some aspects of C but there are others that I more or less always found to be kind of a pain; as I've kind of said before, I no longer want to have to think about memory management and related issues. So I've wound up mostly in the Go camp, despite Go's garbage collection being anathema to a certain sort of C programmer.

(Thinking about memory management can be fun every so often, just as it can be fun to optimize anything, but I want it to be an optimization, not a mandatory thought.)

My perception of the second sort of C programmer is that if they've moved to any more recent mainstream language, it's probably Rust. Rust is certainly not C-like in some respects, but in a lot of ways its virtues are the most C-like out of all of the mainstream languages (and some of the ways it's better than C are important).

(Like all provocative theses, this is a generalization and simplification. People had and have many reasons for choosing C. And yes, there are non-mainstream languages that are trying to be 'a better C' in ways that are significantly different from Rust.)

CProgrammersTwoTypes written at 22:20:25


An interesting mistake I made with a (Go) SSH client API

We have a custom system for NFS mount authentication on our Linux fileservers that works, in part, by having a SSH client connect to would-be NFS clients to verify their SSH host key. In the process of writing the code for this, I made an interesting mistake that is fundamentally enabled by a long-standing OpenSSH naming confusion.

What you have in a SSH known hosts file is a list of (public) keys, each of which has a key type like 'ssh-rsa', 'ssh-ed25519', 'ecdsa-sha2-nistp256', and so on. However, what you use in the protocol is a (host) key algorithm. When you make a SSH connection to a server (in golang.org/x/crypto/ssh), you supply both a key and the host key algorithm(s) to use with it (obviously you need the key type to match the algorithm(s)).

For a long time, the names of the key types were exactly the names of their key algorithms; you had 'ssh-ed25519' keys and an 'ssh-ed25519' host key algorithm, for example. In both the protocol and typical APIs for dealing with it (including Go), these are stringly typed; for example, in Go the ssh.ClientConfig's HostKeyAlgorithms field is an array of strings. This makes it natural to write code that finds the type of a host key and sets it as the allowed host key algorithm. This is more or less exactly what my code did for a long time.

Then OpenSSH's defaults changed to not use the 'ssh-rsa' key algorithm, because it uses SHA-1, which is now too weak a cryptographic hash. You can still use 'ssh-rsa' keys, but you need the new host key algorithms of 'rsa-sha2-256' or 'rsa-sha2-512'. If your code has not been updated and still does the straightforward old thing, you will take 'ssh-rsa' keys, ask for 'ssh-rsa' as the host key algorithm, and your modern OpenSSH based servers will say 'we don't support that'; you will be sad and perhaps surprised.

(If you have both ssh-rsa and ssh-ed25519 keys for most but not all hosts, your surprise and sadness may be deferred until the first of the exceptions is upgraded to a modern Ubuntu version.)
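The fix amounts to doing the type-to-algorithm mapping explicitly. Here's a hedged sketch of the idea (the function is my illustration, not my actual code; the strings are the standard SSH wire names, which golang.org/x/crypto/ssh also exposes as KeyAlgo* constants):

```go
package main

import "fmt"

// hostKeyAlgorithmsFor maps a known_hosts key type to the host key
// algorithms worth asking for. For everything except 'ssh-rsa' the
// key type name and the key algorithm name are still the same; for
// 'ssh-rsa' keys we must request the SHA-2 based algorithms, since
// modern OpenSSH servers refuse the SHA-1 based 'ssh-rsa' algorithm.
func hostKeyAlgorithmsFor(keyType string) []string {
	if keyType == "ssh-rsa" {
		return []string{"rsa-sha2-512", "rsa-sha2-256"}
	}
	return []string{keyType}
}

func main() {
	fmt.Println(hostKeyAlgorithmsFor("ssh-ed25519"))
	fmt.Println(hostKeyAlgorithmsFor("ssh-rsa"))
}
```

The resulting list is what would go into the HostKeyAlgorithms field of an ssh.ClientConfig.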

You can criticize the API here for being stringly typed, but I think that it's actually natural to do something like that, especially if you're initially designing the API in the old world, before "rsa-sha2-256" and when "ssh-rsa" was the (only) key algorithm you used with 'ssh-rsa' keys. In that world, the official names of key types and key algorithms were the same; making them two separate programming types and forcing people to explicitly convert between them is likely to strike people as perverse. Do you want to write or even have to use a function that converts 'keytype.Ed25519' into 'keyalgo.Ed25519'? Most people are going to say no. Just call it 'Ed25519' and be done.

(One implication of this is that what is a good API depends on when it's designed. An API that was good when it was designed, when SSH key types did map one to one with key algorithms, can retroactively become a not-good API later, when some key types now map differently.)

PS: I was lucky in that my code was structured to accumulate a list of 'key types', which were really 'key algorithms', so I could just update it to add some more key algorithms if we hit a 'ssh-rsa' key. If I'd had a slightly different code structure I might have had to do a more significant restructuring.

SSHClientKeyTypeMistake written at 22:45:01


Failing to build a useful pre Go 1.21 static Go toolchain on Linux

Recently I wrote about how Go 1.21 will have a static toolchain on Linux, where the 'go' program will be statically linked so you can freely copy even a locally built version from Linux distribution to Linux distribution. If you're an innocent person, as I was before I started on this, you might think that achieving the same thing yourself in Go 1.20 and earlier isn't hard. In fact, it turns out that I failed, although my failure was disguised by the situation in Go 1.21, where you get a fully working static 'go' binary regardless of what you do, whether or not your settings had any actual effect on the build process.

In general, there are two easy ways to get a statically linked normal Go program, if your Go program uses only the core std packages. First, you can build with 'CGO_ENABLED=0' in your environment, which completely disables use of CGO and with it the dynamic linking of your Go program. Second, you can use '-tags osusergo,netgo' to select the pure-Go versions of the two standard packages that normally cause your Go program to be dynamically linked by surprise. Unfortunately, neither of these ways really works when building the Go toolchain itself.

The easier failure is with build tags, because there's no way to pass build tags into the normal way to build Go from source. You can pass arguments to 'go tool compile', but this doesn't let you set build tags; as far as I can tell, those are controlled at a different level in the build process, in the selection of what files to compile as part of a package (see also). By the time source files are being compiled (what 'go tool compile' does), it's too late.

If you build with 'CGO_ENABLED=0' the result works in that you'll get a statically linked Go toolchain and you can compile normal Go programs. However, your newly built Go toolchain will never build CGO-enabled Go executables, even if it normally would (for example if you build a Go program using net without setting 'netgo'). This is certainly not how you normally want a Go toolchain to behave and it may give you real problems if you want to build programs that require CGO to work.

The third way to build statically linked Go programs is to set the 'go tool link' flags that tell it to create a static executable using the external linker, which are '-extldflags=-static -linkmode=external'. What this does is instruct Go to ask the system linker ('ext[ernal] ld') to build a static executable. In a 'go build' command line, you pass this with '-ldflags="..."'; when building Go itself you set this in the 'GO_LDFLAGS' environment variable. This works, but in practice it may not do what you want, because you can't usefully statically link a program that looks up hostnames through glibc. The 'go' toolchain needs to look up hostnames to fetch packages, and if built without the 'netgo' tag it may try to do this through glibc, and then you need the exact version of glibc.

(It's possible to get away with this at runtime if your nsswitch.conf is straightforward enough that Go will use its internal Go-based lookup functions, so static linking can be a step forward.)

One of the things all of this investigation has shown me is that having a statically linked Go toolchain in Go 1.21 was probably not a trivial change. That may partially explain why it wasn't done earlier.

PS: As I've found out, these days you probably have to set '-linkmode=external' in order for '-extldflags=-static' to do anything, because Go mostly seems to use its 'internal' linker. If there is a way to make Go's internal linker create static executables that are linked against glibc, I don't know what it is (and it's not the 'go tool link' -d argument). Given all of the issues with statically linking executables on Linux (and other systems), I suspect that there just isn't one.

GoToolchainStaticBuildFailure written at 22:22:57


Go 1.21 will (likely) have a static toolchain on Linux

A while back, I lamented on the Fediverse:

Current status: yak shaving a Go 1.17 built on Ubuntu 20.04 so I can build Go 1.20 on 20.04 so I can build a binary with Go 1.20 that will run on 20.04 for reasons.

The easy way to solve this problem would have been to download an official binary release tarball, because these are built so that they'll run on pretty much any Linux (presumably built on a system with a very old glibc, since they're actually dynamically linked, with glibc symbol versioning requiring only 2.3.2 or later). Because I already had a whole set of Go source trees, I picked the hard way.

At this point you might wonder why the Go toolchain is dynamically linked against the system glibc. Although I haven't tried to analyze symbol usage, the obvious assumption is that it's dynamically linked because various Go tools want to download packages over the network, which requires looking up DNS names, which is a very common cause of dynamically linking to glibc.

The good news, as pointed out to me by @magical, is that in Go 1.21 and later the plan is for the compiler to be built using the pure Go resolver only and to be a static executable. Relevant reading here is apparently issue #53862 and issue #57007 (via). As far as I know, the elements of this plan have already landed in the Go development version; my current development Go binaries are static binaries.

; ./go version
go version devel go1.21-66cac9e1e4 Fri Apr 7 23:34:21 2023 +0000 linux/amd64
; ldd ./go
        not a dynamic executable

Unless the Go developers revert this for some reason, Go 1.21 and later will be static executables on Linux.

(Doing this before Go 1.21 is tricky for reasons beyond the scope of this entry.)

This is a nice little quality of life improvement for people (like me) mostly working on recent Linuxes but who periodically have to deal with older ones. It won't automatically make your own programs version-independent, but for them you can use '-tags osusergo,netgo' when you build or install Go programs.

(You might wonder why I didn't just build the Go program I needed to run on Ubuntu 20.04 with those flags in the first place. The answer is that I was distracted by the flow of circumstances. First I tried to run the program on a 20.04 machine in addition to some 22.04 ones, and got glibc version errors, so I tried to rebuild it on 20.04 to be more universal, then I had the 'go' compiler toolchain not work with the same problem, and by that point my mental focus was on 'make the compiler toolchain work'.)

Go121LinuxStaticToolchain written at 23:20:23


Moving from 'master' to 'main' in Git with local changes

One of the things that various open source Git repositories are doing is changing their main branch from being called 'master' to being called 'main'. As a consumer of their repository, this is generally an easy switch for me to deal with; some day, I will do a 'git pull', get a report that there's a new 'main' branch but there's no upstream 'master', and then I'll do 'git checkout main' and I'm all good. However, with some repositories I have my own local changes, which I handle through Git rebasing. Recently I had to go through a 'master' to 'main' switch on such a repository, so I'm writing down what I did for later use.

The short version is:

git checkout main
git cherry-pick origin/master..master

(This is similar to something I did before with Darktable.)

In general I could have done this with either 'git rebase' or 'git cherry-pick', and in theory, according to my old take on rebasing versus cherry-picking, the 'proper' answer might have been a rebase, since I was moving my local commits onto the new 'main' branch. However, it was clear to me that I would probably have wanted to use the full three-argument form of 'git rebase', which is at least somewhat tricky to understand and to be sure I was doing right. Cherry-picking was much simpler; I could easily reason about what it was doing, and it left my old 'master' state alone, just in case.

(Switching from rebasing to cherry-picking is an experience I've had before.)

Now that I've written this I've realized that there was probably a third way, because at a mechanical level branches in git don't entirely exist. The upstream 'master' and 'main' branches cover the same commits (up until possibly the 'main' branch adds some on top). The only thing that says my local changes are on 'master' instead of 'main' is a branch head. In theory, what I could have done was simply relabel my current state as being on 'main' instead of 'master', and then possibly do a 'git pull' to be current with the new 'main'.

(In the case of this particular repository, it was only a renaming of the main branch; upstream, both the old 'master' and the new 'main' are on the same commit.)

Since I just tried it on a copy of my local repository in question, the commands to do this are:

git branch --force main master
git checkout main
# get up to date:
git pull

I believe that you only need the pull if the upstream main is ahead of the old upstream master.

This feels more magical than the rebase or cherry-pick version, so I'm probably not likely to use it in the future unless there's some oddity about the situation. One potential reason would be if I've published my repository, I don't expect upstream development (just the main branch being renamed), and other people might have changes on top of my changes. At that point, a cherry-pick (or a rebase) would change the commit hashes of my changes, while simply sticking the 'main' branch label on to them doesn't, so people who have changes on top of my changes might have an easier time.

GitMasterToMainWithLocalChanges written at 22:07:55


An unexciting idea: Code changes have context

I recently read Mark Dominus's I wish people would stop insisting that Git branches are nothing but refs (via). One of my thoughts afterward is that this feels like an instance of a broader thing, which is that (code) changes have context; here, one part of that context is where they happen (ie, what branch they happen on). Of course we already know that in a sense, because Git (and pretty much every other version control system) considers it important to record both who made the change and when it was made.

In a way, it is turtles all of the way down. It's not too wrong to say that in Git, the core objects are trees. Changes (commits) are a record of the relationship between trees; they give you the context of moving from one tree to another (often partially literally in the form of the commit message and anything it points you to). Our desire for this context is one reason people emphasize that you should write good commit messages. In a sense, diffs themselves are an expression of that context, since they are literally what changed between the two (or more) trees involved (although diffs by themselves aren't necessarily enough context).

When we move one more level up, branches are one expression of the context of changes (commits) themselves. Branches generally have some sort of meaning, and they also represent (are) separate sequences of changes; that separation adds context to the changes themselves, although for more context you need to know what the branches are. Of course branches aren't the only way of adding context to changes (there are many ways of putting it into commit messages). Nor are they the only context to changes we care about, since sometimes we care if particular changes are in a release or in a version that someone is running.

(The question of 'has this change been merged into the main branch' is an interesting edge case. Here, we do care about the state of a change in the context of a branch, but it's not the branch the change was initially created in. Knowing that the whole branch was merged into the main branch would only be helpful if you knew that the branch didn't continue on beyond that merge.)

A corollary to this is that you'll forget this context over time. This makes me feel that it's worth putting as much of it as possible in a durable and accessible form, which probably means the commit message (since that's often the most accessible place). Code comments can help, but they're only attached to the new state so it may take some contortions to discuss the change. I've sometimes engaged in this when I think it's important enough (or where I may not think to find and look back at a commit message), but putting dates and discussions of how the old state used to be in comments feels somewhat wrong.

(I suspect that all of this is obvious, but Mark Dominus's article crystalized this in my mind so I feel like writing it down.)

ChangesHaveContext written at 22:51:23


The case for atomic types in programming languages

In My review of the C standard library in practice Chris Wellons says, as part of the overall discussion:

I’ve used the _Atomic qualifier in examples since it helps with conciseness, but I hardly use it in practice. In part because it has the inconvenient effect of bleeding into APIs and ABIs. As with volatile, C is using the type system to indirectly achieve a goal. Types are not atomic, loads and stores are atomic. [..]

Wellons is entirely correct here; at the CPU level (on current general purpose CPUs), specific operations are atomic, not specific storage locations. An atomic type is essentially a lie by the language. One language that embraced this reality is Go, which originally had no atomic types, only atomic access functions. On the other side is Rust; Rust has atomic types and a general discussion of atomic operations.

(In Go 1.19, Go added a number of atomic types, so now Go can be used with either approach.)

However, I feel that the lie of atomic types is a genuine improvement in almost all cases, because of the increase in usability and safety. The problem with only having atomic operations is the same as with optional error checking: you have to remember to always use them, even if the types you're operating on can be used with ordinary operations. As we all know, people can forget this, or they can think that they're clever enough to use non-atomic operations in this one special circumstance that is surely harmless.

Like forced error handling (whether through exceptions or option/result types), having atomic types means that you don't have a choice. The language makes sure that they're always used safely, and you as the programmer are relieved of one more thing to worry about and try to keep track of. Atomic types may be a bit of a lie, but they're an ergonomic lie that improves safety.

The question of whether atomic types should be separate things (as in Rust) or a qualifier on regular types (as in C) is something that you can argue over. It's clear that atomic types need extra operations, because there are important atomic operations (like compare and swap) that have no non-atomic equivalent. I tend to think that atomic types should be their own thing, because there are many standard operations that they can't properly support, at least not without potentially transforming simple code that you wrote into something much more complex. It's better to be honest about what atomic types can and can't do.

Sidebar: Why an atomic type qualifier has to bleed into APIs and ABIs

A type qualifier like C's _Atomic is a promise from the compiler to you that all (supported) operations on the object will be performed atomically. If you remove this qualifier as part of passing a variable around, you're removing this promise and now your atomic qualified thing might be accessed non-atomically. In other words, the atomic access is part of the API. It's not even necessarily safe to automatically promote a non-atomic object into being an atomic object as part of, say, a function call, because the code being called may reasonably assume that all access to the object is atomic.

CaseForAtomicTypes written at 23:05:49



This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.