2023-04-22
The two types of C programmers (a provocative thesis)
Here is a provocative thesis:
There are two types of C programmers: people who used C because it was their best or only option at the time, and people who chose C because they liked various of its properties.
Back in the day, C was somewhere between your best option and your only real option for doing certain sorts of programming. If you were writing a Unix program, for example, for quite a while C was your only real choice (then later you could consider C++). The people who came to C this way often found some of its virtues attractive, but they weren't necessarily strongly committed to it; they'd picked C as an expedient choice.
Meanwhile, there are people who looked at C and felt (and often still feel) that it was very much the language for them (out of those available at the time). They feel strongly drawn to C's virtues, often explicitly in contrast to other languages, and today they may still program in C out of choice. If and when they switch languages they often pick languages that are as close to the virtues of C (as each person sees them) as possible.
I am the first sort of C programmer. I like some aspects of C but there are others that I more or less always found to be kind of a pain; as I've kind of said before, I no longer want to have to think about memory management and related issues. So I've wound up mostly in the Go camp, despite Go's garbage collection being anathema to a certain sort of C programmer.
(Thinking about memory management can be fun every so often, just as it can be fun to optimize anything, but I want it to be an optimization, not a mandatory thought.)
My perception of the second sort of C programmer is that if they've moved to any more recent mainstream language, it's probably Rust. Rust is certainly not C-like in some respects, but in a lot of ways its virtues are the most C-like out of all of the mainstream languages (and some of the ways it's better than C are important).
(Like all provocative theses, this is a generalization and simplification. People had and have many reasons for choosing C. And yes, there are non-mainstream languages that are trying to be 'a better C' in ways that are significantly different from Rust.)
2023-04-19
An interesting mistake I made with a (Go) SSH client API
We have a custom system for NFS mount authentication on our Linux fileservers that works, in part, by having a SSH client connect to would-be NFS clients to verify their SSH host key. In the process of writing the code for this, I made an interesting mistake that is fundamentally enabled by a long-standing OpenSSH naming confusion.
What you have in a SSH known hosts file is a list of (public) keys, each of which has a key type like 'ssh-rsa', 'ssh-ed25519', 'ecdsa-sha2-nistp256', and so on. However, what you use in the protocol is a (host) key algorithm. When you make a SSH connection to a server (in golang.org/x/crypto/ssh), you supply both a key and the host key algorithm(s) to use with it (obviously you need the key type to match the algorithm(s)).
For a long time, the names of the key types were exactly the names of their key algorithms; you had 'ssh-ed25519' keys and a 'ssh-ed25519' host key algorithm, for example. In both the protocol and typical APIs for dealing with it (including Go), these are stringly typed; for example, in Go the ssh.ClientConfig's HostKeyAlgorithms field is a slice of strings. This makes it natural to write code that finds the type of a host key and sets it as the allowed host key algorithm. This is more or less exactly what my code did for a long time.
Then OpenSSH's defaults changed to not use the 'ssh-rsa' key algorithm because it uses SHA1, which is now too weak a cryptographic hash. You can still use 'ssh-rsa' keys, but you need the new host key algorithms of 'rsa-sha2-256' or 'rsa-sha2-512'. If your code has not been updated and still does the straightforward old thing, it will take ssh-rsa keys and ask for 'ssh-rsa' as the host key algorithm, your modern OpenSSH based servers will say 'we don't support that', and you will be sad and perhaps surprised.
(If you have both ssh-rsa and ssh-ed25519 keys for most but not all hosts, your surprise and sadness may be deferred until the first of the exceptions is upgraded to a modern Ubuntu version.)
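As a minimal sketch of the difference (not my actual code; the function name is made up for illustration), here is a stringly typed mapping from a known host key's type to the host key algorithms to ask for. The result is the sort of thing you would put in ssh.ClientConfig's HostKeyAlgorithms. I'm assuming you want to keep offering 'ssh-rsa' last so that old servers still work.

```go
package main

import "fmt"

// hostKeyAlgorithms is a hypothetical helper. It returns the host key
// algorithm names to offer for a known host key of the given type.
// For most key types the algorithm name is the same as the key type
// name; 'ssh-rsa' keys are the exception.
func hostKeyAlgorithms(keyType string) []string {
	if keyType == "ssh-rsa" {
		// Prefer the SHA-2 based algorithms and keep the old SHA-1
		// based one last (an assumption, for old servers).
		return []string{"rsa-sha2-512", "rsa-sha2-256", "ssh-rsa"}
	}
	// The old straightforward behavior: key type == key algorithm.
	return []string{keyType}
}

func main() {
	for _, kt := range []string{"ssh-ed25519", "ssh-rsa"} {
		fmt.Printf("%s -> %v\n", kt, hostKeyAlgorithms(kt))
	}
}
```

The old version of this was effectively just the last 'return', which is why it kept working for so long.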
You can criticize the API here for being stringly typed, but I think that it's actually natural to do something like that, especially if you're initially designing the API in the old world, before "rsa-sha2-256" and when "ssh-rsa" was the (only) key algorithm you used with 'ssh-rsa' keys. In that world, the official names of key types and key algorithms were the same; making them two separate programming types and forcing people to explicitly convert between them is likely to strike people as perverse. Do you want to write or even have to use a function that converts 'keytype.Ed25519' into 'keyalgo.Ed25519'? Most people are going to say no. Just call it 'Ed25519' and be done.
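To make the alternative concrete, here is a sketch of what separate programming types might look like. Everything here (KeyType, KeyAlgo, the constants, the Algorithms method) is hypothetical and is not what golang.org/x/crypto/ssh does; it's only meant to show the explicit conversion step that would have looked perverse in the old world.

```go
package main

import "fmt"

// KeyType and KeyAlgo are hypothetical distinct types; the compiler
// will not let you use one where the other is expected.
type KeyType string
type KeyAlgo string

const (
	KeyTypeEd25519 KeyType = "ssh-ed25519"
	KeyTypeRSA     KeyType = "ssh-rsa"
)

const (
	KeyAlgoRSA       KeyAlgo = "ssh-rsa"
	KeyAlgoRSASHA256 KeyAlgo = "rsa-sha2-256"
	KeyAlgoRSASHA512 KeyAlgo = "rsa-sha2-512"
)

// Algorithms is the explicit conversion from a key type to the host
// key algorithms that can be used with it.
func (t KeyType) Algorithms() []KeyAlgo {
	switch t {
	case KeyTypeRSA:
		return []KeyAlgo{KeyAlgoRSASHA512, KeyAlgoRSASHA256, KeyAlgoRSA}
	default:
		// Still one to one for everything else.
		return []KeyAlgo{KeyAlgo(t)}
	}
}

func main() {
	fmt.Println(KeyTypeEd25519.Algorithms())
	fmt.Println(KeyTypeRSA.Algorithms())
	// var algo KeyAlgo = KeyTypeRSA // would not compile: different types
}
```

In the old world the 'default' case would have been the entire function, which is the sort of thing that makes people ask why the types are separate at all.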
(One implication of this is that what is a good API depends on when it's designed. An API that was good when it was designed, when SSH key types did map one to one with key algorithms, can retroactively become a not-good API later, when some key types now map differently.)
PS: I was lucky in that my code was structured to accumulate a list of 'key types', which were really 'key algorithms', so I could just update it to add some more key algorithms if we hit a 'ssh-rsa' key. If I'd had a slightly different code structure I might have had to do a more significant restructuring.
2023-04-10
Failing to build a useful pre Go 1.21 static Go toolchain on Linux
Recently I wrote about how Go 1.21 will have a static toolchain on Linux, where the 'go' program will be statically linked so you can freely copy even a locally built version from Linux distribution to Linux distribution. If you're an innocent person, like I was before I started my journey, you might think that achieving this yourself in Go 1.20 and earlier isn't hard. In fact, it turns out that I failed, although my failure was disguised by the situation in Go 1.21, where you get a fully working static 'go' binary regardless of what you do and whether or not it had any actual effect on the build process.
In general, there are two easy ways to get a statically linked normal Go program, if your Go program uses only the core standard library packages. First, you can build with 'CGO_ENABLED=0' in your environment, which completely disables use of CGO and with it dynamically linking your Go program. Second, you can use '-tags osusergo,netgo' to select the pure-Go versions of the two standard packages (os/user and net) that normally cause your Go program to be dynamically linked by surprise.
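Since this is a Go entry, here is a small sketch of those two ways expressed as 'go build' invocations constructed with os/exec. The helper names are made up for illustration; the sketch only constructs and prints the commands, and you would call cmd.Run() to actually build something.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// staticBuildNoCgo (a hypothetical helper) returns a 'go build'
// command for pkg that has CGO_ENABLED=0 in its environment.
func staticBuildNoCgo(pkg string) *exec.Cmd {
	cmd := exec.Command("go", "build", pkg)
	// If CGO_ENABLED is already set in our environment, the later
	// entry wins.
	cmd.Env = append(os.Environ(), "CGO_ENABLED=0")
	return cmd
}

// staticBuildPureGoTags (also hypothetical) returns a 'go build'
// command for pkg that selects the pure-Go os/user and net code.
func staticBuildPureGoTags(pkg string) *exec.Cmd {
	return exec.Command("go", "build", "-tags", "osusergo,netgo", pkg)
}

func main() {
	cmds := []*exec.Cmd{
		staticBuildNoCgo("."),
		staticBuildPureGoTags("."),
	}
	for _, cmd := range cmds {
		fmt.Println(cmd.Args)
	}
}
```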
Unfortunately, neither of these ways really works with building the
Go toolchain itself.
The easier failure is with build tags, because there's no way to pass build tags into the normal way to build Go from source. You can pass arguments to 'go tool compile', but this doesn't let you set build tags; as far as I can tell, those are controlled at a different level in the build process, in the selection of what files to compile as part of a package. By the time source files are being compiled (what 'go tool compile' does), it's too late.
If you build with 'CGO_ENABLED=0' the result works in that you'll get a statically linked Go toolchain and you can compile normal Go programs. However, your newly built Go toolchain will never build CGO-enabled Go executables, even if it normally would (for example if you build a Go program using the net package without setting 'netgo'). This is certainly not how you normally want a Go toolchain to behave and it may give you real problems if you want to build programs that require CGO to work.
The third way to build statically linked Go programs is to set the 'go tool link' flags that tell it to create a static executable using the external linker, which are '-extldflags=-static -linkmode=external'. What this does is instruct Go to ask the system linker ('ext[ernal] ld') to build a static executable. In a 'go build' command line, you pass this with '-ldflags="..."'; when building Go itself you set this in the 'GO_LDFLAGS' environment variable. This works, but in practice it may not do what you want, because you can't usefully statically link a program that looks up hostnames through glibc. The 'go' toolchain needs to look up hostnames to fetch packages, and if built without the 'netgo' tag it may try to do this through glibc, and then at runtime you need the exact version of glibc that it was linked against.
(It's possible to get away with this at runtime if your nsswitch.conf is straightforward enough that Go will use its internal Go-based lookup functions, so static linking can be a step forward.)
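For your own programs (as opposed to the Go toolchain), one way to not depend on what nsswitch.conf looks like is to ask for Go's built-in resolver explicitly. This isn't something I did here; it's just a sketch of the standard net.Resolver PreferGo field, with a made-up helper name and an arbitrary timeout.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// lookupPureGo (a hypothetical helper) resolves host with a resolver
// that prefers Go's built-in lookup code over the C library's.
func lookupPureGo(host string) ([]string, error) {
	r := &net.Resolver{PreferGo: true}
	// The five second timeout is an arbitrary choice for the sketch.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return r.LookupHost(ctx, host)
}

func main() {
	addrs, err := lookupPureGo("localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```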
One of the things all of this investigation has shown me is that having a statically linked Go toolchain in Go 1.21 was probably not a trivial change. That may partially explain why it wasn't done earlier.
PS: As I've found out, these days you probably have to set '-linkmode=external' in order for '-extldflags=-static' to do anything, because Go mostly seems to use its 'internal' linker. If there is a way to make Go's internal linker create static executables that are linked against glibc, I don't know what it is (and it's not the 'go tool link' -d argument). Given all of the issues with statically linking executables on Linux (and other systems), I suspect that there just isn't one.
2023-04-07
Go 1.21 will (likely) have a static toolchain on Linux
A while back, I lamented on the Fediverse:
Current status: yak shaving a Go 1.17 built on Ubuntu 20.04 so I can build Go 1.20 on 20.04 so I can build a binary with Go 1.20 that will run on 20.04 for reasons.
The easy way to solve this problem would have been to download an official binary release tarball, because these are built so that they'll run on pretty much any Linux (presumably they're built on a system with a very old glibc, since they're actually dynamically linked, with glibc symbol versioning only requiring 2.3.2 or later). Because I already had a whole set of Go source trees, I picked the hard way.
At this point you might wonder why the Go toolchain is dynamically linked against the system glibc. Although I haven't tried to analyze symbol usage, the obvious assumption is that it's dynamically linked because various Go tools want to download packages over the network, which requires looking up DNS names, which is a very common cause of dynamically linking to glibc.
The good news, as pointed out to me by @magical, is that in Go 1.21 and later the plan is for the compiler to be built using the pure Go resolver only and to be a static executable. Relevant reading here is apparently issue #53862 and issue #57007. As far as I know, the elements of this plan have already landed in the Go development version; my current development Go binaries are static binaries.
; ./go version
go version devel go1.21-66cac9e1e4 Fri Apr 7 23:34:21 2023 +0000 linux/amd64
; ldd ./go
not a dynamic executable
Unless the Go developers revert this for some reason, the 'go' toolchain in Go 1.21 and later will be static executables on Linux.
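If you want to check this sort of thing from Go instead of with ldd, a rough sketch using the standard debug/elf package follows. This is my approximation of 'is this a dynamic executable', not what ldd actually does: it treats an ELF file as dynamic if it has a PT_INTERP program header or lists any needed shared libraries. The function name is made up.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isDynamic (a hypothetical helper) reports whether the ELF file at
// path asks for a dynamic loader or imports any shared libraries.
func isDynamic(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	// A PT_INTERP program header names the dynamic loader to use.
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			return true, nil
		}
	}
	// Otherwise, look for DT_NEEDED entries (shared libraries).
	libs, err := f.ImportedLibraries()
	if err != nil {
		return false, err
	}
	return len(libs) > 0, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: isdynamic ELF-FILE")
		return
	}
	dyn, err := isDynamic(os.Args[1])
	if err != nil {
		fmt.Println("cannot check:", err)
		return
	}
	fmt.Println("dynamic executable:", dyn)
}
```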
(Doing this before Go 1.21 is tricky for reasons beyond the scope of this entry.)
This is a nice little quality of life improvement for people (like me) mostly working on recent Linuxes but who periodically have to deal with older ones. It won't automatically make your own programs version-independent, but for them you can use '-tags osusergo,netgo' when you build or install Go programs.
(You might wonder why I didn't just build the Go program I needed to run on Ubuntu 20.04 with those flags in the first place. The answer is that I was distracted by the flow of circumstances. First I tried to run the program on a 20.04 machine in addition to some 22.04 ones, and got glibc version errors, so I tried to rebuild it on 20.04 to be more universal, then I had the 'go' compiler toolchain not work with the same problem, and by that point my mental focus was on 'make the compiler toolchain work'.)