Union types ('enum types') would be complicated in Go

December 1, 2024

Every so often, people wish that Go had enough features to build some equivalent of Rust's Result type or Option type, often so that Go programmers could have more ergonomic error handling. One core requirement for this is what Rust calls an Enum and what is broadly known as a Union type. Unfortunately, doing a real enum or union type in Go is not particularly simple, and it definitely requires significant support by the Go compiler and the runtime.

At one level we can easily do something that looks like a Result type in Go, especially now that we have generics. You make a generic struct that has private fields for an error, a value of type T, and a flag that says which is valid, and then give it some methods to set and get values and ask it which it currently contains. If you ask for a sort of value that's not valid, it panics. However, this struct necessarily has space for three fields, where the Rust enums (and generally union types) act more like C unions, only needing space for the largest type possible in them and sometimes a marker of what type is in the union right now.
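As a rough illustration of the struct-based approach just described, here is a minimal sketch. All of the names (Result, Ok, Fail, IsErr, Value, Err, errBoom) are made up for this example, and I've used constructor functions instead of setter methods to keep it short. It isn't meant as the definitive way to do this.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a made-up error used only for the demonstration.
var errBoom = errors.New("boom")

// Result is a hypothetical generic result type. It always has room
// for all three fields, unlike a real union type.
type Result[T any] struct {
	val   T
	err   error
	isErr bool // says which of val and err is valid
}

// Ok builds a Result that holds a value.
func Ok[T any](v T) Result[T] { return Result[T]{val: v} }

// Fail builds a Result that holds an error.
func Fail[T any](e error) Result[T] { return Result[T]{err: e, isErr: true} }

// IsErr reports whether the Result currently holds an error.
func (r Result[T]) IsErr() bool { return r.isErr }

// Value returns the value and panics if the Result holds an error.
func (r Result[T]) Value() T {
	if r.isErr {
		panic("Result: Value called on an error result")
	}
	return r.val
}

// Err returns the error and panics if the Result holds a value.
func (r Result[T]) Err() error {
	if !r.isErr {
		panic("Result: Err called on a value result")
	}
	return r.err
}

func main() {
	good := Ok(42)
	bad := Fail[int](errBoom)
	fmt.Println(good.IsErr(), good.Value())
	fmt.Println(bad.IsErr(), bad.Err())
}
```

Every Result[T] here is the size of a T plus an interface value plus the flag (and any padding), no matter which one is actually in use.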

(The Rust compiler plays all sorts of clever tricks to elide the enum marker if it can store this information in some other way.)

To understand why we need deep compiler and runtime support, let's ask why we can't implement such a union type today using Go's unsafe package to perform suitable manipulation of a suitable memory region. Because it will make the discussion easier, let's say that we're on a 64-bit platform and our made up Result type will contain either an error (which is an interface value) or a [2]int64 array. On a 64-bit platform, both of these types occupy 16 bytes, since an interface value is two pointers in a trenchcoat, so it looks like we should be able to use the same suitably-aligned 16-byte memory area for each of them.
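If you want to check the size claim for yourself, a small sketch like the following should do it. The sizes helper and the ptrSize constant are names I've invented for the example. It only measures sizes and doesn't try to overlay the two types.

```go
package main

import (
	"fmt"
	"unsafe"
)

// ptrSize is the size of a pointer-sized word on this platform.
const ptrSize = unsafe.Sizeof(uintptr(0))

// sizes returns the in-memory sizes, in bytes, of an error interface
// value and of a [2]int64 array.
func sizes() (errSize, arrSize uintptr) {
	var e error
	var a [2]int64
	return unsafe.Sizeof(e), unsafe.Sizeof(a)
}

func main() {
	e, a := sizes()
	fmt.Printf("error: %d bytes, [2]int64: %d bytes\n", e, a)
}
```

On a 64-bit platform I would expect both numbers to be 16. On a 32-bit platform the interface value shrinks to 8 bytes while the array stays at 16.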

However, now imagine that Go is performing garbage collection. How does the Go runtime know whether or not our 16-byte memory area contains two live pointers, which it must follow as part of garbage collection, or two 64-bit integers, which it definitely cannot treat as pointers and follow? If we've implemented our Result type outside of the compiler and runtime, the answer is that garbage collection has no idea which it currently is. In the Go garbage collector, it's not values that have types, but storage locations, and Go doesn't provide an API for changing the type of a storage location.

(Internally the runtime can set and change information about what pieces of memory contain pointers, but this is not exposed to the outside world; it's part of the deep integration of runtime memory allocation and the runtime garbage collector.)

In Go, without support from the runtime and the compiler, the best you can do is store an interface value or perhaps an unsafe.Pointer to the actual value involved. However, this probably forces a separate heap allocation for the value, which is less efficient in several ways than the compiler-supported version that Rust has. On the positive side, if you store an interface value you don't need to have any marker for what's stored in your Result type, since you can always extract that from the interface with a suitable type assertion.
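Here is a minimal sketch of the interface-based version, again with invented names (BoxedResult, BoxOk, BoxFail, errBoxed). It assumes that T doesn't itself implement the error interface, because otherwise the type assertions can't tell a value from an error.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoxed is a made-up error used only for the demonstration.
var errBoxed = errors.New("boxed failure")

// BoxedResult is a hypothetical result type with a single interface
// field. The interface's dynamic type is the only marker of what is
// stored. Assumption: T does not implement error.
type BoxedResult[T any] struct {
	v any
}

// BoxOk stores a value. Converting it to an interface may force a
// separate heap allocation for the value.
func BoxOk[T any](v T) BoxedResult[T] { return BoxedResult[T]{v: v} }

// BoxFail stores an error.
func BoxFail[T any](e error) BoxedResult[T] { return BoxedResult[T]{v: e} }

// IsErr reports whether the stored value is an error.
func (r BoxedResult[T]) IsErr() bool {
	_, ok := r.v.(error)
	return ok
}

// Value returns the stored T; the type assertion panics otherwise.
func (r BoxedResult[T]) Value() T { return r.v.(T) }

// Err returns the stored error; the type assertion panics otherwise.
func (r BoxedResult[T]) Err() error { return r.v.(error) }

func main() {
	good := BoxOk(int64(7))
	bad := BoxFail[int64](errBoxed)
	fmt.Println(good.IsErr(), good.Value())
	fmt.Println(bad.IsErr(), bad.Err())
}
```

The struct is now only the size of one interface value, but the garbage collector is happy because that storage location always has the same type.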

The corollary to all of this is that adding union types to Go as a language feature wouldn't be merely a modest change in the compiler. It would also require a bunch of work in how such types interact with garbage collection, Go's memory allocation systems (which in the normal Go toolchain allocate things with pointers into separate memory arenas from things without them), and likely other places in the runtime.

(I suspect that Go is pretty unlikely to add union types given this, since you can have much of the API that union types present with interface types and generics. And in my view, union types like Result wouldn't be really useful without other changes to Go's type system, although that's another entry.)

PS: Something like this has come up before in generic type sets.
