Go generics: the question of types made from generic types and type sets

February 13, 2022

In yesterday's entry on the 'any' confusion in Go generics, I brought up the fact that you can't create types (or variables of types) using generic types that are instantiated with type sets. That sounds abstract, so let's make it concrete:

type Result[T any] []T
type Toable[T any] interface {
  Fred(s string) T
  Bar(i int) T
type Addable interface {
  ~uint32 | ~float64

// The following are both invalid
var r2 Result[Toable]
var r3 Result[Addable]

Of course, you also can't declare plain variables of type sets:

// Also both invalid
var a Toable
var b Addable

There are a number of people who would like a future version of Go to support this. However, there are two important and very open questions about it: what it would mean, and how it would be implemented.

In the abstract, what b means is probably straightforward; most people would assume that it's a variable that can hold either a uint32 or a float64, since those are the two underlying types allowed by the type set Addable. What a means is less clear, since Toable is a parameterized type set. The most likely intended meaning is that it can hold a value of a type that satisfies Toable for some type T, which in this case is to say that it has 'Fred(s string)' and 'Bar(i int)' methods that return the same type.

The first open question is whether either a or b (or both) hold interface values, concrete values, or some new third type of value that would have to be invented and added to Go in a coherent way. In the case of a, it certainly seems natural for it to hold interface values, but people might not be happy if b held interface values, with the runtime indirection that that implies. But on the other hand, it seems that you'd need some sort of type assertion to extract concrete values even from b.

The next question is what can you do with a and b. For instance, consider:

// in a different package
var C Addable

// in our package
b = b + package.C

Is this valid? If it does, how can it even work? What happens if at runtime the underlying type of b is float64 and the underlying type of C is uint32?

We get into equally exciting problems with a:

var e int
d := a.Fred("test")
e = a.Bar(10)

What is the type of d and how is it determined? What happens if the actual type of a at runtime is Toable[string], and so the second method call is attempting to assign a string to an int?

One answer is that you can't do anything with a and b except type assert them into specific concrete types; you can't call methods on them (even if you know they have the methods because of their type set) or do arithmetic (even if you know the arithmetic is allowed in general because of their type set). I suspect that people would find this unsatisfactory. However, other answers seem likely to substantially increase the potential for runtime panics in ordinary looking Go code, which is also not a good thing.

This brings us to the implementation questions, especially since Go is a deliberately simple language that attempts to be straightforward and obvious in most of its concrete types (including interface values, which have a straightforward two pointer representation). Go prefers fixed size values with simple representations (although they can hide internal complexity behind pointers, as maps and channels do). It's difficult to see how this could be readily achieved without simply making all such type set values be interfaces and then probably forbidding any operations on them except type assertions.

If these type set values are to be represented as their concrete types, there are some big questions about storage allocations. For instance, consider the backing array for 'r3', which is a slice that holds either uint32 values (which take up 4 bytes) or float64 values (which take up 8 bytes). Does the backing array use an element size that can hold the largest possible element? How does anything tell which element is which type? Do we have to invent a new internal Go representation that contains a type tag of some sort? And people can create type sets that contain large structs among their options.

(One way this might happen naturally is if you're creating a generic type set that includes types from other packages, which you're treating as opaque. If they happen to be implemented as fixed size arrays or as structs, you could get a surprise.)

Next, consider a version of Addable that also allows ~string. Now r3 may contain a mixture of 8-byte elements, some of which are pointers (the string values) and some of which are not (the float64 values). How does the Go garbage collector sort that one out? It will need to have (and access) type information on a per-element basis.

(Interface values are simpler for the garbage collector and for the Go runtime in general because they have a uniform representation, regardless of what type of interface they are.)

There are almost certainly no answers here that don't make Go a more complicated language with a more complicated implementation. To the extent that some useful features might be extracted from allowing type sets to be used as types, I think that they should be built with other new mechanisms, mechanisms that specifically address the problems they're intended to solve.

(I believe that a popular problem is a desire for what Rust calls 'enum variants with associated data'. I think that people who want this in Go would be better served by specifically designing something for it, although I suspect it may be too complicated to make the Go developers happy.)

Written on 13 February 2022.
« The 'any' confusion in Go generics between type constraints and interfaces
Beware of trying to compare the size of subtrees with du »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Feb 13 23:09:09 2022
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.