Go generics: the question of types made from generic types and type sets

February 13, 2022

In yesterday's entry on the 'any' confusion in Go generics, I brought up the fact that you can't create types (or variables of types) using generic types that are instantiated with type sets. That sounds abstract, so let's make it concrete:

type Result[T any] []T
type Toable[T any] interface {
  Fred(s string) T
  Bar(i int) T
}
type Addable interface {
  ~uint32 | ~float64
}

// The following are both invalid
var r2 Result[Toable]
var r3 Result[Addable]

Of course, you also can't declare plain variables of type sets:

// Also both invalid
var a Toable
var b Addable

There are a number of people who would like a future version of Go to support this. However, there are two important and very open questions about it: what it would mean, and how it would be implemented.

In the abstract, what b means is probably straightforward; most people would assume that it's a variable that can hold either a uint32 or a float64, since those are the two underlying types allowed by the type set Addable. What a means is less clear, since Toable is a parameterized type set. The most likely intended meaning is that it can hold a value of a type that satisfies Toable for some type T, which in this case is to say that it has 'Fred(s string)' and 'Bar(i int)' methods that return the same type.

The first open question is whether either a or b (or both) hold interface values, concrete values, or some new third type of value that would have to be invented and added to Go in a coherent way. In the case of a, it certainly seems natural for it to hold interface values, but people might not be happy if b held interface values, with the runtime indirection that that implies. But on the other hand, it seems that you'd need some sort of type assertion to extract concrete values even from b.

The next question is what can you do with a and b. For instance, consider:

// in a different package
var C Addable

// in our package
b = b + package.C

Is this valid? If it does, how can it even work? What happens if at runtime the underlying type of b is float64 and the underlying type of C is uint32?

We get into equally exciting problems with a:

var e int
d := a.Fred("test")
e = a.Bar(10)

What is the type of d and how is it determined? What happens if the actual type of a at runtime is Toable[string], and so the second method call is attempting to assign a string to an int?

One answer is that you can't do anything with a and b except type assert them into specific concrete types; you can't call methods on them (even if you know they have the methods because of their type set) or do arithmetic (even if you know the arithmetic is allowed in general because of their type set). I suspect that people would find this unsatisfactory. However, other answers seem likely to substantially increase the potential for runtime panics in ordinary looking Go code, which is also not a good thing.

This brings us to the implementation questions, especially since Go is a deliberately simple language that attempts to be straightforward and obvious in most of its concrete types (including interface values, which have a straightforward two pointer representation). Go prefers fixed size values with simple representations (although they can hide internal complexity behind pointers, as maps and channels do). It's difficult to see how this could be readily achieved without simply making all such type set values be interfaces and then probably forbidding any operations on them except type assertions.

If these type set values are to be represented as their concrete types, there are some big questions about storage allocations. For instance, consider the backing array for 'r3', which is a slice that holds either uint32 values (which take up 4 bytes) or float64 values (which take up 8 bytes). Does the backing array use an element size that can hold the largest possible element? How does anything tell which element is which type? Do we have to invent a new internal Go representation that contains a type tag of some sort? And people can create type sets that contain large structs among their options.

(One way this might happen naturally is if you're creating a generic type set that includes types from other packages, which you're treating as opaque. If they happen to be implemented as fixed size arrays or as structs, you could get a surprise.)

Next, consider a version of Addable that also allows ~string. Now r3 may contain a mixture of 8-byte elements, some of which are pointers (the string values) and some of which are not (the float64 values). How does the Go garbage collector sort that one out? It will need to have (and access) type information on a per-element basis.

(Interface values are simpler for the garbage collector and for the Go runtime in general because they have a uniform representation, regardless of what type of interface they are.)

There are almost certainly no answers here that don't make Go a more complicated language with a more complicated implementation. To the extent that some useful features might be extracted from allowing type sets to be used as types, I think that they should be built with other new mechanisms, mechanisms that specifically address the problems they're intended to solve.

(I believe that a popular problem is a desire for what Rust calls 'enum variants with associated data'. I think that people who want this in Go would be better served by specifically designing something for it, although I suspect it may be too complicated to make the Go developers happy.)


Comments on this page:

By anthony at 2022-02-14 09:55:10:

I think that by allowing allowing type sets to be used as types, we will make the language even more complex than 1.18 will be (with generics introduction). And you know as well that generics can easily get you in the rabbit hole of adding "features" to them.

It seems to me the only workable approach for a slice over a typeset would be that all of the elements of the slice must be the same type (or interface) – no different from how slices already work. The slice as a whole can be a slice over one of the types in the constraint, or it can be a slice over another, but not a slice of individual elements each separately being any of the types that satisfy the constraint.

No?

By cks at 2022-02-15 14:19:59:

The problem with that restriction is that it creates a potential runtime panic in the following 'obviously looks right' code:

var res []Toable
var r1, r2 Toable
r1 = func1()
r2 = func2()
res = append(res, r1)
res = append(res, r2)

All of these variables are the same type set, so it certainly looks like you should be able to put r1 and r2 in the same slice, but there's no reason that their specific types will be the same; if the slice is restricted (at runtime) to a single specific type of elements, this has to panic. Here I'm assuming you can declare functions returning a type set instead of a specific type, but you can create this situation with global variables too.

For extra fun this code is fine if Toable is a type, including an interface type. This means that you have to know that Toable is a type set in order to understand that the code is dangerous; otherwise it looks like innocent code using regular Go types.

For extra fun this code is fine if Toable is a type, including an interface type.

Yeah, that’s a hard nut to crack. What I was thinking of wouldn’t allow for the code to be written like you showed, but not only (I realise now) would it be so onerous in practice as to be useless (which is bad enough), it would also significantly diminish this “obviously looks right” property of Go code at the same time. A steep price to pay in exchange for getting an unusable feature…

Written on 13 February 2022.
« The 'any' confusion in Go generics between type constraints and interfaces
Beware of trying to compare the size of subtrees with du »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Feb 13 23:09:09 2022
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.