In Go, pointers (mostly) don't go with slices in practice

August 29, 2021

When I wrote about why it matters that map values are unaddressable in Go, there were a set of Twitter replies from Sean Barrett:

Knowing none of the details & not being a go programmer, I would have guessed that map values aren't addressable because they're in a dynamically-sized hash table so they need to get relocated behind the user's back; getting the address of a value slot would break that.

But I'd also have assumed Go has dynamically-extensible arrays, and the same argument would apply in that case, so maybe not?

This sparked an article about how Go maps store their values and keys, so today I'm writing about the second part of Barrett's reply, about "dynamically-extensible arrays", because the situation here in Go is peculiar (especially from the perspective of a C or C++ programmer trying to extend their intuitions to Go). Put simply, Go has pointers and it has something like dynamically extensible arrays, but in practice you can't use pointers to slices or slice elements. Trying to combine the two is a recipe for pain, confusion, and weird problems.

On the surface, things look straightforward. The Go version of dynamically extensible arrays are slices. Slices and elements of slices are among Go's addressable values, so both of the following pointer creations are legal:

var s []int
s = append(s, 10, 20, 30)
// pointer to a slice element
//  and the slice
pe := &s[0]
ps := &s

At this moment you can dereference pe and ps and get the results you expect, including if you modify s[0] with eg 's[0] = 100'. Where things go off the rails is if you do anything else with the slice s, such as:

s = append(s, 50)
// return the slice from a function
return s

There are two problems. The first problem, possibly exposed by the append(), is that slice elements actually live in an anonymous backing array. Modifying the size of a slice (such as by appending another element to it) may create a new version of this anonymous backing array, and when the array is reallocated, any pointers to the old one aren't updated to point to the new one and so won't see any changes to it. So if you have the following code:

pe = &s[0]
s = append(s, 50)
s[0] = 100

The value of '*pe' may or may not now be 100, depending on whether the append() created a new version of the backing array.

The second problem is that slices themselves are passed, returned, and copied by value, which doesn't quite do what you might think because slices are lightweight things. A slice is a length, a reference to the anonymous backing array, and a capacity. Copying the slice copies these three, but doesn't copy the anonymous backing array itself, which means that many slices can refer to the same anonymous backing array (and yes this can get confusing and create fun problems).

When you take a pointer to a slice, you get a pointer to the current version of this tuple of information for the slice. This pointer may or may not refer to a slice that anyone else is using; for instance:

ps := &s
s = append(s, 50)

At this point, '*ps' may or may not be the same thing as 's', and so it might or might not have the new '50' element at the end. The more time that passes between taking a pointer to a slice and the slice being further manipulated, the less likely it is that 'ps' points to anything useful. If the slice 's' is returned from a function the return is copy by value, and so ps definitely no longer points to the live slice that the caller is using, although ps might have the same length and refer to the same anonymous backing array.

Update: It's been pointed out that this isn't true in the limited example here. In Go, variables like s are storage locations, so although the append() may return a different slice value, this different value will overwrite the old one in s and the ps pointer will still point to the current version of the slice. However, this isn't the case if the append() happens in a different function (after you either return s or pass it to a function as an argument).

This leads to the situation mentioned on Twitter by Tom Cheng:

> regular GC keeps pointers to the old version alive if necessary.

Wait... what? so if i get a pointer into an array, then resize the array, then get a pointer to the same index, i'll have 2 valid pointers to 2 completely different objects??

(For 'array', read 'dynamically extensible array', so a slice. The answer is yes.)

It's possible to use pointers to slices or to slice elements in limited situations, if you're very careful with what you're doing with them (or know exactly what you're doing and why). But in general, pointers to slices and slice elements don't do what you want.

Honestly, this is a strange and peculiar situation, although Go programmers have acclimatized to it. To programmers from other languages, such as C or C++, the concept of pointers to dynamically extensible arrays seems like a perfectly decent idea that surely should exist and work in Go. Well, it exists, and it "works" in the sense that it yields results and doesn't crash your program, but it doesn't "work" in the sense of doing what you'd actually want.


Comments on this page:

I've definitely created bugs before by taking pointers to slice elements when I was trying to be clever.

I think conceptually, pointers in Go should be thought of as copy/reuse annotations. A plain value is copied when passed to a function or new variable, and a pointer reuses the original object when passed. In the case of a pointer to a slice element, you're telling it to reuse and copy the same thing, which leads to bugs when they get out of sync.

There are two solutions. One is to not pass a pointer, but an index, so instead of passing &s[i], you pass s and i. That works, but can be awkward in a lot of situations. The other is to make the slice a slice of pointer elements, so that the reuse annotations match, like s := make([]*int); s = append(s, new(int)). In most cases, the second approach is easier.

By @therealplato at 2021-08-30 11:40:57:

The workaround for the append risk is always use append to replace the original:

a = append(a, b)

Written on 29 August 2021.
« What my first Linux was, and its context
Some notes on OpenZFS's new 'draid' vdev redundancy type »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Aug 29 00:42:58 2021
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.