2018-09-27
Learning about Go's unaddressable values and slicing
Dave Cheney recently posted another little Go pop quiz on Twitter, and as usual I learned something interesting from it. Let's start with his tweet:
#golang pop quiz: what does this program print?
    package main

    import (
        "crypto/sha1"
        "fmt"
    )

    func main() {
        input := []byte("Hello, playground")
        hash := sha1.Sum(input)[:5]
        fmt.Println(hash)
    }
To my surprise, here is the answer:
./test.go:10:28: invalid operation sha1.Sum(input)[:5] (slice of unaddressable value)
There are three reasons why we are getting this error. To start with, the necessary enabling factor is that sha1.Sum() has an unusual return value. Most things that return some bytes return a slice, and this code would have worked with slices. But sha1.Sum() returns that odd beast, a fixed-size array ([20]byte, to be specific), and since Go returns things by value, that means it really does return a 20-byte array to main(), not, say, a pointer to it.
That leaves us with the concept of unaddressable values, which are the opposite of addressable values. The careful technical version is in the Go specification in Address operators, but the hand waving summary version is that most anonymous values are not addressable (one big exception is that you can take the address of a composite literal with &). Here the return value of sha1.Sum() is anonymous, because we're immediately slicing it. Had we stored it in a variable and thus made it non-anonymous, the code would have worked:
    tmp := sha1.Sum(input)
    hash := tmp[:5]
The final piece of the puzzle is why slicing was an error. That's because slicing an array specifically requires that the array be addressable (this is covered at the end of Slice expressions). The anonymous array that is the return value of sha1.Sum() is not addressable, so slicing it is rejected by the compiler.
(Storing the return value into our tmp variable makes it addressable. Well, it makes tmp and the value in it addressable; the return value from sha1.Sum() sort of evaporates after it's copied into tmp.)
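That the slice really does point at tmp's own storage can be checked with a small experiment. This is only a sketch; the function name aliases is hypothetical and exists just to make the check easy to run.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// aliases reports whether writing through a slice of a local array variable
// changes the array itself. It is a hypothetical helper for this sketch.
func aliases(input []byte) bool {
	tmp := sha1.Sum(input) // tmp is a variable, so it is addressable
	hash := tmp[:5]        // the slice refers to tmp's storage, not a copy
	before := tmp[0]
	hash[0] = before + 1 // byte arithmetic wraps around, so this always differs
	return tmp[0] != before
}

func main() {
	fmt.Println(aliases([]byte("Hello, playground"))) // prints "true"
}
```

Since the slice shares memory with tmp, the array has to stay alive for as long as the slice does, which is exactly the storage that would have to be conjured up silently if slicing the bare return value were allowed.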
I don't know why the designers of Go decided to put this restriction on what values are addressable, although I can imagine various reasons. For instance, allowing the slicing operation here would require Go to silently materialize heap storage to hold sha1.Sum()'s return value (and then copy the value to it), which would then live on for however long the slice did.
(Since Go returns all values on the stack, as described in "The Go low-level calling convention on x86-64", this would require a literal copy of the data. This is not a big deal for the 20-byte result of sha1.Sum(); I'm pretty sure that people routinely return and copy bigger structures.)
PS: A number of things throughout the Go language specification require or only operate on addressable things. For example, assignment mostly requires addressability.
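As a sketch of what this means for assignment, consider fields of struct values that come out of a function call or a map. The type point and the helpers origin and bump below are made up for this example. The lines that the compiler would reject are left as comments so that the program still builds.

```go
package main

import "fmt"

type point struct{ x, y int }

// origin returns a point by value, so the result is not addressable.
func origin() point { return point{} }

// bump is a hypothetical helper showing the usual workaround for a struct
// stored in a map: copy it out, change the copy, and store it back.
func bump(m map[string]point, key string) {
	// m[key].x++ would not compile: m[key] is not addressable, so neither
	// is its field x.
	p := m[key]
	p.x++
	m[key] = p // a map index expression by itself is a valid assignment target
}

func main() {
	// origin().x = 1 would not compile either, for the same reason.
	p := origin()
	p.x = 1 // p is a variable, so its fields are addressable

	m := map[string]point{"a": p}
	bump(m, "a")
	fmt.Println(m["a"].x) // prints 2
}
```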
Sidebar: Method calls and addressability
Suppose you have a type T and also some methods defined on *T, eg *T.Op(). Much like Go allows you to do field references without dereferencing pointers, it allows you to call the pointer methods on a non-pointer value:
    var x T
    x.Op()
Here Go makes this shorthand for the obvious '(&x).Op()' (this is covered in Calls, at the bottom). However, because this shortcut requires taking the address of something, it requires addressability. So, what you can't do is this:
    // afunc() returns a T
    afunc().Op()

    // But this works:
    var x T = afunc()
    x.Op()
I think I've seen people discuss this Go quirk of method calls, but at the time I didn't fully understand what was going on and what exactly made a method call not work because of it.
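Here is a self-contained version of the same situation that can actually be compiled. T, Op and afunc follow the names used above, but their bodies are my own stand-ins, and aptr is an extra hypothetical function added to show the pointer-returning case. The calls that the compiler would reject are left as comments.

```go
package main

import "fmt"

// T and its method are stand-ins for the T and Op discussed in the text.
type T struct{ n int }

// Op has a pointer receiver, so calling it requires a *T.
func (t *T) Op() { t.n++ }

// afunc returns a T by value; its result is not addressable.
func afunc() T { return T{} }

// aptr returns a *T, so no address needs to be taken to call Op on it.
func aptr() *T { return &T{} }

func main() {
	// afunc().Op() would not compile: the return value is not addressable,
	// so Go cannot rewrite the call as (&afunc()).Op().

	x := afunc()
	x.Op()           // shorthand for (&x).Op(), allowed because x is a variable
	fmt.Println(x.n) // prints 1

	p := aptr()
	p.Op()           // the function result is already a pointer
	fmt.Println(p.n) // prints 1

	// T{}.Op() would not compile either, but taking the address of a
	// composite literal explicitly is allowed:
	(&T{}).Op()
}
```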
(Note that this shorthand conversion is fundamentally different from how *T has all of the methods of T, which has come up before.)