How Go 1.15 improved converting small integer values to interfaces

August 12, 2020

In Go, interface values are famously implemented as a pair of pointers (see Russ Cox's Go Data Structures: Interfaces): a pointer to information about the type and a pointer to the value itself. This generally means that the value must be dynamically allocated on the heap, which means that it will contribute to the work that Go's garbage collector does.
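To make the two-pointer layout concrete, here is a minimal sketch that peeks at an empty interface value through unsafe. The ifaceWords struct, its field names, and the helper functions are made up for illustration. This relies on how the standard Go compiler currently lays out interface values, which is an implementation detail and not something the language guarantees.

```go
package main

import (
	"fmt"
	"unsafe"
)

// ifaceWords mirrors the two-word layout described above. The struct and its
// field names are hypothetical; this is not a public API.
type ifaceWords struct {
	typ  unsafe.Pointer // information about the dynamic type
	data unsafe.Pointer // pointer to the value itself
}

// words reinterprets an empty interface value as its two pointer words.
func words(i interface{}) ifaceWords {
	return *(*ifaceWords)(unsafe.Pointer(&i))
}

// intAt follows a data pointer that is known to point at an int.
func intAt(p unsafe.Pointer) int {
	return *(*int)(p)
}

func main() {
	a := words(100000)
	b := words(200000)
	s := words("hello")

	fmt.Println(a.typ == b.typ) // both hold an int, so the type words match
	fmt.Println(a.typ == s.typ) // int versus string, so they differ
	fmt.Println(intAt(a.data))  // the data word leads back to the value
}
```

The value 100000 is deliberately outside the 0 to 255 range, so this sketch does not depend on any small-integer special case.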

The Go 1.15 release notes mention an intriguing improvement in the runtime section:

Converting a small integer value into an interface value no longer causes allocation.

When I saw that, I immediately wondered how it works, and especially if Go's runtime was now sometimes using the value pointer field in interface values to directly store the value. (There are a number of languages that do this, using various approaches like tag bits to tell values from real pointers.)

The answer turns out to be pretty straightforward, and is in Go CL 216401 (merged in this commit, which may be easier to read). The Go runtime has a special static array of the first 256 integers (0 to 255), and when it would normally have to allocate memory to store an integer on the heap as part of converting it to an interface, it first checks to see if it can just return a pointer to the appropriate element in the array instead. This kind of static allocation of frequently used values is common in languages with lots of dynamic allocation; Python does something similar for small integers, for example (which can sometimes surprise you).
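The idea can be sketched in ordinary Go. This is not the runtime's code; smallValues and boxUint64 are hypothetical names standing in for the static array and for the conversion helper, and the array is filled in at startup here purely for brevity.

```go
package main

import "fmt"

// smallValues is a stand-in for the runtime's static array: element i holds i.
var smallValues [256]uint64

func init() {
	for i := range smallValues {
		smallValues[i] = uint64(i)
	}
}

// boxUint64 returns a pointer to storage holding v. Small values share a slot
// in the static array; anything larger gets a fresh heap allocation.
func boxUint64(v uint64) *uint64 {
	if v < uint64(len(smallValues)) {
		return &smallValues[v]
	}
	p := new(uint64)
	*p = v
	return p
}

func main() {
	a, b := boxUint64(7), boxUint64(7)
	c, d := boxUint64(1000), boxUint64(1000)
	fmt.Println(a == b, *a) // true 7: both point at the same static slot
	fmt.Println(c == d, *c) // false 1000: two separate allocations
}
```

Sharing storage like this is only safe because nothing ever writes through the returned pointers, which is the case for values stored in interfaces.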

(It turns out that Go previously had an optimization where if you were converting 0 to an interface value, it would return a pointer to a special static zero value. This new optimization for 0-255 replaces that.)

There is one special trick that Go plays here. The actual array is an array of uint64, but it reuses the same array for smaller sized values as well. On little endian systems like x86, this is fine as it stands because a pointer to a 64-bit value is also a valid pointer to that value interpreted as 32 or 16 bits (or 8 bits). But on big endian systems this isn't the case, so if Go is running on a big endian machine it bumps up the pointer so that it works properly (making it point to either the last two bytes or the last four bytes of the 8-byte value).

(On a little endian machine, the pointer is to the low byte of the value and the remaining bytes are all zero so it doesn't matter how many more of them you look at. On a big endian machine, the pointer is to the high byte, but the low byte is the thing that matters.)
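Here is a small sketch of that pointer adjustment, written with unsafe. The helper names (hostIsBigEndian, lowBytes, low8, low16, low32) are my own inventions for illustration, and the byte order is detected at run time here instead of being a build-time constant as it would be in the runtime.

```go
package main

import (
	"fmt"
	"unsafe"
)

// hostIsBigEndian reports the byte order of the machine running the program
// by looking at the first byte in memory of a known 16-bit value.
func hostIsBigEndian() bool {
	x := uint16(1)
	return *(*byte)(unsafe.Pointer(&x)) == 0
}

// lowBytes returns a pointer to the size low-order bytes of *p. On little
// endian that is the start of the value; on big endian the pointer has to be
// moved forward so that it covers the last size bytes of the 8-byte value.
func lowBytes(p *uint64, size uintptr) unsafe.Pointer {
	if hostIsBigEndian() {
		return unsafe.Pointer(uintptr(unsafe.Pointer(p)) + (8 - size))
	}
	return unsafe.Pointer(p)
}

func low8(p *uint64) uint8   { return *(*uint8)(lowBytes(p, 1)) }
func low16(p *uint64) uint16 { return *(*uint16)(lowBytes(p, 2)) }
func low32(p *uint64) uint32 { return *(*uint32)(lowBytes(p, 4)) }

func main() {
	v := uint64(200)
	// The same 8 bytes of storage read back as 200 at every width.
	fmt.Println(low8(&v), low16(&v), low32(&v))
}
```

On either kind of machine this prints 200 three times. The trick depends on the upper bytes of the uint64 being zero, which is always true for values in the 0 to 255 range.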

As bonus trivia for this change, this new array of 0-255 uint64 values was then reused to avoid allocating anything for one-byte strings in another change (this commit, CL 221979). Go previously had an array of bytes for this purpose, but why have two arrays? Big endian machines need the same sort of pointer bumping they did for small integers being converted to interface values, but little endian machines can once again use the pointers as is.
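As a rough sketch of how a one-byte string can borrow its storage from such an array, the following uses unsafe.String, which needs Go 1.20 or later. The names table, bigEndian, and oneByteString are hypothetical, and this is user-level code imitating the idea, not what the runtime itself does.

```go
package main

import (
	"fmt"
	"unsafe"
)

// table is a stand-in for the runtime's static array: element i holds i.
// It must never be modified after init, since strings are immutable.
var table [256]uint64

func init() {
	for i := range table {
		table[i] = uint64(i)
	}
}

// bigEndian reports whether the host stores the high byte first.
func bigEndian() bool {
	x := uint16(1)
	return *(*byte)(unsafe.Pointer(&x)) == 0
}

// oneByteString builds a length-1 string whose data pointer refers to the
// low byte of table[b], so no new memory is allocated for the string's byte.
func oneByteString(b byte) string {
	p := unsafe.Pointer(&table[b])
	if bigEndian() {
		// The low byte is the last of the eight bytes on big endian.
		p = unsafe.Pointer(uintptr(p) + 7)
	}
	return unsafe.String((*byte)(p), 1)
}

func main() {
	s := oneByteString('A')
	fmt.Println(s, len(s), s == "A")
}
```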

PS: There are runtime functions for converting 16-bit, 32-bit, and 64-bit values to interface values, in runtime/iface.go (they can be inlined in actual code), but I was puzzled because there is no runtime function for converting 8-bit values. It turns out that 8-bit values are directly handled by the compiler in walk.go, where it generates inline code that uses the staticuint64s array. This may be done directly in the compiler partly because it needs no fallback path for larger values, unlike the 16-bit, 32-bit, and 64-bit cases, since an 8-bit value will always be in staticuint64s.
