How Go 1.15 improved converting small integer values to interfaces
In Go, interface values are famously implemented as a pair of pointers (see Russ Cox's Go Data Structures: Interfaces); a pointer to information about the type and a pointer to the value itself. This generally means that the value must be dynamically allocated in the heap, which means that it will contribute to the work that Go's garbage collection does.
The Go 1.15 release notes mention an intriguing improvement in the runtime section:
Converting a small integer value into an interface value no longer causes allocation.
When I saw that, I immediately wondered how it works, and especially if Go's runtime was now sometimes using the value pointer field in interface values to directly store the value. (There are a number of languages that do this, using various approaches like tag bits to tell values from real pointers.)
The answer turns out to be pretty straightforward, and is in Go CL 216401 (merged in this commit, which may be easier to read). The Go runtime has a special static array of the first 256 integers (0 to 255), and when it would normally have to allocate memory to store an integer on the heap as part of converting it to an interface, it first checks to see if it can just return a pointer to the appropriate element in the array instead. This kind of static allocation of frequently used values is common in languages with lots of dynamic allocation; Python does something similar for small integers, for example (which can sometimes surprise you).
(It turns out that Go previously had an optimization where if you were converting 0 to an interface value, it would return a pointer to a special static zero value. This new optimization for 0-255 replaces that.)
There is one special trick that Go plays here. The actual array is
an array of uint64
, but it reuses the same array for smaller sized
values as well. On little endian systems like x86, this
is fine as it stands because a pointer to a 64-bit value is also a
valid pointer to that value interpreted as 32 or 16 bits (or 8
bits). But on big endian systems this isn't the case, so if Go is
running on a big endian machine it bumps up the pointer so that it
works properly (making it point to either the last two bytes or the
last four bytes of the 8-byte value).
(On a little endian machine, the pointer is to the low byte of the value and the remaining bytes are all zero so it doesn't matter how many more of them you look at. On a big endian machine, the pointer is to the high byte, but the low byte is the thing that matters.)
As bonus trivia for this change, this new array of 0-255 uint64
values was then reused for avoiding allocating anything for one-byte
strings in another change (this commit,
CL 221979).
Go previously had an array of bytes for this purpose, but why have
two arrays. Big endian machines need the same sort of pointer bumping
they did for small integers being converted to interface values,
but little endian machines can once again use the pointers as is.
PS: There are runtime functions for converting 16, 32, and 64 bit
values to interface values, in runtime/iface.go
(they can be inlined in actual code), but I was puzzled because
there is no runtime function for converting 8 bit values. It turns
out that 8-bit values are directly handled by the compiler in
walk.go,
where it generates inline code that uses the staticuint64s
array.
This may be done directly in the compiler partly because it needs
no fallback path for larger values, unlike the 16, 32, and 64 bit
cases, since an 8 bit value will always be in staticuint64s
.
|
|