An interesting way to leak memory with Go slices

October 10, 2017

Today I was watching Prashant Varanasi's Go release party talk Analyzing production using Flamegraphs, and starting at around 28 minutes in the talk he covered an interesting and tricky memory leak involving slices. For my own edification, I'm going to write down a version of the memory leak here and describe why it happens.

To start with, the rule of memory leaks in garbage collected languages like Go is that you leak memory by retaining unexpected references to things. The garbage collector will find and free things for you, but only if they're actually unused. If you're retaining a reference to them, they stick around. Sometimes this is ultimately straightforward (perhaps you're deliberately holding on to a small structure but not realizing that it has a reference to a bigger one), but sometimes the retention is hiding in the runtime implementation of something. Which brings us around to slices.

Simplified, the code Prashant was dealing with was maintaining a collection of currently used items in a slice. When an item stopped being used, it was rotated to the end of the slice and then the slice was shrunk by truncating it (maintaining the invariant that the slice only had used items in it). However, shrinking a slice doesn't shrink its backing array; in Go terms, it reduces the length of a slice but not its capacity. With the underlying backing array untouched, that array retained a reference to the theoretically discarded item and all other objects that the item referenced. With a reference retained, even one invisible to the code, the Go garbage collector correctly saw the item as still used. This resulted in a memory leak, where items that the code thought had been discarded weren't actually freed up.

Now that I've looked at the Go runtime and compiler code and thought about the issue a bit, I've come to the obvious realization that this is a generic issue with any slice truncation. Go never attempts to shrink the backing array of a slice, and it's probably impossible to do so in general since a backing array can be shared by multiple slices or otherwise referenced. This obviously strongly affects slices that refer to things that contain pointers, but it may also matter for slices of plain old data things, especially if they're comparatively big (perhaps you have a slice of Points, with three floats per point).

For slices containing pointers or structures with pointers, the obvious fix (which is the fix adopted in Uber's code) is to nil out the trailing pointer(s) before you truncate the slice. This retains the backing array at full size but discards references to other memory, and it's the other memory where things really leak.

For slices where the actual backing array may consume substantial memory, I can think of two things to do here, one specific and one generic. The specific thing is to detect the case of 'truncation to zero size' in your code and specifically null out the slice itself, instead of just truncating it with a standard slice truncation. The generic thing is to explicitly force a slice copy instead of a mere truncation (as covered in my entry on slice mutability). The drawback here is that you're forcing a copy, which might be much more expensive. You could optimize this by only forcing a copy in situations where the slice capacity is well beyond the new slice's length.

Sidebar: Three-index slice truncation considered dangerous (to garbage collection)

Go slice expressions allow a rarely used third index to set the capacity of the new slice in addition to the starting and ending points. You might thus be tempted to use this form to limit the slice as a way of avoiding this garbage collection issue:

slc = slc[:newlen:newlen]

Unfortunately this doesn't do what you want it to and is actively counterproductive. Setting the new slice's capacity doesn't change the underlying backing array in any way or cause Go to allocate a new one, but it does mean that you lose access to information about the array's size (which would otherwise be accessible through the slice's capacity). The one effect it does have is forcing a subsequent append() to reallocate a new backing array.

Written on 10 October 2017.
« JavaScript as the extension language of the future
Understanding M.2 SSDs in practice and how this relates to NVMe »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Tue Oct 10 00:26:09 2017
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.