Wandering Thoughts archives

2019-03-08

Exploring how and why interior pointers in Go keep entire objects alive

Recently I was reading Things that surprised me in go (via), where one of the surprising things is that an interior pointer to a struct's field keeps the entire struct from being garbage collected. Here is a cut down and slightly changed version of the example from the article:

func getInteriorPointer() *int {
    type bigStruct struct {
        smallThing   int
        someBigThing [aLot]int
    }
    b := bigStruct{smallThing: 3}
    return &b.smallThing
}

After this returns, the entire bigStruct will remain in memory, not just the int of smallThing. As the author notes, in a hypothetical perfect GC this wouldn't happen; only the int would remain.

The direct reason that Go works this way is that it puts all data for a compound object like a struct or an array into one block of memory. This makes for efficient compound objects (both in terms of memory and in terms of access), at the cost of keeping the entire object alive as a unit. There are a number of versions of this, not just with structs; for example, a slice of a string will keep the entire string alive, not just the substring you've sliced out. The same is true of slices of arrays (such as arrays of bytes), even if you've explicitly limited the capacity of the slice so it can't be quietly grown using the original backing array.

But let's go deeper here. Let's suppose we want a perfect GC that still preserves this property of compound objects. Unfortunately there are at least two significant complications, one in determining what memory is still in use and one in actually freeing the memory.

Today, with indivisible compound objects, Go can keep a simple map from memory addresses to the block of memory that they keep alive and the garbage collector doesn't have to care about the type of a pointer, just its address, because the address alone is what determines how much memory is kept alive. In a world where compound objects are divisible, we must somehow arrange for &b.smallThing and &b to be treated differently by the garbage collector, either by giving the garbage collector access to type information or by having them point to different addresses.

(When planning out a scheme for this, remember that structs can be nested inside other structs. Go makes such struct embedding idiomatic, and it's natural to put your embedded struct at the start of your new struct; in fact I think it's the accepted style for this.)

The simplest way to deal with freeing only part of the memory of an object is for Go to have copying garbage collection. In such a hypothetical situation, you would just copy the int instead of the whole struct and everything would work out nicely. However Go does not have a copying GC today, so at a minimum freeing only part of a compound object leaves you with increased fragmentation of free memory; what was one big block of memory for the compound object gets broken up into fragments of free space with bits of still allocated space in various spots.

Unfortunately this partial freeing doesn't work in practice in Go's current memory allocation system in many cases. To simplify, Go splits up allocations into 'size classes' and then performs block allocation with a given size class (if you want to read more, start here). This is highly efficient and has various other advantages, but it means that you simply can't turn an allocation from one size class into free memory for another size class. Effectively you free either all of it or none of it.

You could change Go's memory allocation system to make this possible, but it's very likely that such a change would have performance impacts (and possibly memory usage impacts). With Go's current non-copying GC, you might well find that it often simply wasn't worth it to free up part of an object, either in memory savings or in CPU usage. Certainly you would have more complex memory allocation and garbage collection, and those are already complicated enough.

(With a copying GC, your new copied object can be allocated in the right size class, since the GC definitely knows how big it is now.)

programming/GoInteriorPointerGC written at 22:33:13; Add Comment


Page tools: See As Normal.
Search:
Login: Password:
Atom Syndication: Recent Pages, Recent Comments.

This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.