2018-10-06
A deep dive into the OS memory use of a simple Go program
One of the enduring mysteries of actually using Go programs is understanding how much OS-level memory they use, as opposed to the various Go-level memory metrics exposed by runtime.MemStats. OS level memory use matters because it influences things like how much real memory your program needs and how likely it is to be killed by the OS in a low-memory situation, but there has always been a disconnect between OS level information and Go level information. After researching enough to write about how Go doesn't free heap memory back to the OS, I got sufficiently curious to really dig down into the details of a very simple program and now I'm going to go through them. All of this is for Go 1.11; other Go versions have had different behavior.
Our very simple program is going to do nothing except sit there so that we can examine its memory use:
package main

func main() {
	var i uint64
	for {
		i++
	}
}
(It turns out that we could use time.Sleep() to pause without dragging in extra complications, because it's actually handled directly in the runtime, despite nominally being in the time package.)
This simple looking program already has a complicated runtime environment, with several system goroutines operating behind the scene. It also has more memory use than you probably expect. Here's what its memory map looks like on my 64-bit Linux machine:
0000000000400000    316K r-x-- memdemo
000000000044f000    432K r---- memdemo
00000000004bb000     12K rw--- memdemo
00000000004be000    124K rw---   [ bss ]
000000c000000000  65536K rw---   [ anon ]
00007efdfc10c000  35264K rw---   [ anon ]
00007ffc088f1000    136K rw---   [ stack ]
00007ffc08933000     12K r----   [ vvar ]
00007ffc08936000      8K r-x--   [ vdso ]
ffffffffff600000      4K r-x--   [ vsyscall ]
 total           101844K
The vvar, vdso, and vsyscall mappings come from the Linux kernel; the '[ stack ]' mapping is the standard process stack created by the Linux kernel, and the first four mappings are all from the program itself (the actual compiled machine code, the read-only data, plain data, and then the zero'd data respectively). Go itself has allocated the two '[ anon ]' mappings in the middle, which are most of the program's memory use; we have one 64 MB mapping at 0x00c000000000 and one 34.4 MB mapping at 0x7efdfc10c000.
(The addresses for some of these mappings will vary from run to run.)
As described in Allocator Wrestling (see also, and), Go allocates heap memory (including the memory for goroutine stacks) in chunks of memory called spans that come from arenas. Arenas are 64 MB in size and are allocated at fixed locations; on 64-bit Linux, they start at 0x00c000000000. So this is our 64 MB mapping; it is the program's first arena, the only one necessary, which handles all normal Go memory allocation.
If we run our program under strace -e trace=%memory, we'll discover that the remaining mysterious mapping actually comes from a number of separate mmap() calls that the Linux kernel has merged together into one memory area. Here is the trace for our program:
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efdfe33c000
mmap(0xc000000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xc000000000
mmap(0xc000000000, 67108864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xc000000000
mmap(NULL, 33554432, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efdfc33c000
mmap(NULL, 2162688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efdfc12c000
mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efdfc11c000
mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efdfc10c000
So we have, in order, a 256 KB allocation, the 64 MB arena allocated at its fixed address, a 32 MB allocation, a slightly over 2 MB allocation, and two 64 KB allocations. Everything except the arena allocation is allocated at successively lower addresses next to each other and gets merged together into the single mapping starting at 0x7efdfc10c000. All of these allocations are internal allocations from the Go runtime, and I'm going to run down them in order.
The initial 256 KB allocation is for the first chunk of the Go runtime's area for persistent allocations. These are runtime things that will never be freed up and which can be (and are) allocated outside of the regular heap arenas. Various things are allocated in persistent allocations, and the persistent allocator mostly works in 256 KB chunks that it gets from the OS. Our first mmap() is thus the runtime starting to allocate from this area, which causes the allocator to get its first chunk from the OS. The memory for these persistent allocator chunks is mostly recorded in runtime.MemStats.OtherSys, although it's not the only thing that falls into that category and some persistent allocations are counted in different categories.
The 32 MB allocation immediately after our first arena is for the heap allocator's "L2" arena map. As the comments in runtime/malloc.go note, most 64-bit platforms (including 64-bit Linux) have only a single large L2 arena map, which has to be allocated when the first arena is allocated. The next allocation, which is 2112 KB (2 MB plus 64 KB), turns out to be for the heapArena structure for our newly allocated arena. It has two fields: the .bitmap field is 2 MB in size, and the .spans field is 64 KB (8192 8-byte pointers). This explains the odd size requested.
(If I'm reading the code correctly, the L2 arena map isn't accounted for in any runtime.MemStats value; this may be a bug. The heapArena structure is accounted for in runtime.MemStats.GCSys.)
The final two 64 KB allocations are for the initial version of a data structure used to keep track of all spans (set up in recordspan()) and for a data structure (gcBits) that is used in garbage collection (set up in newArenaMayUnlock()). The span tracking structure is accounted for in runtime.MemStats.OtherSys, while the gcBits structures are in runtime.MemStats.GCSys.
As your program uses more memory, I believe that in general you can expect more arenas to be allocated from the OS, and with each arena you'll also get another heapArena structure. I believe that the L2 arena map is only allocated once on 64-bit Unix. You will probably periodically have larger span data structures and more gcBits structures allocated, and you will definitely periodically have new 256 KB chunks allocated for persistent allocations.
(There are probably other sources of allocations from the OS in the Go runtime. Interested parties can search through the source code for calls to sysAlloc(), persistentalloc(), and so on. In the end everything apart from arenas comes from sysAlloc(), but there are often layers of indirection.)
PS: If you want to track down this sort of thing yourself, the easiest way to do it is to run your test program under gdb, set a breakpoint on runtime.sysAlloc, and then use where every time the breakpoint is hit. On most Unixes, this is the only low-level runtime function that allocates floating anonymous memory with mmap(); you can see this in, for example, the Linux version of low level memory allocation.
Go basically never frees heap memory back to the operating system
Over on Reddit's r/golang, I ran into an interesting question about Go's memory use as part of this general memory question:
[...] However Go is not immediately freeing the memory, at least from htop's perspective. What can I do to A) gain insight on when this memory will be made available to the OS, [...]
The usual question about memory usage in Go programs is when things will be garbage collected (which can be tricky). However, this person wants to know when Go will return free memory back to the operating system. This is a good question partly because programs often don't do very much of this (or really we should say the versions of malloc() that programs use don't do this), for various reasons. Somewhat to my surprise, it turns out that Go basically never returns memory address space to the OS, as of Go 1.11. In htop, you can expect normal Go programs to only ever stay the same size or grow, never to shrink.
(The qualification about Go 1.11 is important, because Go's memory handling changes over time. Back in 2014 with Go 1.5 or so, Go processes used a huge amount of virtual memory, but that's changed since then.)
The Go runtime itself initially allocates memory in relatively decent sized chunks of memory called 'spans', as discussed in the big comment at the start of runtime/malloc.go (and see also this and this (also)); spans are at least 8 KB, but may be larger. If a span has no objects allocated in it, it is an idle span; how many bytes are in idle spans is in runtime.MemStats.HeapIdle. If a span is idle for sufficiently long, the Go runtime 'releases' it back to the OS, although this doesn't mean what you think. Released spans are a subset of idle spans; when a span is released, it still counts as idle.
(In theory the number of bytes of idle spans released back to the operating system is runtime.MemStats.HeapReleased, but you probably want to read the comment about this in the source code of runtime/mstats.go.)
Counting released spans as idle sounds peculiar until you understand something important; Go doesn't actually give any memory address space back to the OS when a span is released. Instead, what Go does is to tell the OS that it doesn't need the contents of the span's memory pages any more and the OS can replace them with zero bytes at its whim. So 'released' here doesn't mean 'return the memory back to the OS', it means 'discard the contents of the memory'. The memory itself remains part of the process and counts as part of the process size (it may or may not count as part of the resident set size, depending on the OS), and Go can immediately use such a released idle span again if it wants to, just as it can a plain idle span.
(On Unix, releasing pages back to the OS consists of calling madvise() (Linux, FreeBSD) on them with either MADV_FREE or MADV_DONTNEED, depending on the specific Unix. On Windows, Go uses VirtualFree() with MEM_DECOMMIT. On versions of Linux with MADV_FREE, I'm not sure what happens to your RSS after doing it; some sources suggest that your RSS doesn't go down until the kernel starts actually reclaiming the pages from you, which may be some time later.)
As far as I can tell from inspecting the current runtime code, Go only very rarely returns memory that it has used back to the operating system by calling munmap() or the Windows equivalent. In particular, once Go has used memory for regular heap allocations, it will never be returned to the OS even if Go has plenty of released idle memory that's been untouched for a very long time (as far as I can tell). As a result, the process virtual size that you see in tools like htop is basically a high water mark, and you can expect it to never go down. If you want to know how much memory your Go program is really using, you need to carefully look at the various bits and pieces in runtime.MemStats, perhaps exported through net/http/pprof.