The history of Unix's confusing set of low-level ways to allocate memory

June 7, 2018

Once upon a time, the Unix memory map of a process was a very simple thing. You had text, which was the code of the program (and later read-only data), the stack, which generally started at the top of memory and grew down, initialized data right after the text, and then bss (for variables and so on that started out as zero). At the very top of the bss, ie the highest address in the process's data segment, was what was called the program break (or, early on, just the break). The space between the program break and the bottom of the stack was unused and not actually in your process's address space, or rather it started out as unused. If you wanted to get more free memory that your program could use, you asked the operating system to raise this point, with what were even in V7 two system calls: brk() and sbrk(). This is directly described in the description of brk():

char *brk(addr) [...]

Brk sets the system's idea of the lowest location not used by the program (called the break) to addr (rounded up to the next multiple of 64 bytes on the PDP11, 256 bytes on the Interdata 8/32, 512 bytes on the VAX-11/780). Locations not less than addr and below the stack pointer are not in the address space and will thus cause a memory violation if accessed.

Unix programs used brk() and sbrk() to create the heap, which is used for dynamic memory allocations via things like malloc(). The heap in classical Unix was simply the space you'd added between the top of the bss and the current program break. Usually you didn't call brk() yourself but instead left it to the C library's memory allocation functions to manage for you.

(There were exceptions, including the Bourne shell's very creative approach to memory management.)

All of this maintained Unix's simple linear model of memory, even as Unix moved to the fully page-based virtual memory of the DEC Vax. When functions like malloc() ran out of things on their free list of available space, they'd increase the break, growing the process's memory space up, and use the new memory as more space. If you free()d the right things to create a block of unused space at the top of the break, malloc() and company might eventually call brk() or sbrk() to shrink the program's break and give the memory back to the OS, but you probably didn't want to count on that.

This linear memory simplicity had its downsides. For example, fragmentation was a real worry and unless you felt like wasting memory it was difficult to have different 'arenas' for different sorts of memory allocation. And, as noted, Unix programs rarely shrank the amount of virtual memory that they used, which used to matter a lot.

Then, in SunOS 4, Unix got mmap(), which lets people add (and remove) pages of virtual memory anywhere in the process's memory space, not just right above the program's break (or just below the bottom of the stack). This includes anonymous mappings, which are just pages of memory exactly like the pages of memory that you add to the heap by calling sbrk(). It didn't take the people writing implementations of malloc() very long to realize that they could take advantage of this in various ways; for example, they could mmap() several different chunks of address space and use them for arenas, or they could directly allocate sufficiently large objects by direct mmap() (and then directly free them back to the operating system by dropping the mappings). Pretty soon people were using mmap() not just to map files into memory but also to allocate general dynamic memory (which was still called the 'heap', even if it was no longer continuous and linear).

Over time, there's been a tendency for more and more memory allocation libraries and systems to get most or all of their memory from Unix through mmap(), not by manipulating the old-school heap by using sbrk() to change the program break. Often using mmap() only is simpler, and it's also easier to coexist with other memory allocation systems because you're not all fighting over the program break; each mmap() allocation can be manipulated separately by different pieces of code, and all you have to do is worry about not running out of address space (which is generally not a worry on modern 64-bit systems).

(For example, the Go runtime allocates all of its memory through mmap().)

Today, it's generally safe to assume that the memory for almost any large single memory allocation will be obtained from the Unix kernel by mmap(), not by growing the classical heap through sbrk(). In some C libraries and some environments, smaller memory allocations may still come from the classical heap; if you're curious, you can tell by pointing a system call tracer at a program to see if it even calls sbrk() or brk(). How frequently used brk() is probably depends on the Unix (and on Linux, on the C library). I know that GNU libc does use brk() based allocation for programs that only make small allocations, for example /usr/bin/echo.

(Using the classical heap through brk() has some minor advantages, including that it usually doesn't create an additional kernel virtual memory area and those usually have some costs associated with them.)

The current state of low-level Unix memory allocation has thus wound up being somewhat confusing, but that's Unix for you; our current situation is the result of a complicated historical evolution that has been surprisingly focused on backward compatibility. I don't think anyone has ever seriously proposed throwing out brk() entirely, although several of the BSDs call it a historical curiosity or a legacy interface (OpenBSD, FreeBSD), and I suspect that their C libraries never use them for memory allocation.

(This entry was sparked by reading Povilas Versockas' Go Memory Management.)


Comments on this page:

This made me curious, so I looked. On current macOS, brk(2) starts their description with the boldfaced text:

The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management.

The “see also” section includes end(3) which I had never heard of, which is a similar legacy shim:

These routines provide a stopgap measure to programs that use the UNIX link-editor defined symbols. Use of these routines is very strongly discouraged. …

So use at your own risk, and only if you know what your [sic] doing. Or better yet, convert the program to use the appropriate Mach or Mach-O functions. If you are trying to allocate memory use vm_allocate(2), if you are trying to find out about your address space use vm_region(2) and if you are trying to find out where your program is loaded use the dyld(3) functions.

This has been a fascinating side-trip into memory land. I vaguely knew, for instance, that apr uses a lot of pools (arenas), but I never gave much thought to how they were implemented under the hood.

FreeBSD actually did throw away sbrk, in a way:

it was never implemented for AArch64 and RISC-V.

Very few ports broke. Important ones (some emacs variant IIRC?) were fixed to not use sbrk.

Written on 07 June 2018.
« The downsides of processing files using too large a buffer size
Networks form through usage and must be maintained through it »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Thu Jun 7 01:15:20 2018
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.