2020-05-14
Exploring munmap() on page zero and on unmapped address space
Over in the Fediverse, I ran across an interesting question on
munmap():
what does `munmap` on Linux do when address is set to 0? Somehow this succeeds on Linux but fails on FreeBSD. I'm assuming the semantics are different but cannot find any reference regarding to such behavior.
(There's also this additional note, and the short version of the answer is here.)
When I saw this, I was actually surprised that munmap() on Linux
succeeded, because I expected it to fail on any address range that
wasn't currently mapped in your process, and page zero is definitely
not mapped on Linux (or anywhere sane). So let's go to the SUS
specification for munmap(), where we can read in part:
The munmap() function shall fail if:
- [EINVAL] Addresses in the range [addr,addr+len) are outside the valid range for the address space of a process.
(Similar wording appears in the FreeBSD munmap() manpage.)
When I first read this wording, I assumed that this meant the current
address range of the process. This is incorrect in practice on Linux
and FreeBSD, and I think in theory as well (since POSIX/SUS talks
about 'of a process', not 'of this process'). On both of those
Unixes, you can munmap() at least some unused address space, as
we can demonstrate with a little test program that mmap()s
something, munmap()s it, and then munmap()s it again.
The difference between Linux and FreeBSD is in what they consider to be
'outside the valid range for the address space of a process'. FreeBSD
evidently considers page zero (and probably low memory in general) to
always be outside this range, and thus munmap() fails. Linux does not;
while it doesn't normally let you mmap() memory in that area, for
good reasons, it is not intrinsically outside the address space. If I'm
reading the Linux kernel code correctly, no low address range is ever
considered invalid, only address ranges that cross above the top of
user space.
(I took a brief look at the relevant FreeBSD code in vm_mmap.c,
and I think that it rejects any munmap() that extends below or
above the range of address space that the process currently has
mapped. This is actually more restrictive than I expected.)
In ultimately unsurprising news, OpenBSD takes a somewhat
different interpretation, one that's more in line with how I
expected munmap() to behave. The OpenBSD munmap() manpage says:
- [EINVAL] The addr and len parameters specify a region that would extend beyond the end of the address space, or some part of the region being unmapped is not part of the currently valid address space.
OpenBSD requires that you only munmap() things that are actually
mapped and disallows trying to unmap random sections of your potential
address space, even if they fall between the bottom and top of your
address space usage (where FreeBSD would allow it). Whether this
is completely POSIX compliant is an interesting but irrelevant
question, since I doubt the OpenBSD people would change this (and
I don't think they should).
One of the interesting things I've learned from looking into this
is that Linux, FreeBSD, and OpenBSD each sort of have a different
interpretation of what POSIX permits (assuming I'm understanding
the FreeBSD kernel code correctly). The Linux interpretation is
most clearly permitted, since it allows munmap() on anything that
might potentially be mappable under some circumstances. OpenBSD,
if it cares, would likely say that the 'valid range for the address
space of a process' is what it currently has mapped and so their
behavior is POSIX/SUS compliant, but this is clearly pushing the
interpretation in an unusual direction compared to a narrow,
specification-style reading of the wording (although it is the
behavior I expected). FreeBSD sort of splits the difference, possibly
for implementation reasons.
PS: The Linux munmap() manpage doesn't even
talk about 'the valid address space of a (or the) process' as a
reason for munmap() to fail; it only talks abstractly about the
kernel not liking addr or len.
Sidebar: The little test program
Here's the test program I used.
    #include <sys/mman.h>
    #include <stdio.h>
    #include <errno.h>
    #include <string.h>

    #define MAPLEN  (128*1024)

    int main(int argc, char **argv)
    {
      void *mp;

      puts("Starting mmap and double munmap test.");
      mp = mmap(0, MAPLEN, PROT_READ, MAP_ANON|MAP_SHARED, -1, 0);
      if (mp == MAP_FAILED) {
        printf("mmap error: %s\n", strerror(errno));
        return 1;
      }
      if (munmap(mp, MAPLEN) < 0) {
        printf("munmap error on first unmap: %s\n", strerror(errno));
        return 1;
      }
      if (munmap(mp, MAPLEN) < 0) {
        printf("munmap error on second unmap: %s\n", strerror(errno));
        return 1;
      }
      puts("All calls succeeded without errors, can munmap() unmapped areas.");
      return 0;
    }
I think that it's theoretically possible for something like this
program to fail on FreeBSD, if our mmap() established a new top
or bottom of the process's address space. In practice it's likely
that we will mmap() into a hole between the bottom of the address
space (with the program text) and the top of the address space
(probably with the stack).