Wandering Thoughts archives

2020-05-14

Exploring munmap() on page zero and on unmapped address space

Over in the Fediverse, I ran across an interesting question on munmap():

what does `munmap` on Linux do when address is set to 0? Somehow this succeeds on Linux but fails on FreeBSD. I'm assuming the semantics are different but cannot find any reference regarding to such behavior.

(There's also this additional note, and the short version of the answer is here.)

When I saw this, I was actually surprised that munmap() on Linux succeeded, because I expected it to fail on any address range that wasn't currently mapped in your process and page zero is definitely not mapped on Linux (or anywhere sane). So let's go to the SUS specification for munmap(), where we can read in part:

The munmap() function shall fail if:

[EINVAL]
Addresses in the range [addr,addr+len) are outside the valid range for the address space of a process.

(Similar wording appears in the FreeBSD munmap() manpage.)

When I first read this wording, I assumed that this meant the current address range of the process. This is incorrect in practice on Linux and FreeBSD, and I think in theory as well (since POSIX/SUS talks about 'of a process', not 'of this process'). On both of those Unixes, you can munmap() at least some unused address space, as we can demonstrate with a little test program that mmap()s something, munmap()s it, and then munmap()s it again.

The difference between Linux and FreeBSD is in what they consider to be 'outside the valid range for the address space of a process'. FreeBSD evidently considers page zero (and probably low memory in general) to always be outside this range, and thus munmap() fails. Linux does not; while it doesn't normally let you mmap() memory in that area, for good reasons, it is not intrinsically outside the address space. If I'm reading the Linux kernel code correctly, no low address range is ever considered invalid, only address ranges that cross above the top of user space.

(I took a brief look at the relevant FreeBSD code in vm_mmap.c, and I think that it rejects any munmap() that extends below or above the range of address space that the process currently has mapped. This is actually more restrictive than I expected.)

In ultimately unsurprising news, OpenBSD takes a somewhat different interpretation, one that's more in line with how I expected munmap() to behave. The OpenBSD munmap() manpage says:

[EINVAL]
The addr and len parameters specify a region that would extend beyond the end of the address space, or some part of the region being unmapped is not part of the currently valid address space.

OpenBSD requires you to only munmap() things that are actually mapped and disallows trying to unmap random sections of your potential address space, even if they fall between the bottom and top of your address space usage (where FreeBSD would allow it). Whether this is completely POSIX compliant is an interesting but irrelevant question, since I doubt the OpenBSD people would change this (and I don't think they should).

One of the interesting things I've learned from looking into this is that Linux, FreeBSD, and OpenBSD each have a somewhat different interpretation of what POSIX permits (assuming I'm understanding the FreeBSD kernel code correctly). The Linux interpretation is the most clearly permitted, since it allows munmap() on anything that might potentially be mappable under some circumstances. OpenBSD, if it cares, would likely say that the 'valid range for the address space of a process' is what the process currently has mapped and so its behavior is POSIX/SUS compliant, but this clearly pushes the interpretation in an unusual direction compared to a narrow, specification-style reading of the wording (although it is the behavior I expected). FreeBSD sort of splits the difference, possibly for implementation reasons.

PS: The Linux munmap() manpage doesn't even talk about 'the valid address space of a (or the) process' as a reason for munmap() to fail; it only talks abstractly about the kernel not liking addr or len.

Sidebar: The little test program

Here's the test program I used.

#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

#define MAPLEN  (128*1024)

int main(void)
{
  void *mp;

  puts("Starting mmap and double munmap test.");
  /* Let the kernel pick where to put an anonymous mapping. */
  mp = mmap(0, MAPLEN, PROT_READ, MAP_ANON|MAP_SHARED, -1, 0);
  if (mp == MAP_FAILED) {
    printf("mmap error: %s\n", strerror(errno));
    return 1;
  }
  /* The first munmap() removes a real mapping and should always work. */
  if (munmap(mp, MAPLEN) < 0) {
    printf("munmap error on first unmap: %s\n", strerror(errno));
    return 1;
  }
  /* The second munmap() covers the same range, which is now unmapped. */
  if (munmap(mp, MAPLEN) < 0) {
    printf("munmap error on second unmap: %s\n", strerror(errno));
    return 1;
  }
  puts("All calls succeeded without errors, can munmap() unmapped areas.");
  return 0;
}

I think that it's theoretically possible for something like this program to fail on FreeBSD, if our mmap() established a new top or bottom of the process's address space. In practice it's likely that we will mmap() into a hole between the bottom of the address space (with the program text) and the top of the address space (probably with the stack).

unix/MunmapPageZero written at 23:46:42