Why the kernel does mmap()
(beyond efficiency)
I recently read Daniel Ehrenberg's Why not mmap? and ran across the following snippet:
mmap() is a system call, implemented by the kernel. Why? As far as I can tell, what I described above could be implemented in user-space: user-space has page fault handlers and file read/write operations.
One way to put the reason that the kernel handles mmap() is that there is a good argument that mmap() exists because of shared libraries. And shared is the operative word here. A kernel implementation of mmap() makes it easy to share mapped objects between different processes, in several different ways.
First and most obviously, the kernel can simply share the memory if several processes mmap() the same thing, such as much of the C shared library. Second, the kernel can share memory using copy-on-write semantics; if you map an object 'private', the kernel can still share pages that you don't actually write to. This is commonly used for making necessary per-process modifications to mapped shared libraries, such as relocations or read-mostly data. Third, the kernel can have a unified buffer cache so that it doesn't matter whether processes use mmap() or read() and write(); they all see the same thing and the kernel doesn't duplicate memory.
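To make the first two sorts of sharing concrete, here is a minimal sketch using Python's mmap module on a Unix system. It is only an illustration: the function name is made up, and three mappings inside one process stand in for mappings made by separate processes. Python's ACCESS_WRITE corresponds to a shared mapping and ACCESS_COPY to a private, copy-on-write one.

```python
import mmap
import tempfile

def map_shared_and_private():
    """Map one file twice shared and once private; return what each view sees."""
    with tempfile.TemporaryFile() as f:
        f.write(b"original")
        f.flush()  # make sure the bytes are in the file before mapping it
        fd = f.fileno()
        # ACCESS_WRITE gives a MAP_SHARED mapping, ACCESS_COPY a MAP_PRIVATE one.
        shared_a = mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE)
        shared_b = mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE)
        private = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
        try:
            # Writing to the private mapping copies the page for this mapping
            # only; the shared views (and the file) keep the old contents.
            private[0:8] = b"PRIVATE!"
            after_private = (shared_a[:], private[:])
            # A store through one shared mapping is visible through the other.
            shared_a[0:8] = b"SHARED!!"
            after_shared = shared_b[:]
            return after_private, after_shared
        finally:
            for m in (shared_a, shared_b, private):
                m.close()

if __name__ == "__main__":
    print(map_shared_and_private())
```

Until the private mapping is written to, the kernel is free to back all three mappings with the same physical pages, which is the point of the second sort of sharing.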
It might be possible to emulate the first sort of sharing at user level between cooperating processes (you'd use System V shared memory). I don't think that the second sort of sharing is doable at user level unless you have new address space modification syscalls that allow you to overlay chunks of regular address space on System V shared memory (you'd normally do something like that with mmap() and friends, but we're assuming that they don't exist). The third sort of sharing is impossible without kernel assistance, and I don't think one should under-estimate how important it is; read() vs mmap() coherence is what makes it relatively trivial to replace read() with mmap() in programs.
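The read() vs mmap() coherence can be seen directly in a small sketch like the following. It assumes a system with a unified buffer cache (Linux, for example); the function name is hypothetical, and os.pread() and os.pwrite() stand in for plain read() and write() at a given offset.

```python
import mmap
import os
import tempfile

def read_write_vs_mmap():
    """Mix mmap stores with pread()/pwrite() on one file and report both views."""
    with tempfile.TemporaryFile() as f:
        f.write(b"aaaaaaaa")
        f.flush()  # empty Python's own buffer; only OS-level calls from here on
        fd = f.fileno()
        with mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE) as m:
            # Store through the shared mapping, with no flush()/msync() call.
            m[0:4] = b"MMAP"
            # A read()-style call on the same file already sees the store.
            seen_by_read = os.pread(fd, 8, 0)
            # A write()-style update shows up in the mapping just as directly.
            os.pwrite(fd, b"FILE", 4)
            seen_by_mmap = m[:]
        return seen_by_read, seen_by_mmap

if __name__ == "__main__":
    print(read_write_vs_mmap())
```

Neither side has to do anything special to see the other's changes, because both are operating on the same kernel-managed pages.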
The first and the second sort of sharing are both very important for getting the full benefit from shared libraries. If you simply load shared libraries into processes without sharing, all you save is disk space; if you share the memory of shared libraries between processes, you also save precious RAM. Back in the days of SunOS 4, with small-memory machines and large shared libraries for things like graphics toolkits, this made a real difference.
(People did shared libraries with special custom hacks before there was mmap(), but mmap() had a number of advantages including being clearly generic and providing a clean answer for how to deal with different processes loading the same shared library at different addresses in memory.)