Understanding the basic shape of Unix virtual memory management
Although their implementations vary in detail, every modern Unix system
(ie, every one with mmap()) has some basic constraints that shape the
general outline of their virtual memory system. While I'm talking about
virtual memory statistics, it's worth running down
this basic shape and what creates it.
(I say this partly because I spent part of writing yesterday's entry sorting bits of this out in my own head.)
The basic mmap() operation is to glue some resource (such as a
portion of a file) into the process's memory address space. Multiple
processes can each map the same resource into their memory space; when
they do, all processes need to share the same pages of physical RAM in
order to keep everyone's view of the resource in sync (in fact you want
this to happen for write() based IO as well). This implies that a
process's address space is effectively composited together from a bunch
of entities representing different memory areas.
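As a small illustration of this sharing, here is a Python sketch that maps the same temporary file twice with MAP_SHARED and then also writes to it with pwrite(). The function name demo_shared_mapping and the offsets are made up for the example, and the part about write() based IO showing up in the mappings assumes a system with a unified page cache.

```python
import mmap
import os
import tempfile

def demo_shared_mapping():
    """Map one file twice and show that both mappings see the same pages."""
    size = mmap.PAGESIZE
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)  # give the file one page of zeros
        # Two independent MAP_SHARED mappings of the same resource, standing
        # in for two processes that have each mmap()'d the file.
        a = mmap.mmap(fd, size, flags=mmap.MAP_SHARED)
        b = mmap.mmap(fd, size, flags=mmap.MAP_SHARED)
        a[0:5] = b"hello"           # store through the first mapping
        seen_via_b = bytes(b[0:5])  # read it back through the second mapping
        # write() style IO goes through the same pages (assuming a unified
        # page cache), so the existing mappings see it without remapping.
        os.pwrite(fd, b"WORLD", 8)
        seen_after_write = bytes(a[8:13])
        a.close()
        b.close()
        return seen_via_b, seen_after_write
    finally:
        os.close(fd)
        os.unlink(path)
```

Two mappings in one process are only a stand-in here; the same thing should hold for mappings made by separate processes.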
However, this compositing needs to involve a layer of indirection. Processes don't all map a resource at the same address (for example, a shared library may be mapped at many different addresses), plus processes can overlay private changes on top of shared resources (eg, copy on write for various things). This implies a two step mapping; a process has 'virtual memory areas' holding information used to build its own page table (and to track private versions of pages), and these then point to a shared data structure that keeps track of the actual resource, what pages it has in RAM, and so on.
(If we construe 'process' broadly, processes sometimes share VMAs; threads traditionally run in a shared address space, for example.)
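To make the two step mapping concrete, here is a toy Python model of it. This is only a sketch of the general shape, not how any real kernel lays things out; the class names SharedResource and VMA, the fixed page size, and the example addresses are all invented for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for the toy model

@dataclass
class SharedResource:
    """The shared object: tracks which pages of the resource are in RAM."""
    name: str
    pages_in_ram: dict = field(default_factory=dict)  # page index -> bytes

    def get_page(self, index):
        # Pretend to read the page from disk the first time it is needed.
        return self.pages_in_ram.setdefault(index, bytes(PAGE_SIZE))

@dataclass
class VMA:
    """One process's view of a resource: where it is mapped, plus private pages."""
    start: int                # virtual address where this process mapped it
    resource: SharedResource  # the second step: a pointer to the shared object
    private_pages: dict = field(default_factory=dict)  # copy on write copies

    def read(self, addr):
        index, offset = divmod(addr - self.start, PAGE_SIZE)
        page = self.private_pages.get(index)
        if page is None:
            page = self.resource.get_page(index)
        return page[offset]

    def write_private(self, addr, value):
        index, offset = divmod(addr - self.start, PAGE_SIZE)
        if index not in self.private_pages:
            # Copy on write: the shared page itself is never modified.
            self.private_pages[index] = bytearray(self.resource.get_page(index))
        self.private_pages[index][offset] = value

# Two hypothetical processes map the same library at different addresses.
libc = SharedResource("libc.so")
vma_a = VMA(start=0x7F0000000000, resource=libc)
vma_b = VMA(start=0x7F1230000000, resource=libc)
vma_a.write_private(0x7F0000000010, 42)  # private to the first process
```

After the write, the first VMA reads back 42 at that offset while the second VMA and the shared resource still see the original zero byte.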
In theory, Unix systems could have what I will call coherent page table mappings of resources, where if a single process brought a page of a shared resource into RAM all processes using that resource would get a page table entry for that page. In practice, this would involve a lot of page table changes for processes that don't care about the pages in question and may never refer to them, so I think that basically all Unix systems have incoherent mappings; a page of a resource may be in RAM and be mapped by other processes using it without this process having a PTE for it. When your process tries to refer to that page it will take a soft page fault to establish the PTE it needs and then immediately go on.
(The traditional distinction between soft page faults and hard page faults is that a hard page fault needs to fetch things from disk while a soft page fault just updates things in memory.)
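The incoherent mapping behaviour and the two kinds of faults can be sketched with a small simulation. This is a toy model under my own assumptions (the Resource and Process classes and the touch() method are hypothetical names), not a description of any particular kernel's fault handler.

```python
class Resource:
    """A shared resource with some of its pages resident in RAM."""
    def __init__(self):
        self.resident = set()   # page indexes currently in RAM

class Process:
    """A process with its own, possibly incomplete, page table for the resource."""
    def __init__(self, resource):
        self.resource = resource
        self.ptes = set()       # page indexes this process has PTEs for
        self.soft_faults = 0
        self.hard_faults = 0

    def touch(self, index):
        if index in self.ptes:
            return "hit"                 # PTE already present, no fault
        if index in self.resource.resident:
            self.soft_faults += 1        # page is in RAM, only the PTE is missing
            kind = "soft"
        else:
            self.hard_faults += 1        # page has to be fetched from disk first
            self.resource.resident.add(index)
            kind = "hard"
        self.ptes.add(index)             # only the faulting process gains a PTE
        return kind

shared = Resource()
p1, p2 = Process(shared), Process(shared)
# p1 brings page 7 in from disk; p2 then only needs a PTE for it.
results = [p1.touch(7), p2.touch(7), p2.touch(7)]
```

Here the first touch is a hard fault, the second process's first touch of the same page is a soft fault, and its second touch is not a fault at all.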
In turn this means that a page of a shared resource can be taken away from your process in one of two ways; call these 'local' and 'global'. A 'local' removal just removes your PTE for the page, but it stays mapped by any other processes that are using it. A 'global' removal makes the page unreferenced by anyone by removing all PTEs in all processes that were (still) using it. Normally you hope to get rid of pages by a series of local removals that result in the page not being mapped by anyone, at which point you can do a free global removal.
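As a rough sketch of the difference between the two kinds of removal, here is a toy Python model of a single page and the set of processes that hold a PTE for it. The Page class, its method names, and the process names are all hypothetical, and 'cost' here just means the number of PTEs that have to be torn down.

```python
class Page:
    """One resident page of a shared resource and who has a PTE for it."""
    def __init__(self, mappers):
        self.mappers = set(mappers)   # processes with a PTE for this page
        self.in_ram = True

    def local_remove(self, proc):
        """Drop one process's PTE; everyone else keeps theirs."""
        self.mappers.discard(proc)

    def global_remove(self):
        """Evict the page; returns how many PTEs had to be torn down."""
        torn_down = len(self.mappers)
        self.mappers.clear()
        self.in_ram = False
        return torn_down

# A forced global removal while three processes still map the page.
busy = Page({"A", "B", "C"})
forced_cost = busy.global_remove()

# A series of local removals first, so the final global removal is free.
idle = Page({"A", "B", "C"})
for proc in ("A", "B", "C"):
    idle.local_remove(proc)
free_cost = idle.global_remove()
```

The forced removal has to touch three page tables, while the removal after the local removals touches none.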
This obviously complicates efforts to answer questions about memory usage where it involves shared resources. A shared resource will be using some number of pages of RAM, but not all of those pages will be mapped in any particular process or group of processes. In fact it may not be possible to determine how many active pages a particular resource has because your system may only give you a 'per-process' view of shared resources.
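A small worked example may help show why the per-process view is misleading. The function summarize() and all of the numbers below are invented for illustration; the point is only that neither adding up per-process counts nor counting distinct mapped pages recovers how much RAM the resource is really using.

```python
def summarize(resident, per_process):
    """Compare per-process views with the resource's true RAM usage.

    resident: set of page indexes of the resource that are in RAM.
    per_process: dict of process name -> set of page indexes it has mapped.
    """
    summed = sum(len(pages) for pages in per_process.values())
    mapped_by_anyone = set().union(*per_process.values())
    return {
        "true_resident": len(resident),
        "sum_of_per_process": summed,              # double counts shared pages
        "distinct_mapped": len(mapped_by_anyone),  # misses resident but unmapped pages
    }

# Hypothetical numbers: eight pages in RAM, two processes with partial PTEs.
stats = summarize(
    resident=set(range(8)),
    per_process={"A": {0, 1, 2}, "B": {1, 2, 3, 4}},
)
```

With these made-up numbers the resource has 8 pages in RAM, the per-process counts add up to 7, and only 5 distinct pages are mapped by anyone.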