Recently I wound up reading 'Linux Memory Reporting'
(via Hacker News),
where Robert Haas talks about Linux's lack of clear reporting on
process memory use. Today I'm going to sort of answer his question
by covering what information Linux gives you about the various
sorts of memory usage that you could
be curious about. My primary focus is going to be on the numbers
that you can get with ps, top, and smem. The background information on
general Unix memory management
will be helpful.
So, what you can get:
- the total amount of virtual address space that your process currently
has allocated and mapped is the 'virtual size' of your process; ps
reports this as VSZ and top reports it as VIRT. This includes
anything the process has mapped, regardless of how it got there;
the program's code, shared libraries, (System V) shared memory areas,
mmap()'d files, mmap()'d private anonymous memory areas (which are
often used by the C library for malloc()), everything.
If your program is in a steady state but your VSZ keeps increasing,
you have some sort of allocation leak. It may not strictly be a memory
leak; you might be forgetting to unmap files or unload dynamically
loaded code or something.
(You can check at least some of this with lsof.)
- how much RAM would be immediately freed up if this process exited is
smem's USS ('unique set size') field; this counts pages of
RAM that the process is the only user of. These pages may be
private pages (pages that will never be accessible by anyone else),
or they may be shared pages that are only actively used by this
process.
(smem gets this information from the per-process smaps
proc file.)
- how much RAM your program has looked at recently (which is roughly
how much RAM it needs to be happy if it weren't sharing anything) is
the 'resident set size', reported as RSS by ps and smem
and RES by top. The resident set size doesn't care whether or
not some of that RAM is also used by other processes; each process
counts it up separately.
(In the terminology of my basic Unix memory management entry, a process's RSS is just how
many page table entries in its virtual memory areas point to
real RAM.)
Your process's RSS increases every time it looks at a new piece
of memory (and thereby establishes a page table entry for it).
It decreases as the kernel removes PTEs that haven't been used
sufficiently recently; how fast this happens depends on how much
memory pressure the overall system is under. The more memory pressure,
the more the kernel tries to steal pages from processes and decrease
their RSS.
If you have a memory leak it's routine for your RSS to stay constant
while your VSZ grows. After all, you aren't looking at that leaked
memory any more.
A large RSS on an active system (one under memory pressure) means
that your process touches a lot of memory (often rapidly) during
its operation. A growing RSS means that it is increasing the amount
of memory it touches. A constant RSS doesn't mean that the process
is touching the same memory over and over; it just means that it's
touching about the same amount of memory per unit time.
- the process's fair share of currently in-use RAM is smem's PSS
('proportional set size') field. This prorates shared pages of
RAM by charging each process for 1/Nth of the page, where N is
how many processes currently have a page table entry for the page
(the degenerate case is that you are charged the full page if you
are the only user, ie this would be counted as part of your USS).
Note that this is not how many processes have the shared resource
mapped into their address space, it is how many processes have
touched the page recently (ie, have it in their RSS). Mapping a
shared resource is free (except to your VSZ); looking at it is
what costs you here.
It follows that the more processes actively look at pages of a shared
resource, the lower each process's PSS for it goes (because the cost of
each of those pages is split among more and more processes).
(Like USS, smem gets this information from the per-process smaps
proc file.)
Because of how it's defined, summing the per-process PSS for a resource
over all of the processes using that resource will tell you how much
RAM that resource is using. Smem can do this (for some resources) with
'smem -m', although you need to know a certain amount about how Linux
gives names to various resources in order to understand smem's output
here.
(If you have all of the processes of interest running under a single
userid, you can also use 'smem -u'. Smem doesn't currently have an
option to aggregate reporting by program, so you can't do things like
see how much memory your httpd processes are collectively using.)
As far as I know, Linux has no per-process or global number for how much
of your virtual address space has ever been looked at (my second question
in the six different meanings of memory usage).
Nor can you get per-process information on how much memory the operating
system might need to provide if your process wrote to everything it was
entitled to (the sixth question), although you can get system-wide
information on committed address space.
Top reports a SHR number but it's not clear to me how useful this is,
partly because top doesn't document where it gets this information
from. If I am reading the kernel code correctly, the most likely source
is the (process) RSS for memory areas that were mmap()'d from files. I
am not sure if this includes things like System V shared memory areas,
and certainly it understates the potential sharing between, say,
fork()'d processes. This is also only potential sharing, since it
says nothing about whether or not any other process has mmap()'d the
same object.
(Ie, if your single process mmap()'s a private two gigabyte file and
then scans all of it, I believe that your SHR will be two gigabytes
and change.)
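If you want to look at the raw kernel number yourself, one place to start is /proc/<pid>/statm, which reports sizes in pages. This sketch assumes (and it is only my assumption) that top's VIRT, RES and SHR correspond to the first three statm fields; the function name is made up for illustration.

```python
import os


def statm_kib(statm_text, page_size=None):
    """Convert the first three /proc/<pid>/statm fields from pages to KiB.

    The VIRT/RES/SHR labels assume that top shows these statm fields,
    which is a guess on my part rather than something top documents.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    size, resident, shared = (int(field) for field in statm_text.split()[:3])
    kib_per_page = page_size // 1024
    return {
        "VIRT": size * kib_per_page,
        "RES": resident * kib_per_page,
        "SHR": shared * kib_per_page,
    }
```

Comparing the result for a given PID against what top shows for the same process is a quick way to check whether the assumption holds on your system.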
Sidebar: answers to Bruce Momjian's questions
From his comment on Robert Haas's entry:
"There are various methods for representing memory that is shared,
either via SysV shared memory, fork's copy-on-write, or shared
libraries. Does every process get charged the full amount, or do they
split it among themselves, e.g. if five processes use shared memory,
is each process charged 20% of the total size? (If another process
attaches, does your percentage decrease?) What happens when you map
in a large shared memory area but only access part of it? When do you
stop using that memory?"
Each process is charged the full amount to VSZ, but not to other
numbers. When you map a large area but only refer to some of it, your
VSZ goes up by the full amount but your RSS only goes up by the amount
you access (and then goes down again at some rate if you don't access it
and the system is under memory pressure). Your PSS is the only number
that goes down if other people attach to the shared resource and
also actually look at pages of that shared resource that you are also
looking at (if they attach but don't look, your PSS doesn't change). If
five processes all map the same shared memory segment but look at five
different portions of it, each of them will be charged separately for
their portion (their PSS for the segment will be the same as their USS);
if they all look at the same portion, their PSS is 1/5th of the size of
the portion.
(Your RSS never changes when people attach or detach from a shared
resource.)
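The five-process example can be worked through with a toy model of the PSS rule described above. This is only a sketch of the accounting, not of anything the kernel actually runs; the function name and the idea of representing each process's resident pages as a set of page numbers are mine.

```python
from collections import Counter
from fractions import Fraction


def pss_and_uss(touched):
    """Compute (PSS, USS) in pages for each process.

    touched maps a process name to the set of page numbers of the shared
    segment that the process currently has in its RSS. Pages that are
    mapped but never touched simply do not appear in the set.
    """
    # How many processes currently have each page resident.
    users = Counter(page for pages in touched.values() for page in pages)
    result = {}
    for name, pages in touched.items():
        # Each page is charged as 1/N, N being its number of current users.
        pss = sum(Fraction(1, users[page]) for page in pages)
        # Pages that only this process has resident count towards USS.
        uss = sum(1 for page in pages if users[page] == 1)
        result[name] = (pss, uss)
    return result
```

With five processes that each touch a different 100 pages, every process gets a PSS of 100 pages, equal to its USS. If all five touch the same 100 pages, each gets a PSS of 20 pages and a USS of zero. A sixth process that attaches but touches nothing changes neither result.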