What you can find out about the memory usage of your Linux programs
Recently I wound up reading Linux Memory Reporting
(via Hacker News),
where Robert Haas talks about Linux's lack of clear reporting on
process memory use. Today I'm going to sort of answer his question
by covering what information Linux gives you about the various
sorts of memory usage that you could
be curious about. My primary focus is going to be on the numbers
that you can get with
smem. The background information on
general Unix memory management
will be helpful.
So, what you can get:
- the total amount of virtual address space that your process currently
has allocated and mapped is the 'virtual size' of your process;
psreports this as
topreports it as
VIRT. This includes anything the process has mapped, regardless of how it got there; the program's code, shared libraries, (System V) shared memory areas,
mmap()'d private anonymous memory areas (which are often used by the C library for
If your program is in a steady state but your VSZ keeps increasing, you have some sort of allocation leak. It may not strictly be a memory leak; you might be forgetting to unmap files or unload dynamically loaded code or something.
(You can check at least some of this with
- how much RAM would be immediately freed up if this process exited
USS('unique set size') field; this counts pages of RAM that the process is the only user of. These pages may be private pages (pages that will never be accessible by anyone else), or they may be shared pages that are only actively used by this process.
smemgets this information from the per-process
- how much RAM your program has looked at recently (which is roughly
how much RAM it needs to be happy if it wasn't sharing anything) is
the 'resident set size', reported as
top. The resident set size doesn't care whether or not some of that RAM is also used by other processes; each process counts it up separately.
(In the terminology of my basic Unix memory management entry, a process's RSS is just how many page table entries in its virtual memory areas point to real RAM.)
Your process's RSS increases every time it looks at a new piece of memory (and thereby establishes a page table entry for it). It decreases as the kernel removes PTEs that haven't been used sufficiently recently; how fast this happens depends on how much memory pressure the overall system is under. The more memory pressure, the more the kernel tries to steal pages from processes and decrease their RSS.
If you have a memory leak it's routine for your RSS to stay constant while your VSZ grows. After all, you aren't looking at that leaked memory any more.
A large RSS on an active system (one under memory pressure) means that your process touches a lot of memory (often rapidly) during its operation. A growing RSS means that it is increasing the amount of memory it touches. A constant RSS doesn't mean that the process is touching the same memory over and over; it just means that it's touching about the same amount of memory per unit time.
- the process's fair share of currently in use RAM is
PSS('proportional set size') field. This prorates shared pages of RAM by charging each process for 1/Nth of the page, where N is how many processes currently have a page table entry for the page (the degenerate case is that you are charged the full page if you are the only user, ie this would be counted as part of your USS). Note that this is not how many processes have the shared resource mapped into their address space, it is how many processes have touched the page recently (ie, have it in their RSS). Mapping a shared resource is free (except to your VSZ); looking at it is what costs you here.
It follows that the more processes actively look at pages of a shared resource, the lower each of their PSS goes for it (because more and more processes map the same pages from it).
smemgets this information from the per-process
Because of how it's defined, summing the per-process PSS for a resource
over all of the processes using that resource will tell you how much
RAM that resource is using. Smem can do this (for some resources) with
smem -m', although you need to know a certain amount about how Linux
gives names to various resources in order to understand
(If you have all of the processes of interest running under a single
userid, you can also use '
smem -u'. Smem doesn't currently have an
option to aggregate reporting by program, so you can't do things like
see how much memory your
httpd processes are collectively using.)
As far as I know, Linux has no per-process or global number for how much of your virtual address size has ever been looked at (my second question in the six different meanings of memory usage). Nor can you get per-process information on how much memory the operating system might need to provide if your process wrote to everything it was entitled to (the sixth question), although you can get system-wide information on committed address space.
Top reports a
SHR number but it's not clear to me how useful this is,
top doesn't document where it gets this information
from. If I am reading the kernel code correctly, the most likely source
is the (process) RSS for memory areas that were
mmap()'d from files. I
am not sure if this includes things like System V shared memory areas,
and certainly it understates the potential sharing between, say,
fork()'d processes. This is also only potential sharing, since it
says nothing about whether or not any other process has
(Ie, if your single process
mmap()'s a private two gigabyte file and
then scans all of it, I believe that your
SHR will be two gigabytes
Sidebar: answers to Bruce Momjian's questions
There are various methods for representing memory that is shared, either via SysV shared memory, fork's copy-on-write, or shared libraries. Does every process get charged the full amount, or do they split it among themselves, e.g. if five processes use shared memory, is each process charged 20% of the total size? (If another process attaches, does your percentage decrease?) What happens when you map in a large shared memory area but only access part of it? When do you stop using that memory?
Each process is charged the full amount to VSZ, but not to other numbers. When you map a large area but only refer to some of it, your VSZ goes up by the full amount but your RSS only goes up by the amount you access (and then goes down again at some rate if you don't access it and the system is under memory pressure). Your PSS is the only number that goes down if other people attach to the shared resource and also actually look at pages of that shared resource that you are also looking at (if they attach but don't look, your PSS doesn't change). If five processes all map the same shared memory segment but look at five different portions of it, each of them will be charged separately for their portion (their PSS for the segment will be the same as their USS); if they all look at the same portion, their PSS is 1/5th of the size of the portion.
(Your RSS never changes when people attach or detach from a shared resource.)