What I know about process virtual size versus RSS on Linux
Up until very recently, I would have confidently told you that a Linux
process's 'virtual size' was always at least as large as its resident
set size. After all, how could it be otherwise? Your 'virtual size' was
the total amount of mapped address space you had, the resident set size
was how many pages you had in memory, and you could hardly have pages in
memory without having them as part of your mapped address space. As
Julia Evans has discovered, this is apparently not the case; in top
terminology, it's possible to have processes with RES (ie RSS) and SHR
that are larger than VIRT. So here is what I know about this.
To start with, top extracts this information from /proc/PID/statm, and
this information is the same as what you can find as VmSize and VmRSS
in /proc/PID/status. Top doesn't manipulate or postprocess these
numbers (apart from converting them all from pages to Kb or other size
units), so what you see it display is a faithful reproduction of what
the kernel is actually reporting.
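As a rough illustration of that conversion, here is a minimal Python sketch that turns the first three statm fields (size, resident, shared, all in pages) into the same units that status uses. The function names are mine and purely illustrative; this is not how top itself is implemented.

```python
import mmap
import os

# Page size in KiB (commonly 4 on x86-64); statm counts in pages.
PAGE_KIB = mmap.PAGESIZE // 1024


def parse_statm(text, page_kib=PAGE_KIB):
    """Return (virt, res, shr) in KiB from the contents of /proc/PID/statm."""
    size, resident, shared = (int(f) for f in text.split()[:3])
    return size * page_kib, resident * page_kib, shared * page_kib


def parse_status(text):
    """Return the Vm* and Rss* fields of /proc/PID/status as {name: kB}."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith(("Vm", "Rss")):
            fields[key] = int(rest.split()[0])
    return fields


def read_proc(pid, name):
    """Read one file under /proc/PID (pid may be the string 'self')."""
    with open(f"/proc/{pid}/{name}") as f:
        return f.read()


if __name__ == "__main__" and os.path.exists("/proc/self/statm"):
    virt, res, shr = parse_statm(read_proc("self", "statm"))
    status = parse_status(read_proc("self", "status"))
    print("statm : VIRT", virt, "RES", res, "SHR", shr)
    print("status: VmSize", status.get("VmSize"), "VmRSS", status.get("VmRSS"))
```

The two files are read at slightly different moments, so small differences between the two lines of output are not by themselves a sign of anything odd.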
However, these two groups of numbers are maintained by different
subsystems in the kernel's memory management system; there is nothing
that directly ties them together or forces them to always be in
sync. VmSize, VmPeak, VmData, and several other numbers come from
per-mm_struct counters such as mm->total_vm; per Rick Branson these
numbers are mostly maintained through vm_stat_account in mm/mmap.c.
These numbers change when you make system calls like mmap() and
mremap() (or when the kernel does similar things internally).
Meanwhile, VmRSS, VmSwap, top's SHR, and RssAnon, RssFile, and
RssShmem all come from page tracking, which mostly involves calling
things like inc_mm_counter and add_mm_counter in places like
mm/memory.c; these numbers change when pages are materialized and
de-materialized in various ways.
(You can see where all of the memory stats in status come from in
task_mem in fs/proc/task_mmu.c.)
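The split between the two sets of counters can be watched from user space. The following is a small Python sketch, assuming a Linux system with /proc mounted; the helper names vm_snapshot and demo are mine. It maps an anonymous region, which should show up in VmSize right away, and then writes to each page, which is when the pages should start counting toward VmRSS.

```python
import mmap


def vm_snapshot():
    """Return (VmSize, VmRSS) in kB for the current process."""
    values = {}
    with open("/proc/self/status") as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in ("VmSize", "VmRSS"):
                values[key] = int(rest.split()[0])
    return values["VmSize"], values["VmRSS"]


def demo(size=32 * 1024 * 1024):
    """Take snapshots before mmap(), after mmap(), and after touching pages."""
    before = vm_snapshot()
    region = mmap.mmap(-1, size)      # anonymous mapping: address space only
    mapped = vm_snapshot()
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1            # write one byte per page to fault it in
    touched = vm_snapshot()
    region.close()
    return before, mapped, touched


if __name__ == "__main__":
    for label, (size_kb, rss_kb) in zip(("before", "mapped", "touched"), demo()):
        print(f"{label:8} VmSize={size_kb} kB VmRSS={rss_kb} kB")
```

In the ordinary case the VmSize jump comes at the mmap() call and the VmRSS jump only after the writes, which is the normal relationship between the two; it does not reproduce the odd RSS-larger-than-VIRT situation.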
I don't have anywhere near enough knowledge about the Linux kernel memory system to know if there's any way for a process to acquire a page through a path where it isn't accounted for in VmSize. One would think not, but clearly something funny is going on. On the other hand, this doesn't appear to be a common thing, because I wrote a simple brute-force checker script that compared every process's VmSize to its VmRSS, and I couldn't find any such odd process on any of our systems (a mixture of Ubuntu 12.04, 14.04, and 16.04, Fedora 25, and CentOS 6 and 7). It's quite possible that this requires a very unusual setup; Julia Evans' case is (or was) an active Chrome process and Chrome is known to play all sorts of weird games with its collection of processes that very few other programs do.
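For anyone who wants to run a similar check, here is a minimal Python sketch of that kind of brute-force comparison. It is a reconstruction of the idea, not the script I actually used, and the function names are illustrative. Kernel threads have no Vm* lines in status and processes can exit mid-scan, so both cases are silently skipped.

```python
import os


def vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/PID/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])
    return fields


def rss_exceeds_size(fields):
    """True if both values are present and VmRSS is larger than VmSize."""
    return ("VmSize" in fields and "VmRSS" in fields
            and fields["VmRSS"] > fields["VmSize"])


def scan(proc_root="/proc"):
    """Return [(pid, VmSize, VmRSS)] for every process with VmRSS > VmSize."""
    odd = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return odd                      # no /proc here; nothing to report
    for entry in entries:
        if not entry.isdigit():
            continue                    # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                fields = vm_fields(f.read())
        except OSError:
            continue                    # process exited or is unreadable
        if rss_exceeds_size(fields):
            odd.append((int(entry), fields["VmSize"], fields["VmRSS"]))
    return odd


if __name__ == "__main__":
    for pid, size_kb, rss_kb in scan():
        print(f"pid {pid}: VmSize {size_kb} kB < VmRSS {rss_kb} kB")
```

Run as root if you want to be sure of seeing every process; an empty output means no process was caught with VmRSS above VmSize at the moment of the scan.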
(If you find such a case it would be quite interesting to collect
/proc/PID/smaps, which might show which specific mappings are doing
this.)
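One way to dig through smaps is to pull out the Size and Rss lines of each mapping and add them up. The Python sketch below does that, assuming the usual smaps layout of one header line per mapping followed by "Key: value kB" lines; the function names are mine, and whether the totals will actually expose the discrepancy is an open question.

```python
import os
import re

# Header line: address range, perms, offset, device, inode, optional name.
HEADER = re.compile(r"^([0-9a-f]+-[0-9a-f]+)\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")


def parse_smaps(text):
    """Return one dict per mapping with its range, name, Size and Rss (kB)."""
    mappings = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {"range": match.group(1), "name": match.group(2).strip(),
                       "Size": 0, "Rss": 0}
            mappings.append(current)
            continue
        key, _, rest = line.partition(":")
        if current is not None and key in ("Size", "Rss"):
            current[key] = int(rest.split()[0])
    return mappings


def totals(mappings):
    """Return (sum of Size, sum of Rss) in kB across all mappings."""
    return (sum(m["Size"] for m in mappings),
            sum(m["Rss"] for m in mappings))


if __name__ == "__main__" and os.path.exists("/proc/self/smaps"):
    with open("/proc/self/smaps") as f:
        maps = parse_smaps(f.read())
    size_kb, rss_kb = totals(maps)
    print(f"total Size {size_kb} kB, total Rss {rss_kb} kB")
    # Show the mappings with the most resident memory first.
    for m in sorted(maps, key=lambda m: m["Rss"], reverse=True)[:5]:
        print(f'{m["Rss"]:>8} / {m["Size"]:>8} kB  {m["range"]}  {m["name"]}')
```

Comparing these totals against VmSize and VmRSS from status for the same process would at least show whether the extra resident pages belong to any mapping that smaps lists.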
PS: The one area of this that makes me wonder is how RSS is tracked
over fork(), because there seem to be at least some oddities there. Or
perhaps the child does not get PTEs and thus RSS for the mappings
it shares with the parent until it touches them in some way.