2013-10-03
What is in /proc/self/mountstats for NFS mounts: an introduction
As I discovered recently, for several years
Linux kernels have made a huge amount of per-mount NFS performance
statistics visible in /proc/self/mountstats. Unfortunately none of
this is documented. Because I have a use for the information and I'm a
glutton for punishment, I'm going to write up what I've found out.
Mountstats contains so much information that this is going to take
several entries.
To start with, let's talk about the overall format of a filesystem entry. In a relatively recent kernel, this looks like the following for an NFS v3 mount over TCP (probably a common case):
    device fs1:/cs/mail mounted on /var/mail with fstype nfs statvers=1.0
            opts:   rw,[... many ...]
            age:    11343100
            caps:   caps=0x3fc7,wtmult=512,dtsize=8192,bsize=0,namlen=255
            sec:    flavor=1,pseudoflavor=1
            events: [... numbers ...]
            bytes:  [... numbers ...]
            RPC iostats version: 1.0  p/v: 100003/3 (nfs)
            xprt:   tcp [... numbers ...]
            per-op statistics
                    NULL: 0 0 0 0 0 0 0 0
                    [more operations]
(Some kernel configurations may produce an additional line with
'fsc:'. NFS v4 mounts will have a bunch of additional information
that I haven't looked into because we don't have any.)
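To make the layout concrete, here is a minimal Python sketch of how one might split mountstats text into per-mount entries. The names (parse_mountstats, DEVICE_RE) are mine, and it assumes only what the sample above shows: every entry starts with a 'device' line, labelled lines have the form 'label: value', and the lines after 'per-op statistics' are 'OPNAME: numbers'. It has not been checked against every kernel version.

```python
import re

# Matches the first line of an entry. 'statvers=' is optional in this
# sketch because I assume non-NFS entries may not carry it.
DEVICE_RE = re.compile(
    r"^device (?P<device>\S+) mounted on (?P<mountpoint>\S+) "
    r"with fstype (?P<fstype>\S+)(?: statvers=(?P<statvers>\S+))?$"
)


def parse_mountstats(text):
    """Return a list of dicts, one per 'device ...' entry in text."""
    entries = []
    current = None
    in_ops = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = DEVICE_RE.match(line)
        if m:
            # Start of a new entry.
            current = dict(m.groupdict(), fields={}, ops={})
            entries.append(current)
            in_ops = False
            continue
        if current is None:
            continue
        if line == "per-op statistics":
            in_ops = True
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        if in_ops:
            # Per-operation lines are kept as lists of integers.
            current["ops"][key.strip()] = [int(n) for n in value.split()]
        else:
            # Everything else is kept as an uninterpreted string.
            current["fields"][key.strip()] = value.strip()
    return entries
```

On a Linux machine you would feed it the contents of /proc/self/mountstats and then pick out the entries whose fstype is nfs. Deliberately, it does not try to interpret the numbers; that is what the later entries are for.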
The entry ends with a blank line (not shown). There are no fewer than
four different sets of statistics in this: 'events:' for high-level NFS
events, 'bytes:' for actual data being read and written, 'xprt:' for
low-level NFS RPC activity, and then relatively detailed statistics for
each NFS protocol operation. If you're hunting performance issues you
may wind up looking at all of them. As is usual for kernel stats, all
of the numbers are 'from the beginning of time' ones and just count up
as things happen; if you want to get per-second statistics or the like
you need to read the file more than once then work out the difference
between your two readings.
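As an illustration of the 'read it twice and subtract' approach, here is a small Python sketch. The function names are hypothetical, and it assumes the line you ask for consists purely of integers after the label, which is true of the 'bytes:' line but not of 'xprt:' (that one starts with the protocol name).

```python
import time


def read_counters(text, mountpoint, label):
    """Return the integers on the 'label:' line of the entry for mountpoint.

    Returns None if the mount point or the label is not found.
    """
    in_entry = False
    for line in text.splitlines():
        words = line.split()
        if words[:1] == ["device"]:
            # 'device X mounted on MOUNTPOINT with fstype ...'
            in_entry = len(words) > 4 and words[4] == mountpoint
        elif in_entry and words and words[0] == label + ":":
            return [int(w) for w in words[1:]]
    return None


def counter_rates(before, after, interval_seconds):
    """Per-second rates from two readings of the same counters."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    if len(before) != len(after):
        raise ValueError("readings have different numbers of counters")
    return [(b - a) / interval_seconds for a, b in zip(before, after)]


def sample_rates(mountpoint, label="bytes", interval=10.0,
                 path="/proc/self/mountstats"):
    """Read the file twice, 'interval' seconds apart, and return rates."""
    with open(path) as f:
        first = read_counters(f.read(), mountpoint, label)
    start = time.monotonic()
    time.sleep(interval)
    with open(path) as f:
        second = read_counters(f.read(), mountpoint, label)
    elapsed = time.monotonic() - start
    if first is None or second is None:
        return None
    return counter_rates(first, second, elapsed)
```

Something like sample_rates("/var/mail") would then give per-second figures for each 'bytes:' counter over a ten second window. Measuring the elapsed time instead of trusting the sleep interval is a small design choice to keep the rates honest on a busy machine.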
Describing most of the fields in mountstats is sufficiently
complicated that it needs separate entries: 'bytes:' and high-level
NFS 'events:', 'xprt:' NFS RPC information, and the per-operation
statistics.
Update (2018): See also "xprt: data is per-fileserver, not per-mount"
for an important update about NFS RPC information.
The 'age:' field is how many seconds this particular NFS mount has been
in existence. You can use this to compute overall per-interval stats
from all of the counters if you're so inclined, although I don't usually
find overall stats very useful (since our activity is invariably very
bursty).
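If you do want the lifetime averages, the arithmetic is just each counter divided by the age. A tiny Python sketch, with function names that are my own invention:

```python
def lifetime_rates(counters, age_seconds):
    """Average per-second rate of each counter over the mount's lifetime."""
    if age_seconds <= 0:
        raise ValueError("age must be positive")
    return [c / age_seconds for c in counters]


def age_in_days(age_seconds):
    """Convert the 'age:' value (seconds) to days."""
    return age_seconds / 86400.0
```

For the sample entry above, an age of 11343100 seconds works out to a bit over 131 days, which is a good illustration of why these averages flatten out any burstiness.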
There are currently a number of programs that try to do something with this firehose of information. Unfortunately I have yet to stumble over one that gives what I consider useful reports.
PS: You might be wondering why this is /proc/<PID>/mountstats
instead of a general file in /proc or whatever. My understanding
is that the reason for this is that modern Linux systems can have
multiple filesystem namespaces and hence you have to talk about a
specific process's view of mounts. It's traditional to use
/proc/self/mountstats because you can always read your own version
of it.
Sidebar: where in the kernel this stuff lives
Since there is no documentation, understanding what is being reported
here requires reading the kernel source. In current 3.12.0-rc3 code,
the overall report is produced by nfs_show_stats() in
fs/nfs/super.c. This is responsible for 'events:' and 'bytes:';
the information that appears in them is discussed in the comments
for include/linux/nfs_iostat.h.
The RPC information is produced by rpc_print_iostats() in
net/sunrpc/stats.c. The 'xprt:' line is produced by several
different functions in net/sunrpc/xprtsock.c (which one is used
depends on the connection protocol) using information in a structure
that is described in include/linux/sunrpc/xprt.h (note that these
struct fields are not printed in order, so you really do need to
read the code). The per-op statistics are produced using information
described in include/linux/sunrpc/metrics.h but again you'll need
to read the source for the order and details.