The pragmatic effects of setting nconnect on NFS v3 mounts on Linux
After I wrote about how Linux NFS clients normally only make one
TCP connection to a given fileserver no matter how many NFS mounts
they have, people pointed out the 'nconnect' mount option for NFS
mounts, as documented in nfs(5). Naturally I wondered what the
effects of setting this above one are (so that in theory one or
more mounts use multiple TCP connections), and conveniently I have
an environment where I can test this.
Suppose that you have an NFS fileserver that you mount a bunch of
filesystems from, and you set 'nconnect=2' on all of those NFS
mounts. At the level of TCP connections, what you wind up with is
two TCP connections to the fileserver instead of one, each with its
own local port. If you then set nconnect to 3 for a few of those
filesystems but not all of them, you'll get a surprise: there are
still only two connections, and /proc/mounts will say that those
filesystems are NFS mounted with 'nconnect=2' despite your
providing 'nconnect=3' when they were mounted. In practice, all
mounts from a given fileserver use the lowest nconnect setting
among all such mounts, with one exception.
The exception is the case of 'nconnect=1', including not setting
nconnect at all; this is bumped up to whatever the lowest explicit
nconnect setting of 2 or more is. So explicitly setting nconnect
above one for a single mount from a fileserver also implicitly sets
it for all other mounts from that fileserver. All of this is
perhaps not surprising, since it's clear that the Linux kernel
NFS client likes to share TCP connections across mounts, and it
would cause a bunch of complexity if different mounts could be
using different numbers of connections.
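To make this concrete, here is a hypothetical set of fstab entries
(the fileserver name and exports are invented for illustration).
Based on the behavior above, all three mounts will wind up sharing
two TCP connections to 'fs1', with the 'nconnect=3' mount quietly
reduced to two connections and the mount that didn't set nconnect
at all quietly bumped up to them:

fs1:/export/a   /mnt/a   nfs   vers=3,nconnect=2   0 0
fs1:/export/b   /mnt/b   nfs   vers=3,nconnect=3   0 0
fs1:/export/c   /mnt/c   nfs   vers=3              0 0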
(This means that you can easily create a persistent difference
between the NFS mount parameters you provided in your 'mount'
command line or fstab options and the actual NFS mount options used
and reported in /proc/mounts. If you have an automated system that
attempts to keep these in sync (as we do in our automounter
replacement), it will be unhappy with you.)
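(If you want to check what you actually wound up with, here is a
minimal sketch in Python that reports the nconnect option recorded
in /proc/mounts for each NFS mount; it's an illustration rather
than our actual tooling, and mounts that never wound up with an
explicit nconnect may simply not list one.)

#!/usr/bin/python3
# Minimal sketch: show the nconnect setting recorded in /proc/mounts
# for each NFS mount, which is the setting actually in effect.
import re

with open("/proc/mounts") as mounts:
    for line in mounts:
        device, mountpoint, fstype, options = line.split()[:4]
        if not fstype.startswith("nfs"):
            continue
        match = re.search(r"nconnect=(\d+)", options)
        nconnect = match.group(1) if match else "(not listed)"
        print(f"{mountpoint} from {device}: nconnect={nconnect}")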
If you look at /proc/self/mountstats for such mounts, they will
have multiple xprt: lines (which are per connection, not per
mount), one for each TCP connection that they're actually using.
You can tell the xprt: lines apart because they include the port
number (as the first numeric field). This gives you a result that
looks like this:
xprt: tcp 936 1 2 0 0 1011680 1011641 2 26304686 0 18 17214 373657
xprt: tcp 794 1 2 0 0 1014491 1014455 1 24511646 0 19 17574 370354
(This is 'RPC iostats version: 1.1' from an Ubuntu 22.04 machine,
with the three new fields at the end.)
As you can tell by how close the numbers are, the kernel
multiplexes NFS RPCs across both connections more or less evenly
under normal circumstances. One unfortunate limitation of the
xprt: data here is that while the kernel tells you RPC counts, it
doesn't tell you byte counts. To get that information you'll need
to look at TCP statistics for each connection with, for example,
the ss socket stats program. In my brief testing, the bytes sent
and received at the TCP level are within the same range as each
other, but not necessarily as even as the RPC numbers might
suggest. For the two connections above, the relevant TCP level
stats on the NFS client are:
tcp 936: bytes_acked:1716785029 bytes_received:415745308
tcp 794: bytes_acked:1657621341 bytes_received:426857280
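(If you want to pull these numbers out yourself, something like
'ss -tin sport = :936' should show the bytes_acked and
bytes_received counters for the connection using local port 936,
although you may need to adjust the filter for your version of ss.)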
Things that parse mountstats and want to be fully correct will
need to cope with multiple xprt: lines for a given mount, probably
by aggregating the data together.
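As an illustration of what that might look like, here is a sketch
in Python that totals up the RPC send and receive counts across all
of a mount's xprt: lines. The field positions are my reading of the
'tcp' xprt: format shown above (port, bind and connect counts,
connect and idle times, then sends and receives), so treat that as
an assumption rather than a specification.

#!/usr/bin/python3
# Illustrative only: total up per-connection RPC sends and receives
# for each NFS mount in /proc/self/mountstats, so that mounts using
# nconnect > 1 are reported as a single aggregate.
from collections import defaultdict

totals = defaultdict(lambda: [0, 0])   # mount point -> [sends, receives]
current = None
with open("/proc/self/mountstats") as stats:
    for line in stats:
        words = line.split()
        if line.startswith("device "):
            # "device SERVER:/export mounted on /MNT with fstype nfs ..."
            current = words[4] if "with fstype nfs" in line else None
        elif current and words[:2] == ["xprt:", "tcp"]:
            # words[2] is the local port; the send and receive counts
            # come after the bind/connect counters and the times.
            totals[current][0] += int(words[7])
            totals[current][1] += int(words[8])

for mnt, (sends, recvs) in sorted(totals.items()):
    print(f"{mnt}: {sends} RPC sends, {recvs} RPC receives")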
If you're using some sort of link aggregation that steers
different TCP flows through different physical network paths, you
may need to dig down into the TCP connection level stats to
troubleshoot some problems, since mountstats by itself won't give
you enough information to see that, say, one connection is
transmitting or receiving significantly more bytes than another.