Wandering Thoughts archives

2024-02-06

What the max_connect Linux NFS v4 mount parameter seems to do

Suppose, not hypothetically, that you've converted your fleet from using NFS v3 to using basic Unix security NFS v4 mounts when they mount their hordes of NFS filesystems from your NFS fileservers. When your NFS clients boot or at some other times, you notice that you're getting a bunch of copies of a new kernel message:

SUNRPC: reached max allowed number (1) did not add transport to server: <IP address>

Modern NFS uses TCP, which means that the NFS client needs to make some number of TCP connections to each NFS server. In NFS v3, Linux normally only makes one connection to each server. The same is sort of true in NFS v4 as well, but NFS v4 is more complex about what is 'a server'. In NFS v3, servers are identified by at least their IP address (and perhaps their name; I'm not sure if two different names that map to the same IP will share the same connection). In NFS v4.1+, servers have some sort of intrinsic identity that is visible to clients even if you're talking to them by multiple IP addresses.

This new 'reached max allowed number (<N>) did not add transport to server' kernel message is reporting exactly this case. You (we) have a single NFS server that for historical reasons has two different IPs, one for most of its filesystems and one for our central administrative filesystem, and now NFS v4 considers these the 'same' server and won't make an extra connection to the second IP.
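If you want to see which server IPs are being refused and how often, one option is to pull the message out of your kernel logs. What follows is a minimal sketch in Python that assumes the message has exactly the wording quoted above; the function name, the sample log lines, their timestamp format, and the 192.0.2.x addresses are all made up for illustration.

```python
import re

# Matches the kernel message text quoted above. The limit and the server
# address are captured; anything before the message (timestamps) is ignored.
_MSG = re.compile(
    r"SUNRPC: reached max allowed number \((\d+)\) "
    r"did not add transport to server: (\S+)"
)

def refused_transports(log_text):
    """Return {server_address: (limit, count)} for every matching log line."""
    seen = {}
    for line in log_text.splitlines():
        m = _MSG.search(line)
        if m is None:
            continue
        limit, addr = int(m.group(1)), m.group(2)
        _, count = seen.get(addr, (limit, 0))
        # Keep the most recently reported limit and bump the count.
        seen[addr] = (limit, count + 1)
    return seen

# Hypothetical sample input; real logs will look different around the message.
sample = """\
[   42.1] SUNRPC: reached max allowed number (1) did not add transport to server: 192.0.2.11
[   42.3] SUNRPC: reached max allowed number (1) did not add transport to server: 192.0.2.11
[   43.0] some unrelated kernel message
"""
print(refused_transports(sample))
```

The number in parentheses is the limit the client was operating under, so seeing (1) here means that only one server IP was allowed.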

You might wonder if you can change this, and the answer is that you can, but it gets complex and I'm not quite sure how it all works to distribute the actual NFS traffic. There appear to be two interlinked things that you can control: how many connections an NFS v4 client will make to a single NFS server, and how many different IPs of the server that NFS v4 client will connect to. How many connections NFS v4 will make to a single server is mostly controlled by nfs(5)'s nconnect setting, sort of like nconnect's behavior with NFS v3. How many separate server IPs NFS v4 will make connections to is controlled by 'max_connect'. Both of these default to 1. However, how they interact is confusing and I'm not sure I fully understand it.

The easy case is not setting nconnect and setting max_connect to at least as many different IP aliases as you have for each fileserver. In this case you'll get one TCP connection per server IP (although don't ask me what traffic flows over what connection). If you set nconnect without max_connect, you'll get that many connections to the first IP address of each server (well, the first IP address that the client finds), assuming that you mount at least that many NFS filesystems from that server.
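One way to check how many connections you actually wound up with is to count established TCP connections to the NFS port per server IP. Here is a rough sketch that parses text shaped like the output of 'ss -tn'. It assumes the usual column order (state, two queue columns, local address:port, peer address:port) and the standard NFS port 2049; the function name and the sample output are invented for illustration.

```python
from collections import Counter

NFS_PORT = "2049"  # assumed: the standard NFS port

def nfs_connections_per_ip(ss_output):
    """Count established TCP connections to the NFS port per peer address.

    Expects text shaped like `ss -tn` output, where the fifth
    whitespace-separated column is the peer address:port.
    """
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not an established connection.
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        # Split on the last colon so that addresses containing colons survive.
        addr, _, port = fields[4].rpartition(":")
        if port == NFS_PORT:
            counts[addr] += 1
    return dict(counts)

# Hypothetical sample output with documentation-range addresses.
sample_ss = """\
State  Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB  0       0       192.0.2.50:790       192.0.2.10:2049
ESTAB  0       0       192.0.2.50:912       192.0.2.11:2049
ESTAB  0       0       192.0.2.50:40022     192.0.2.99:22
"""
print(nfs_connections_per_ip(sample_ss))
```

In the easy case described above, you would expect to see a count of one for each of the server's IPs.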

However, if you set both nconnect and max_connect, what seems to happen (on Ubuntu 22.04) is that you get nconnect TCP connections to each server's first (encountered) IP address, and then one TCP connection to every other IP address (up to the max_connect limit). This is why I described 'nconnect' as controlling how many connections NFS v4 would make to a single server, instead of a single server IP (or name). It would be a bit more useful if you could set nconnect on a per-IP (or name) basis in NFS v4, or otherwise make it so that the first IP didn't get all of the connections.
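To make the combinations concrete, here is a toy model in Python of the connection counts described above. It only restates what I observed on Ubuntu 22.04; it is not derived from the kernel code, the function name is made up, the addresses are documentation placeholders, and it ignores the caveat about needing to mount enough filesystems.

```python
def expected_connections(server_ips, nconnect=1, max_connect=1):
    """Model the observed per-IP TCP connection counts for one NFS v4 server.

    server_ips is the list of the server's IPs in the order the client
    encounters them. Both options default to 1, as in nfs(5).
    """
    if not server_ips:
        return {}
    # IPs beyond the max_connect limit get no transport at all.
    used = server_ips[:max_connect]
    # The first (encountered) IP gets all of the nconnect connections.
    counts = {used[0]: nconnect}
    # Every further IP gets a single connection.
    for ip in used[1:]:
        counts[ip] = 1
    return counts

ips = ["192.0.2.10", "192.0.2.11"]
print(expected_connections(ips))                             # defaults
print(expected_connections(ips, max_connect=2))              # the easy case
print(expected_connections(ips, nconnect=4))                 # nconnect only
print(expected_connections(ips, nconnect=4, max_connect=2))  # both set
```

With both options set, the model gives four connections to the first IP and only one to the second, which is the lopsided result I was complaining about.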

(This is apparently called 'trunking' in NFS v4, per RFC 5661 section 2.10.5.)

NFSv4MaxConnectEffects written at 22:49:05;

