How to see and flush the Linux kernel NFS server's group membership cache
One of the long-standing limits of NFS v3 is that the protocol only carries up to 16 groups per request. To get around this and properly support people in more than 16 groups, various Unixes have various fixes. Linux has supported this for many years (since at least 2011) if you run rpc.mountd with -g, aka --manage-gids. If you do use this option, well, I'll just quote the rpc.mountd manpage:
Accept requests from the kernel to map user id numbers into lists of group id numbers for use in access control. [...] If you use the -g flag, then the list of group ids received from the client will be replaced by a list of group ids determined by an appropriate lookup on the server. Note that the 'primary' group id is not affected so a newgroup command on the client will still be effective. [...]
As this mentions, the 'appropriate lookup' is performed by rpc.mountd when the kernel asks it to do one. As you'd expect, rpc.mountd uses whatever normal group membership lookup methods are configured on the NFS server in nsswitch.conf (it just calls getpwuid(3) and getgrouplist(3) in mountd/cache.c).
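As a rough sketch of what that lookup amounts to, you can reproduce it by hand on the server with id(1), which consults the same NSS sources that nsswitch.conf configures (the choice of 'root' here is purely an example account; the real lookups are for whatever UID a client request arrives with):

```shell
# Roughly what rpc.mountd's server-side lookup produces for one user,
# printed in the same 'uid count: gids' shape the kernel cache uses.
# 'root' is just a stand-in example account.
user=root
uid=$(id -u "$user")
gids=$(id -G "$user")
echo "$uid $(echo "$gids" | wc -w): $gids"
```

The important point is that the group list comes entirely from the server's own databases, not from anything the client sent.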
As you might expect, the kernel maintains a cache of this group membership information so that it doesn't have to flood rpc.mountd with lookups of the same information (and slow down handling NFS requests as it waits for answers), much like it maintains a client authentication cache. The group membership cache is handled with the same general mechanisms as the client authentication cache, which are sort of covered in the nfsd(7) manpage. The group cache's various control files are found in /proc/net/rpc/auth.unix.gid, and they work the same as those of auth.unix.ip.
There is a content file that lets you see the currently cached data, which comes in the form:

#uid cnt: gids...
915 11: 125 832 930 1010 1615 30062 30069 30151 30216 31061 31091
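To make that format concrete, here is a small sketch that picks apart an entry in the 'uid count: gids' form, using the sample line above as literal data:

```shell
# Parse one cached entry: the first field is the UID, the second is
# the group count (with a trailing colon), and the rest are the GIDs.
line='915 11: 125 832 930 1010 1615 30062 30069 30151 30216 31061 31091'
set -- $line
uid=$1
cnt=${2%:}
shift 2
echo "uid $uid is in $cnt groups, starting with gid $1"
# -> uid 915 is in 11 groups, starting with gid 125
```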
Occasionally you may see an entry like '123 0:'. I believe that this is generally an NFS request from a UID that wasn't known on the NFS fileserver; since it wasn't known, it has no local groups, and so rpc.mountd reported to the kernel that it's in no groups.
All entries have a TTL, which is unfortunately not reported in the content pseudo-file; rpc.mountd uses its standard TTL of 30 minutes when adding entries and then they count down from there, with the practical effect that anything you see will expire at some unpredictable time within the next 30 minutes. You can flush all entries by writing a future time in Unix seconds to the flush file. For example:
date -d tomorrow +%s >auth.unix.gid/flush
This may be useful if you have added someone to a group, propagated the group update to your Linux NFS servers, and want them to immediately have NFS client access to files that are group-restricted to that group.
On sufficiently modern kernels, this behavior has been loosened (for the flush files of all caches) so that writing any number at all to flush will flush the entire cache. This change was introduced in early 2018 by Neil Brown, in this commit.
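In other words, on those kernels the dance with a future timestamp becomes unnecessary. A sketch, with a temporary file standing in for the real flush pseudo-file (writing the actual /proc/net/rpc/auth.unix.gid/flush requires root on an NFS server):

```shell
# On 4.17+ kernels, writing any number at all to flush drops the
# whole cache; the value no longer has to be a future timestamp.
# A temp file stands in for /proc/net/rpc/auth.unix.gid/flush here.
flush=$(mktemp)
echo 1 >"$flush"    # on a real server: echo 1 >/proc/net/rpc/auth.unix.gid/flush
cat "$flush"        # -> 1
rm -f "$flush"
```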
Based on its position in the history of the kernel tree, I believe
that this was first present in 4.17.0 (which unfortunately means
that it's a bit too late to be in our Ubuntu 18.04 NFS fileservers).
Presumably there is a size limit on how large the kernel's group
cache can be, but I don't know what it is. At the moment, there are
just over 550 entries in content
on our most broadly used Linux
NFS fileserver (it holds /var/mail
, so a lot of people access
things from it).
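Counting cached entries is just counting the non-comment lines of content. A sketch on made-up sample data (on a real fileserver you would read /proc/net/rpc/auth.unix.gid/content directly, and the second data line here is invented for illustration):

```shell
# Each non-comment line in the content pseudo-file is one uid's cached
# group list, so counting them gives the current cache population.
sample='#uid cnt: gids...
915 11: 125 832 930 1010 1615 30062 30069 30151 30216 31061 31091
1001 2: 100 4242'
printf '%s\n' "$sample" | grep -c -v '^#'    # -> 2
```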