Taking advantage of the Linux kernel NFS server's group membership cache
Yesterday I wrote about looking at and flushing the NFS server's
group membership cache, whose current
contents are visible in /proc/net/rpc/auth.unix.gid/content. At
the time I was simply thinking about how to manage it, but afterward
it struck me that since it can get reasonably large, the group
membership cache will tell you some potentially quite valuable
information. Specifically, the group membership cache will often
tell you who has used your NFS server recently.
Every time an NFS(v3) request comes in from an NFS client, the kernel
needs to know the group membership of the request's UID, which means
that the request's UID will acquire an entry in auth.unix.gid.
As I've seen, this happens even for
UIDs that don't exist locally and so have no group membership; these
UIDs get entries of the form '123 0:', instead of the regular
group count and group list. Meanwhile, UIDs that have not recently
made a request to your NFS server will have their auth.unix.gid
entry expire out after no more than 30 minutes from the last use.
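To make the entry format concrete, here is a minimal Python sketch that turns the text of auth.unix.gid/content into a UID to GID list mapping. It assumes each entry line is a UID, a group count, a colon, and then the GIDs (with '123 0:' style lines for UIDs that have no groups), and that any line starting with '#' is a header; the function names are my own, not anything the kernel provides.

```python
from pathlib import Path

# Assumed location of the cache contents on the NFS server.
GID_CACHE_CONTENT = Path("/proc/net/rpc/auth.unix.gid/content")


def parse_gid_cache(text):
    """Map each UID in the cache dump to its list of GIDs.

    Assumes lines look like '1000 3: 100 200 300', or '123 0:' for a
    UID with no local group membership; '#' lines are treated as headers.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        head, _, gid_part = line.partition(":")
        fields = head.split()
        if not fields or not fields[0].isdigit():
            continue  # skip anything that doesn't match the assumed layout
        entries[int(fields[0])] = [int(g) for g in gid_part.split() if g.isdigit()]
    return entries


def uids_without_groups(entries):
    """UIDs cached with an empty group list (the '123 0:' form)."""
    return sorted(uid for uid, gids in entries.items() if not gids)
```

On the server itself you would call parse_gid_cache(GID_CACHE_CONTENT.read_text()), most likely as root; the keys of the result are the UIDs that have made NFS requests recently.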
If you just look at
auth.unix.gid/content in normal operation,
you're not quite guaranteed to see every recent user of your NFS
server; it could be that some active UID has just hit its 30 minute
expiry and is in the process of being refreshed. If you want to be
sure you know who's using your NFS server, you can flush the group
membership cache, wait an appropriate amount of time (less than 30
minutes), and look; since you flushed the cache, you know that no
current entry is old enough to expire on you in this way.
(As you'd expect and want for an authentication cache, entries always expire 30 minutes from when they're added, regardless of whether or not they're still being used.)
Flushing the cache is also one way to see who's using your NFS server over a short timespan. If you flush the cache, wait 30 seconds, and look at the contents, you have a list of all of the UIDs that made NFS requests in the last 30 seconds. If you think you have a user who's hammering away on your NFS server but you're not sure who, this could give you valuable clues. I suspect that we're going to wind up using this at some point.
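The flush, wait, and look procedure could be sketched in Python roughly like this. It rests on assumptions: that the cache lives in /proc/net/rpc/auth.unix.gid, that writing the current Unix time to its 'flush' file empties the cache (check this against your kernel before relying on it), and that you run it as root. The function name and its parameters are hypothetical.

```python
import time
from pathlib import Path


def uids_seen_after_flush(wait_seconds=30, cache_dir="/proc/net/rpc/auth.unix.gid"):
    """Flush the cache, wait, then return the UIDs that reappeared."""
    cache = Path(cache_dir)
    # Assumption: writing the current Unix time to 'flush' empties the cache.
    (cache / "flush").write_text(f"{int(time.time())}\n")
    time.sleep(wait_seconds)
    uids = set()
    for line in (cache / "content").read_text().splitlines():
        fields = line.split()
        # Entry lines start with a numeric UID; header lines do not.
        if fields and fields[0].isdigit():
            uids.add(int(fields[0]))
    return sorted(uids)
```

Calling uids_seen_after_flush(30) gives the 30-second view; a longer wait (still under 30 minutes) gives the "everyone who's active" view. Note that this only tells you which UIDs made requests, not how many requests each one made.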
(On sufficiently modern kernels you could probably extract this information and much more through eBPF, most likely using bpftrace. Unfortunately for us, Ubuntu 18.04 and bpftrace are not currently a good combination, at least not with only stock Ubuntu repos.)
PS: Contrary to what I assumed and wrote yesterday, there doesn't seem to be any particular
size limit for the NFS server's group membership cache. Perhaps there's
some sort of memory pressure lurking somewhere, but I certainly
can't see any limit on the number of entries. This means that your
auth.unix.gid really should hold absolutely everyone
who's done NFS requests recently, especially after you flush the
cache to reset all of the entry expiry times.