mountd caches netgroup lookups (relatively briefly)
Last time I covered how the Illumos NFS server caches filesystem
access permissions. However, this is not the only level of caching
that may be going on in the overall NFS server ecosystem, because
the Illumos NFS kernel code ultimately calls up to mountd to find
out about permissions, and mountd has its own caching.
mountd caches netgroup membership checks for 60
seconds. Well, sort of. What it really caches is the result of
whether a host is in a specific list of netgroups, not whether or
not a host is in any particular netgroup. This may sound like a
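silly distinction, but consider an NFS export (in ZFS format) whose
rw= list names two netgroups and whose root= list names a single
netgroup, group1, that also appears in the rw= list.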
This export will always generate two cache entries, one for the
rw= set of two groups and one for the root= single group. This is
true even if a host is in group1 (and so gets a positive result in
both entries). On the one hand, this probably doesn't matter too
much, as the cache has no size limits. On the other hand, the cache
is also a simple linked list, so let's hope it never grows too big.
(As you might guess from this, the cache is pretty brute force. That's probably okay.)
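To make the 'list of netgroups' distinction concrete, here is a minimal sketch of this style of cache in Python. It is not the actual mountd code (which is C); the class names, the lookup callback, the group2 netgroup name, and the choice to refresh a stale entry in place are all my assumptions for illustration. The only things taken from the description above are the 60-second lifetime, the unbounded linked list, and the fact that the key is the host plus the whole netgroup list.

```python
import time

CACHE_TTL = 60.0  # seconds; the lifetime described in the text


class CacheEntry:
    """One node in the singly linked list of cached answers."""

    def __init__(self, host, groups, result, expires, next_entry):
        self.host = host
        self.groups = groups      # the whole netgroup list, as a tuple
        self.result = result      # True or False; both are cached
        self.expires = expires
        self.next = next_entry


class NetgroupListCache:
    """Hypothetical cache keyed on (host, list of netgroups)."""

    def __init__(self, lookup, clock=time.monotonic):
        self.head = None
        self.lookup = lookup      # stand-in for the expensive membership check
        self.clock = clock

    def check(self, host, groups):
        key = tuple(groups)
        now = self.clock()
        node = self.head
        while node is not None:   # linear scan; the list has no size limit
            if node.host == host and node.groups == key:
                if now >= node.expires:
                    # Assumption: a stale entry is refreshed in place.
                    node.result = self.lookup(host, key)
                    node.expires = now + CACHE_TTL
                return node.result
            node = node.next
        result = self.lookup(host, key)
        self.head = CacheEntry(host, key, result, now + CACHE_TTL, self.head)
        return result

    def __len__(self):
        count, node = 0, self.head
        while node is not None:
            count, node = count + 1, node.next
        return count


# Made-up netgroup data: hostA is in group1, group2 is empty.
members = {"group1": {"hostA"}, "group2": set()}
lookups = []


def in_any_group(host, groups):
    lookups.append(groups)
    return any(host in members.get(g, set()) for g in groups)


cache = NetgroupListCache(in_any_group)
rw_ok = cache.check("hostA", ["group1", "group2"])   # the rw= list
root_ok = cache.check("hostA", ["group1"])           # the root= list
```

After those two checks the cache holds two separate entries for hostA and the underlying lookup has run twice, even though membership in group1 would have answered both questions.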
mountd, and thus this netgroup cache, gets involved in two
different situations. First there is the actual NFS mount request
from the client, which goes straight to mountd; mountd checks the
exports and returns the appropriate information to the client.
Then, when the client tries to actually do an NFS operation with
its shiny new mount, the kernel may (or perhaps will) upcall back
to mountd for another permission check.
This matters to us because of our custom NFS mount authorization scheme, which does its magic by hooking into netgroup lookups. Both negative and positive caching in mountd are a potential problem for us, although negative caching is usually worse since it means that a host with a verification glitch now has to wait roughly a minute before it can usefully retry a mount request. At the same time, some caching is definitely useful; as the comment in the source code says, mount requests often come in close bursts from the same machine (as it mounts a whole bunch of filesystems with the same export permissions), and only doing expensive things once for that burst is a clear win.
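Both sides of this tradeoff can be shown with a small simulation. This is only a sketch: it uses a plain dictionary rather than mountd's linked list, passes time in explicitly so nothing has to sleep, and every name in it (make_cached_check, expensive_check, the hosts and the group) is made up for illustration.

```python
CACHE_TTL = 60.0  # seconds; the lifetime described in the text


def make_cached_check(expensive_check):
    """Wrap a membership check with a 60-second positive and negative cache."""
    cache = {}

    def check(host, groups, now):
        key = (host, tuple(groups))
        hit = cache.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]                      # still fresh: reuse the answer
        result = expensive_check(host, groups)
        cache[key] = (result, now + CACHE_TTL)
        return result

    return check


verified = {"hostA"}   # hosts that currently pass verification
calls = []             # records every expensive lookup


def expensive_check(host, groups):
    calls.append(host)
    return host in verified


check = make_cached_check(expensive_check)

# A burst of five mount requests from one host with the same export
# permissions: only the first one pays for the expensive lookup.
burst = [check("hostA", ["group1"], now=i * 0.1) for i in range(5)]

# A host with a verification glitch: the failure is cached too.
first_try = check("hostB", ["group1"], now=0.0)     # fails, cached as negative
verified.add("hostB")                               # glitch clears at about t=5
early_retry = check("hostB", ["group1"], now=10.0)  # still the cached failure
late_retry = check("hostB", ["group1"], now=61.0)   # entry expired, rechecked
```

The burst from hostA costs one lookup in total, which is the win. hostB, on the other hand, keeps getting the cached refusal at t=10 even though it would now pass, and only succeeds once the entry has expired.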
(Interested parties who want to see this particular sausage being made can look in the relevant Illumos source code. It looks like this code hasn't changed for a very long time.)