An implementation difference in NSS netgroups between Linux and Solaris
NSS is the Name Service Switch, or as we normally know it,
/etc/nsswitch.conf. The purpose of NSS is to provide a flexible way
for sysadmins to control how various things are looked up, instead
of hard-coding it. For flexibility and simplicity, the traditional
libc approach is to use loadable shared objects to implement the
various lookup methods that nsswitch.conf supports. The core C
library itself has no particular knowledge of the files or dns
nsswitch.conf lookup types; instead each one is implemented in a
shared library such as libnss_files.
(This is a traditional source of inconvenience when building software, because it usually makes it impossible to create a truly static binary that uses NSS-based functions. Those functions intrinsically want to parse nsswitch.conf and then load appropriate shared objects at runtime. Unfortunately this covers a number of important functions, such as looking up the IP addresses for hostnames.)
The general idea of NSS and the broad syntax of nsswitch.conf are portable between any number of Unixes, fundamentally because it's a good idea. The shared object implementation technique is reasonably common; it's used in at least Solaris and Linux, although I'm not sure about elsewhere. However, the actual API between the C library and the NSS lookup modules is not necessarily the same, not just in things like the names of functions and the parameters they get passed, but even in how operations are structured. As it happens, we've run into an interesting example where the two diverge in a fundamental way.
Because it comes from Sun, one of the traditional things that NSS
supports looking up is netgroup
membership, via getnetgrent() and friends.
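
To make "getnetgrent() and friends" concrete, here is a minimal sketch of the two ways a program can ask about netgroups through the C library: a direct membership test with innetgr(), and an enumeration with setnetgrent(), getnetgrent() and endnetgrent(). It assumes a glibc-style <netdb.h>; the helper names host_in_netgroup() and count_netgroup_entries() are made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* glibc: expose the netgroup prototypes in <netdb.h> */
#endif
#include <netdb.h>
#include <stdio.h>

/* Hypothetical helper: direct membership test.
 * NULL for user and domain acts as a wildcard for those fields. */
static int host_in_netgroup(const char *netgroup, const char *host)
{
    return innetgr(netgroup, host, NULL, NULL);
}

/* Hypothetical helper: walk every (host,user,domain) triple in a
 * netgroup, print it, and return how many triples were seen. */
static int count_netgroup_entries(const char *netgroup)
{
    char *host, *user, *domain;
    int n = 0;

    if (!setnetgrent(netgroup))
        return 0; /* unknown netgroup or lookup failure */

    while (getnetgrent(&host, &user, &domain)) {
        /* Any field may be NULL, which means "matches anything". */
        printf("(%s,%s,%s)\n",
               host ? host : "", user ? user : "", domain ? domain : "");
        n++;
    }
    endnetgrent();
    return n;
}
```

From the caller's side these look like two independent operations. Whether they are independent underneath is up to the C library and its NSS modules.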
In the Solaris implementation of NSS's API for NSS lookup types,
all of these netgroup calls are basically passed directly through
to your library. When a program calls innetgr(), there is a whole
chain of NSS API things that will wind up calling your specific
handler function for this if you've set one. This handler function
can do unusual things if you want, which we use for our custom
NFS mount authorization.
We've looked at creating a similar NSS netgroup module for Linux
(more than once), but in the end
we determined it's fundamentally impossible because Linux implements
NSS netgroup lookups differently. Specifically, Linux NSS does not
make a direct call to your NSS module to do an innetgr()
lookup.
On Linux, NSS netgroup modules only implement the functions used
for getting the entire membership of a netgroup, and glibc implements
innetgr()
internally by looping through all the entries of a given
netgroup and checking each one. This reduces the API that NSS
netgroup modules have to implement but unfortunately makes our hack
impossible, because it relies on knowing which specific host you're
checking for netgroup membership.
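
The shape of the Linux approach can be sketched in a few lines of C. This is an illustration only, not glibc's actual code or its real module interface: struct enum_backend, demo_lookup() and innetgr_by_enumeration() are hypothetical names, the data is hard-coded, and nested netgroups are left out. The point is that the backend is only ever asked for the members of a netgroup and is never told which host the caller cares about.

```c
#include <stddef.h>
#include <string.h>

/* One (host,user,domain) triple; a NULL field is a wildcard. */
struct triple {
    const char *host, *user, *domain;
};

/* Hypothetical enumeration-only backend: all it can do is return the
 * members of a netgroup. It never sees the host being checked. */
struct enum_backend {
    const struct triple *(*members)(const char *netgroup, size_t *count);
};

/* Made-up demo data standing in for a real lookup source. */
static const struct triple demo_members[] = {
    { "apps1", NULL, NULL },
    { "apps2", NULL, NULL },
};

static const struct triple *demo_lookup(const char *netgroup, size_t *count)
{
    if (strcmp(netgroup, "nfs-clients") != 0) {
        *count = 0;
        return NULL;
    }
    *count = sizeof demo_members / sizeof demo_members[0];
    return demo_members;
}

/* A field matches if either side is a wildcard or the strings are equal. */
static int field_matches(const char *want, const char *have)
{
    return want == NULL || have == NULL || strcmp(want, have) == 0;
}

/* Membership test built purely on enumeration: fetch every member,
 * then compare each one in the caller (the "C library" side). */
static int innetgr_by_enumeration(const struct enum_backend *be,
                                  const char *netgroup, const char *host,
                                  const char *user, const char *domain)
{
    size_t n = 0, i;
    const struct triple *t = be->members(netgroup, &n);

    for (i = 0; i < n; i++) {
        if (field_matches(host, t[i].host) &&
            field_matches(user, t[i].user) &&
            field_matches(domain, t[i].domain))
            return 1;
    }
    return 0;
}
```

In this structure the only arguments that reach demo_lookup() are the netgroup name and a count pointer, so a backend that wants to answer differently depending on the host has nothing to work with.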
At one level this is just an implementation choice (and a defensible
one in both directions). At another level, this says something about
how Solaris and Linux see netgroups and how they expect them to be
used. Solaris's implementation permits efficient network-based
innetgr()
checks, where you only have to transmit the host and
netgroup names to your <whatever> server and it may have pre-built
indexes for these lookups. The Linux version requires you to implement
a smaller API, but it relies on getting a list of all hosts in a
netgroup being a cheap operation. That's probably true today in
most environments, but it wasn't in the world where netgroups were
first created, which is why Solaris does things the way it does.
(Like NSS, netgroups come from Solaris. Well, they come from Sun; netgroups predate Solaris, as they're part of YP/NIS.)