Wandering Thoughts archives

2010-03-17

How Solaris 10's mountd works

Due to security and complexity issues, Unix systems vary somewhat in exactly how they handle the server side of doing NFS mounts. I've recently been digging in this area, and this is what I've learned about how it works in Solaris 10.

The server side of mounting things from a Solaris 10 fileserver goes more or less like this:

  • a client does the usual SUNRPC dance with portmapper and then sends an RPC mount request to mountd
  • if the filesystem is not exported at all, or if the options in the mount request are unacceptable, mountd denies the request.

  • mountd checks to see if the client has appropriate permissions. This will probably include resolving the client's IP address to a hostname and may include netgroup lookups. This process looks only at ro= and rw= permissions, and thus will only do 'is host in netgroup' lookups for netgroups mentioned there.

  • if the client passes, mountd looks up the NFS filehandle of the root of what the client asked for and sends off an RPC reply, saying 'your mount request is approved and here is the NFS filehandle of the root of it'.

You'll notice that mountd has not told the kernel about the client having access rights for the filesystem.

  • at some time after the client kernel accepts the mount, it will perform its first NFS request to the fileserver. (Often this happens immediately.)

  • if the fileserver kernel does not have information about whether IP <X> is allowed to access filesystem <Y> in its authorization cache, it upcalls to mountd to check.

  • mountd goes through permissions checking again, with slightly different code; this time it also looks at any root= option and thus will do netgroup lookups for those netgroups too.
  • mountd replies to the kernel's upcall (we hope) with the permissions the client IP should have, which may be 'none'. The Solaris kernel puts this information in its authorization cache.
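The two-phase flow above can be caricatured in a sketch. Everything here (the class, the field names, the exports format) is my invention for illustration; the real mountd is C and the upcall travels over a door, so this is just the shape of the logic:

```python
# Invented sketch of the two-phase check described above: the MOUNT RPC
# looks only at ro=/rw= style permissions, while the later kernel upcall
# re-checks and also considers root=.
class FileserverSim:
    def __init__(self, exports):
        # exports: {filesystem: {"rw": allowed hosts, "root": root= hosts}}
        self.exports = exports
        self.auth_cache = {}   # (client, fs) -> (allowed, root_ok); no timeout

    def rpc_mount(self, client, fs):
        """Phase 1: mountd answers the MOUNT RPC."""
        opts = self.exports.get(fs)
        if opts is None or client not in opts["rw"]:
            return None                   # mount denied
        return "filehandle:" + fs         # 'here is the root filehandle'

    def kernel_nfs_request(self, client, fs):
        """Phase 2: first NFS request; the kernel upcalls on a cache miss."""
        key = (client, fs)
        if key not in self.auth_cache:
            # the upcall re-checks permissions, this time including root=
            opts = self.exports.get(fs)
            allowed = opts is not None and client in opts["rw"]
            root_ok = opts is not None and client in opts["root"]
            # the answer may be 'none' (False, False), and it sticks
            self.auth_cache[key] = (allowed, root_ok)
        return self.auth_cache[key]
```

Note that nothing in the sketch ever expires entries in auth_cache, which mirrors the kernel authorization cache behaviour that matters later in this entry.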

The mount daemon has a limit on how many simultaneous RPC mount requests it can be processing; this is 16 by default. There are limits of some sort on kernel upcalls as well, I believe including a timeout on how long the kernel will wait for any given upcall to finish before giving up, but I don't know what they are or how to find them in the OpenSolaris code.

Because this process involves doing the permissions checks twice (and checks multiple NFS export options), it may involve a bunch of duplicate netgroup lookups. Since netgroup lookups may be expensive, mountd caches the result of all 'is host <X> in netgroup <Z>' checks for 60 seconds, including negative results. This mountd cache is especially relevant for us given our custom NFS mount authorization.
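A minimal sketch of such a cache, assuming a simple TTL scheme (the function names are mine; the real code is C, in mountd's netgroup.c):

```python
# Sketch of a mountd-style netgroup membership cache: all answers,
# including negative ones, are remembered for 60 seconds.
import time

CACHE_TTL = 60.0
_cache = {}   # (host, netgroup) -> (result, expiry time)

def host_in_netgroup(host, netgroup, lookup, now=time.monotonic):
    """lookup(host, netgroup) does the real (possibly expensive) check."""
    key = (host, netgroup)
    hit = _cache.get(key)
    if hit is not None and hit[1] > now():
        return hit[0]                 # cache hit, even for a negative answer
    result = bool(lookup(host, netgroup))
    _cache[key] = (result, now() + CACHE_TTL)
    return result
```

Caching negatives is what makes a transient lookup failure dangerous: a False stored here looks exactly like a genuine 'no' for the next 60 seconds.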

(The combination of the kernel authorization cache with no timeout and this mountd netgroup lookup cache means that if you use netgroups for NFS access control, a single lookup failure (for whatever reason) may have wide-ranging effects if it happens at the wrong time. A glitch or two during a revalidation storm could give you a whole lot of basically permanent negative entries, as we've seen but not previously fully understood.)

Where to find OpenSolaris code for all this

I'm going to quote paths relative to usr/src, which is the (relative) directory where OpenSolaris puts all of its code in its repository.

The mountd source is in cmd/fs.d/nfs/mountd. Inside mountd:

  • the RPC mount handling code is in mountd.c:mount(). It checks NFS mount permissions as a side effect of calling the helpfully named getclientsflavors_new() or getclientsflavors_old() functions.
  • the kernel upcalls are handled by nfsauth.c:nfsauth_access(), which calls mountd.c:check_client() to do the actual permission checking.
  • the netgroup cache handling is done in netgroup.c:cache_check(), which is called from netgroup_check().

The kernel side of the upcall handling is in uts/common/fs/nfs, as mentioned earlier. The actual upcalling and cache management happens in nfs_auth.c:nfsauth_cache_get(), using Solaris doors as the IPC mechanism between mountd and the kernel.

solaris/SolarisMountdInnards written at 23:56:10

Another building block of my environment: rxterm

Like many sysadmins using Unix workstations, I spend a lot of time running xterms. Given that most of the time the remote X program I start with my rxexec script is an xterm, it's no surprise that I wrote another script to automate all of the magic involved, called rxterm.

Rxterm's basic job is to start an xterm on a remote system with all of the right options set for it; for instance, so that the xterm's title and icon title show the name of the system that the xterm is logged in to. Like rxexec, rxterm has a number of options that are now vestigial and unused (but still complicate the code).

(Some people set the terminal window title in their prompt. I don't like that approach for various reasons.)

If this were all that rxterm did, it would be a very short script. However, it has an additional option that complicates its life a lot: 'rxterm -r <host>' starts an xterm that is su'ing to root with my entire environment set up in advance (because you cannot combine xterm's -ls and -e arguments). Such xterms also get a special title and are red instead of my usual xterm colours.

Setting up my environment is fairly complex, because the things I need to do in the process of su'ing to root vary quite a lot from system to system. On some of them I can just go straight to su, but on others I need to run a cascade of scripts to get everything right. Rxterm has all of the knowledge of which system needs what approach, so I don't have to care. (Every now and then I need to tell it another exception.)
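Rxterm itself is a shell script, but a hypothetical rendering of its core decision might look like this (the host name and su-cascade path here are invented, not my real configuration):

```python
# Hypothetical sketch of rxterm's core job: build the xterm argument
# list for a given host. The host name and cascade path are made up.
def xterm_command(host, as_root=False):
    # per-host exceptions for how to get to root on that system
    su_cascades = {
        "oldbox": ["/usr/local/sbin/fix-env", "su", "-"],
    }
    cmd = ["xterm", "-title", host, "-n", host]
    if as_root:
        # xterm's -ls and -e can't be combined, so the su cascade has to
        # set up the environment itself; root xterms are red.
        cmd += ["-bg", "red3", "-fg", "white", "-e"]
        cmd += su_cascades.get(host, ["su", "-"])
    else:
        cmd += ["-ls"]                # ordinary login-shell xterm
    return cmd
```

The point of centralizing the table is exactly what the paragraph above says: the script knows which system needs which approach, so I don't have to.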

(In hindsight rxterm's approach to this problem is the wrong one, but that's something for another entry.)

Every so often I consider giving rxterm an option so that it will start a remote gnome-terminal instead of xterm. So far I keep not doing this because gnome-terminal's command line options are so different and the code isn't designed to cope with that, but by this point rxterm has so many historical remnants that I should probably rewrite it from scratch anyways.

(My short shameful confession here is that I had forgotten most of rxterm's arguments until I actually looked at the shell script in the process of writing this entry. Many probably don't work any more, and one actually has the comment 'Doesn't work any more? I lack the time to debug'.)

sysadmin/ToolsRxterm written at 02:51:32

