Wandering Thoughts archives

2019-01-19

A surprise potential gotcha with sharenfs in ZFS on Linux

In Solaris and Illumos, the standard and well supported way to set and update NFS sharing options for ZFS filesystems is through the sharenfs ZFS filesystem property. ZFS on Linux sort of supports sharenfs, but it attempts to be compatible with Solaris and in practice that doesn't work well, partly because there are Solaris options that cannot be easily translated to Linux. When we faced this issue for our Linux ZFS fileservers, we decided that we would build an entirely separate system to handle NFS exports that directly invokes exportfs, which has worked well. This turns out to have been lucky, because there is an additional and somewhat subtle problem with how sharenfs is currently implemented in ZFS on Linux.

On both Illumos and Linux, ZFS actually implements sharenfs by calling the existing normal command to manipulate NFS exports; on Illumos this uses share_nfs and on Linux, exportfs. By itself this is not a problem and actually makes a lot of sense (especially since there's no official public API for this on either Linux or Illumos). On Linux, the specific functions involved are found in lib/libshare/nfs.c. When you initially share a NFS filesystem, ZFS will wind up running the following command for each client:

exportfs -i -o <options> <client>:<path>

When you entirely unshare a NFS filesystem, ZFS will wind up running:

exportfs -u <client>:<path>
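As a rough illustration of the two command shapes above, here is a minimal Python sketch that builds the per-client argument lists. It is not the C code from lib/libshare/nfs.c; the function names, the /tank/home path and the host names are all made up for the example, and nothing is actually executed.

```python
from typing import List


def share_commands(path: str, clients: List[str], options: str) -> List[List[str]]:
    """Build one 'exportfs -i -o <options> <client>:<path>' argv per client."""
    return [["exportfs", "-i", "-o", options, f"{client}:{path}"] for client in clients]


def unshare_commands(path: str, clients: List[str]) -> List[List[str]]:
    """Build one 'exportfs -u <client>:<path>' argv per client."""
    return [["exportfs", "-u", f"{client}:{path}"] for client in clients]


if __name__ == "__main__":
    # Hypothetical filesystem and clients, purely for illustration.
    for argv in share_commands("/tank/home", ["host1", "host2"], "rw,no_root_squash"):
        print(" ".join(argv))
    for argv in unshare_commands("/tank/home", ["host1", "host2"]):
        print(" ".join(argv))
```

The point to notice is that both operations are per client, so a filesystem shared to many clients means many separate exportfs invocations.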

The potential problem comes in when you change an existing sharenfs setting, either to modify what clients the filesystem is exported to or to alter what options you're exporting it with. ZFS on Linux implements this by entirely unexporting the filesystem to all clients, then re-exporting it with whatever options and to whatever clients your new sharenfs settings call for.

(The code for this is in nfs_update_shareopts() in lib/libshare/nfs.c.)
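To make the unexport-then-re-export behaviour concrete, here is a toy Python model of it. This is only a sketch of the sequence described above, not a translation of nfs_update_shareopts(); the export table is a plain dictionary and the names (brute_force_update, steps_without_access) are invented for the example.

```python
from typing import Dict, Iterator

Exports = Dict[str, str]  # client -> export options, for one filesystem


def brute_force_update(current: Exports, wanted: Exports) -> Iterator[Exports]:
    """Yield the export table after each step of an unshare-all, reshare-all update."""
    table = dict(current)
    for client in list(table):
        del table[client]          # stands in for: exportfs -u <client>:<path>
        yield dict(table)
    for client, options in wanted.items():
        table[client] = options    # stands in for: exportfs -i -o <options> <client>:<path>
        yield dict(table)


def steps_without_access(client: str, current: Exports, wanted: Exports) -> int:
    """Count intermediate states in which `client` has no export at all."""
    return sum(1 for table in brute_force_update(current, wanted) if client not in table)


if __name__ == "__main__":
    current = {"host1": "rw", "host2": "rw"}
    wanted = {"host1": "rw", "host2": "rw", "host3": "ro"}
    # host1's own export does not change, yet it is briefly unexported.
    print(steps_without_access("host1", current, wanted))
```

In this model, adding host3 leaves host1 and host2 with no export for part of the update even though nothing about their access is supposed to change.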

On the one hand this is a sensible if brute force implementation, and computing the difference in sharing (for both clients and options) and how to transform one to the other is not an easy problem. On the other hand, this means that clients that are actually doing NFS traffic during the time when you change sharenfs may be unlucky enough to try a NFS operation in the window of time between when the filesystem was unshared (to them) and when it was reshared (to them). If they hit this window, they'll get various forms of NFS permission denied messages, and with some clients this may produce highly undesirable consequences, such as libvirt guests having their root filesystems go read-only.
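For contrast, here is a deliberately naive Python sketch of what a difference-based update could look like in the easy case. It assumes that re-running exportfs with -o for a client that is already exported replaces that client's options in place, and it compares option strings literally, so 'rw,sync' and 'sync,rw' would count as different. It also ignores overlapping client specifications such as wildcards and netgroups, which is where much of the real difficulty lies. The function name and data layout are hypothetical.

```python
from typing import Dict, List

Exports = Dict[str, str]  # client -> export options, for one filesystem


def incremental_commands(path: str, current: Exports, wanted: Exports) -> List[List[str]]:
    """Return exportfs argv lists that touch only clients whose state changes."""
    commands = []
    # Only clients that should lose access are unexported.
    for client in sorted(set(current) - set(wanted)):
        commands.append(["exportfs", "-u", f"{client}:{path}"])
    # New clients, and clients whose option string differs, are (re)exported.
    for client in sorted(wanted):
        if current.get(client) != wanted[client]:
            commands.append(["exportfs", "-i", "-o", wanted[client], f"{client}:{path}"])
    return commands


if __name__ == "__main__":
    current = {"host1": "rw", "host2": "rw"}
    wanted = {"host1": "rw", "host3": "ro"}
    for argv in incremental_commands("/tank/home", current, wanted):
        print(" ".join(argv))
```

With this approach a client whose export is unchanged (host1 here) never appears in any command, so it never sees a window where the filesystem is unshared to it.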

(The zfs-discuss re-query from Todd Pfaff today is what got several people to go digging and figure out this issue. I was one of them, but only because I rushed into exploring the code before reading the entire email thread.)

I would like to say that our system for ZFS NFS export permissions avoids this issue, but it has exactly the same problem. Rather than try to reconcile the current NFS export settings and the desired new ones, it just does a brute force 'exportfs -u' for all current clients and then reshares things. Fortunately we only very rarely change the NFS exports for a filesystem because we export to netgroups instead of individual clients, so adding and removing individual clients is almost entirely done by changing netgroup membership. The actual exportfs setting only has to change if we add or remove entire netgroups.

(Exportfs has a tempting '-r' option to just resynchronize everything, but our current system doesn't use it and I don't know why. I know that I poked around with exportfs when I was developing it but I don't seem to have written down notes about my exploration, so I don't know if I ran into problems with -r, didn't notice it, or had some other reason I rejected it. If I didn't overlook it, this is definitely a case where I should have documented why I wasn't doing an attractive thing.)
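For reference, 'exportfs -r' resynchronizes the active exports with /etc/exports and the *.exports files under /etc/exports.d, so a system built around it would write out export lines rather than run exportfs per client. A minimal Python sketch of rendering such a line follows; the function name, the path and the netgroup names are hypothetical, and it does no escaping of unusual characters in paths.

```python
from typing import Dict


def exports_line(path: str, clients: Dict[str, str]) -> str:
    """Render one /etc/exports style line: '<path> <client>(<options>) ...'."""
    entries = " ".join(
        f"{client}({options})" for client, options in sorted(clients.items())
    )
    return f"{path} {entries}"


if __name__ == "__main__":
    # Netgroups are written with a leading '@' in exports files.
    print(exports_line("/tank/home", {"@workstations": "rw", "@servers": "rw,no_root_squash"}))
```

One design consequence is that anything exported only through a direct exportfs command, and not listed in those files, would be dropped by a later 'exportfs -r'.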

linux/ZFSOnLinuxSharenfsGotcha written at 00:08:23

