A problem with unmounting FUSE mount points that are on NFS filesystems

September 25, 2018

If you have NFS filesystems with directories that are not world accessible, and people mount FUSE filesystems at spots under those directories, you are probably going to have problems in modern Linux kernels. Specifically, it is very difficult to unmount these FUSE mounts (even if the program providing the FUSE filesystem exits).

Here is what happens:

; ls -ld .; stat -f -c '%T' .
drwx--S--- 108 cks itdirgrp 242 Sep 25 22:55 ./
; sshfs cks@nosuchhost: demo
read: Connection reset by peer
fusermount: failed to chdir to /h/281/cks: Permission denied
; ls demo
ls: cannot access 'demo': Transport endpoint is not connected

[... become root ...]
# strace -e trace=umount2 /root/umount2 /h/281/cks/demo
umount2("/h/281/cks/demo", MNT_FORCE|MNT_DETACH) = -1 EACCES (Permission denied)

(umount2 here is a little program that does a umount2() on its argument; I'm using it to completely eliminate anything else that various programs are trying to do and failing at. fusermount and /bin/umount also fail, but in more elaborate ways.)
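For illustration, the core of such a program might look something like the following. This is a minimal sketch and not the actual program; the function name force_detach is made up for this example, and the flags are simply the ones visible in the strace output above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper: ask the kernel to forcibly and lazily unmount
 * whatever is mounted at path, and do nothing else.
 * Returns 0 on success or the errno value from umount2() on failure. */
int force_detach(const char *path)
{
    if (umount2(path, MNT_FORCE | MNT_DETACH) == -1) {
        int err = errno;
        fprintf(stderr, "umount2(%s): %s\n", path, strerror(err));
        return err;
    }
    return 0;
}
```

A main() that passes argv[1] to this function is all that is needed to turn it into a command. Because it makes exactly one system call, whatever error comes back is the kernel's own answer and not something a more elaborate tool did along the way.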

What appears to be happening here is that modern Linux kernels have decided that they will do a full lookup through the path you give them to unmount (I assume that they have good reasons for this). Since umount2() must be done as root, this path walking is done with root's permissions. On NFS mounts, UID 0 generally has no special privileges, so if the path to unmount goes through a restricted NFS directory, the kernel's traversal of the path will fail and the kernel will reject the umount2().

In an ideal world, the initial FUSE mount would have failed for the same reason, which would at least limit the damage. In this world, as we can see, the initial FUSE mount succeeds for some reason and you wind up with a stuck FUSE mount. This stuck FUSE mount will then block unmounting the NFS filesystem, because you can't unmount a filesystem that has a mount inside it.

There is a way around this, but it requires a very special trick and I'm not certain it's going to work forever. The Linux kernel has an extra, non-portable notion of a 'filesystem user ID', which is the UID that the kernel uses for all accesses to the filesystem. With appropriate privileges, you can set this with setfsuid(2). The kernel uses the filesystem UID during the umount2() path walk, so if you have a program that setfsuid()s to the necessary target user and then calls umount2(), it will work. The program still has to be run by root, since you need to be root to unmount things (or have the CAP_SYS_ADMIN capability, which is often pretty close to root).
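The trick can be sketched roughly as follows. This is an illustration under assumptions, not the real program: the function name umount_as_user is invented here, the flags are borrowed from the earlier strace output, and error handling is kept to a minimum. One wrinkle is that setfsuid() returns the previous filesystem UID whether or not it succeeded, so the usual way to check is to call it again with -1, which never changes anything and reports the current value.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/fsuid.h>
#include <sys/mount.h>

/* Hypothetical helper: switch the filesystem UID to uid, try the
 * unmount so that the kernel's path walk runs with that user's
 * permissions, then switch back.
 * Returns 0 on success or an errno value on failure. */
int umount_as_user(uid_t uid, const char *path)
{
    int err = 0;

    /* setfsuid() always returns the previous fsuid, even on failure. */
    int old = setfsuid(uid);

    /* Passing -1 changes nothing and returns the current fsuid, which
     * tells us whether the switch actually took effect. */
    if ((uid_t)setfsuid((uid_t)-1) != uid)
        return EPERM;

    if (umount2(path, MNT_FORCE | MNT_DETACH) == -1)
        err = errno;

    /* Restore the original filesystem UID. */
    setfsuid((uid_t)old);
    return err;
}
```

The UID to pass is that of a user who can traverse the restricted directories on the way to the mount point, which in the example above would be the directory's owner.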

We now have a little program that does just this. However, we've decided that our real solution to this problem is to remove the Ubuntu 'fuse' package so people can't mount FUSE filesystems in the first place, because sshfs and its friends are not widely used here and we don't want to deal with the hassles.

(I was hoping to be able to just blacklist the FUSE kernel module, but Ubuntu builds FUSE directly into their kernels.)

PS: I really wish that FUSE filesystems were automatically unmounted when their transport endpoints died. Naturally this should be in the kernel and not involve path walking shenanigans, since as we've seen those can fail.

(This elaborates on some tweets of mine.)

Comments on this page:

From at 2018-09-26 00:46:58:

You can disable FUSE by adjusting the permissions of /dev/fuse via udev rules (indeed Debian used to limit it to the fuse group only).

I've heard that at least one of fuse.blacklist=1, module_blacklist=fuse, or initcall_blacklist=fuse_init is supposed to work even for built-in "modules" (all of them being interpreted by the kernel, not by userspace like modprobe.blacklist= is).

I often wish there was a way to set things up so that unmounting a filesystem also automatically unmounted any other filesystems mounted in subdirectories of it. (Probably by setting a flag on the sub-mount, though I'm open to persuasion that some other API is better.)

My main reason is that I like to set up filesystem mounts under my home directory. If my home dir is a non-permanent mount (e.g. on NFS, or encrypted) then that has to be done by my login script, which means my logout script has to unmount them all again, and if it fails for any reason or somehow doesn't run at all, then my home dir gets into a stuck state.

Whereas if I could set flags on all those submounts that said 'oh, and just automatically go away if you're still here when my home dir needs unmounting', that would be much more robust. And surely it would also solve your problem here.

lol. `fusermount` is absolutely doing setfsuid() dances. With code that seems horribly fragile to me. Maybe the dance goes wrong on the error path only.

You don't show `fusermount -u` as the user (i.e., does that not work either?).

"modern Linux kernels have decided that they will do a full lookup through the path you give them to unmount (I assume that they have good reasons for this). Since umount2() must be done as root, this path walking is done with root's permissions. On NFS mounts, UID 0 generally has no special privileges"

Well, hum. It has to identify the mount somehow. Naive string comparison doesn't work (because of overmounts, or the parent directory of the mountpoint can be renamed). So it's the simplest way.

It would be nice if `umount2()` could bypass netfs/fuse/etc revalidation and specific permissions though! I think the dentries on the path to mount points are all pinned in memory. Seems like it should be possible, it's just that it would be an extra special case to maintain.

By cks at 2018-09-26 10:23:53:

I think sshfs is doing 'fusermount -u' when its setup fails, hence the message, but yes, an explicit one fails with exactly the same error message. As far as fusermount's use of setfsuid() goes, it looks like it only does it for some operations, which don't include a crucial code path where it tries to chdir to the parent of the mount during the unmount process.

Written on 25 September 2018.

Last modified: Tue Sep 25 23:32:46 2018