Wandering Thoughts archives

2018-09-25

A problem with unmounting FUSE mount points that are on NFS filesystems

If you have NFS filesystems with directories that are not world accessible, and people mount FUSE filesystems at spots under those directories, you are probably going to have problems on modern Linux kernels. Specifically, it is very difficult to unmount these FUSE mounts, even if the program providing the FUSE filesystem exits.

Here is what happens:

; ls -ld .; stat -f -c '%T' .
drwx--S--- 108 cks itdirgrp 242 Sep 25 22:55 ./
nfs
; sshfs cks@nosuchhost: demo
read: Connection reset by peer
fusermount: failed to chdir to /h/281/cks: Permission denied
; ls demo
ls: cannot access 'demo': Transport endpoint is not connected

[... become root ...]
# strace -e trace=umount2 /root/umount2 /h/281/cks/demo
umount2("/h/281/cks/demo", MNT_FORCE|MNT_DETACH) = -1 EACCES (Permission denied)

(umount2 here is a little program that does a umount2() on its argument; I'm using it to completely eliminate anything else that various programs are trying to do and failing at. fusermount and /bin/umount also fail, but in more elaborate ways.)

What appears to be happening here is that modern Linux kernels have decided that they will do a full lookup through the path you give them to unmount (I assume that they have good reasons for this). Since umount2() must be done as root, this path walking is done with root's permissions. On NFS mounts, UID 0 generally has no special privileges, so if the path to unmount goes through a restricted NFS directory, the kernel's traversal of the path will fail and the kernel will reject the umount2().

In an ideal world, the initial FUSE mount would have failed for the same reason, which would at least limit the damage. In this world, as we can see, the initial FUSE mount succeeds for some reason and you wind up with a stuck FUSE mount. This stuck FUSE mount will then block unmounting the NFS filesystem, because you can't unmount a filesystem that has a mount inside it.

There is a way around this but it requires a very special trick and I'm not certain it's going to work forever. The Linux kernel has an extra, non-portable notion of a 'filesystem user ID', which is the UID that the kernel uses for all accesses to the filesystem. With appropriate privileges, you can set this with setfsuid(2). The kernel uses the filesystem UID during the umount2() path walk, so if you have a program that setfsuid()s to the necessary target user and then calls umount2(), it will work. The program still has to be run by root, since you need to be root to unmount things (or at least have the CAP_SYS_ADMIN capability, which is often pretty close to root).

We now have a little program that does just this. However, we've decided that our real solution to this problem is to remove the Ubuntu 'fuse' package so people can't mount FUSE filesystems in the first place, because sshfs and its friends are not widely used here and we don't want to deal with the hassles.

(I was hoping to be able to just blacklist the FUSE kernel module, but Ubuntu builds FUSE directly into their kernels.)

PS: I really wish that FUSE filesystems were automatically unmounted when their transport endpoints died. Naturally this should be in the kernel and not involve path walking shenanigans, since as we've seen those can fail.

(This elaborates on some tweets of mine.)

FUSEOnNFSUnmounting written at 23:32:46