A problem with unmounting FUSE mount points that are on NFS filesystems
If you have NFS filesystems with directories that are not world accessible, and people mount FUSE filesystems at spots under those directories, you are probably going to have problems in modern Linux kernels. Specifically, it is very difficult to unmount these FUSE mounts (even if the program providing the FUSE filesystem exits).
Here is what happens:
; ls -ld .; stat -f -c '%T' .
drwx--S--- 108 cks itdirgrp 242 Sep 25 22:55 ./
nfs
; sshfs cks@nosuchhost: demo
read: Connection reset by peer
fusermount: failed to chdir to /h/281/cks: Permission denied
; ls demo
ls: cannot access 'demo': Transport endpoint is not connected
[... become root ...]
# strace -e trace=umount2 /root/umount2 /h/281/cks/demo
umount2("/h/281/cks/demo", MNT_FORCE|MNT_DETACH) = -1 EACCES (Permission denied)
(umount2 here is a little program that does a umount2() on its argument; I'm using it to completely eliminate anything else that various programs are trying to do and failing at. fusermount and /bin/umount also fail, but in more elaborate ways.)
What appears to be happening here is that modern Linux kernels have decided that they will do a full lookup through the path you give them to unmount (I assume that they have good reasons for this). Since umount2() must be done as root, this path walking is done with root's permissions. On NFS mounts, UID 0 generally has no special privileges, so if the path to unmount goes through a restricted NFS directory, the kernel's traversal of the path will fail and the kernel will reject the umount2().
In an ideal world, the initial FUSE mount would have failed for the same reason, which would at least limit the damage. In this world, as we can see, the initial FUSE mount succeeds for some reason and you wind up with a stuck FUSE mount. This stuck FUSE mount will then block unmounting the NFS filesystem, because you can't unmount a filesystem that has a mount inside it.
There is a way around this, but it requires a very special trick and I'm not certain it's going to work forever. The Linux kernel has an extra, non-portable notion of a 'filesystem user ID', which is the UID that the kernel uses for all accesses to the filesystem. With appropriate privileges, you can set this with setfsuid(2). The kernel uses the filesystem UID during the umount2() path walk, so if you have a program that setfsuid()s to the necessary target user and then calls umount2(), it will work (when run by root, since you need to be root to unmount things (or have the CAP_SYS_ADMIN capability, which is often pretty close to root)).
We now have a little program that does just this. However, we've decided that our real solution to this problem is to remove the Ubuntu 'fuse' package so people can't mount FUSE filesystems in the first place, because sshfs and its friends are not widely used here and we don't want to deal with the hassles.
(I was hoping to be able to just blacklist the FUSE kernel module, but Ubuntu builds FUSE directly into their kernels.)
PS: I really wish that FUSE filesystems were automatically unmounted when their transport endpoints died. Naturally this should be in the kernel and not involve path walking shenanigans, since as we've seen those can fail.
(This elaborates on some tweets of mine.)