Modern versions of systemd can cause an unmount storm during shutdowns
One of my discoveries about Ubuntu 20.04 is that my test machine
can trigger the kernel's out-of-memory (OOM) killer during shutdown. My test
virtual machine has 4 GB of RAM and 1 GB of swap, but it also has
347 NFS mounts, and after some investigation, what appears to be
happening is that in the 20.04 version of systemd (systemd 245 plus
whatever changes Ubuntu has made), systemd now seems to try to run
umount for all of those filesystems all at once (which also starts
a umount.nfs process for each one). On 20.04, this is apparently
enough to OOM my test machine.
(My test machine has the same amount of RAM and swap as some of our production machines, although we're not running 20.04 on any of them.)
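To get a rough idea of how many umount processes such a shutdown could start, one can count the NFS entries in /proc/mounts. This is only a minimal sketch: the function name count_nfs_mounts and the choice of "nfs" and "nfs4" as the relevant filesystem types are my assumptions for illustration.

```python
from collections import Counter

# Filesystem types treated as NFS here; this set is an assumption.
NFS_TYPES = {"nfs", "nfs4"}


def count_nfs_mounts(mounts_text):
    """Count NFS entries in text formatted like /proc/mounts."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            counts[fields[2]] += 1
    return counts
```

Feeding it the contents of /proc/mounts and summing the counts gives the number of NFS filesystems, and so (if each unmount gets its own umount and umount.nfs process) roughly twice that many processes started at once.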
On the one hand, this is exactly what systemd said it was going to
do in general. Systemd will do as much in parallel as possible and
these NFS mounts are not nested inside each other, so they can all
be unmounted at once. On the other hand, this doesn't scale; there's
a certain point where running too many processes at once just
thrashes the machine to death even if it doesn't drive it out of
memory. And on the third hand, this doesn't happen to us on earlier
versions of Ubuntu LTS; either their version of systemd doesn't
start as many unmounts at once or their versions of umount and
umount.nfs require enough fewer resources that we can get away
with it.
Unfortunately, so far I haven't found a way to control this in
systemd. There appears to be no way to set limits on how many
unmounts systemd will try to do at once (or in general how many
units it will try to stop at once, even if that requires running
programs). Nor can we readily modify the mount units, because all
of our NFS mounts are done through shell scripts by directly calling
mount; they don't exist in /etc/fstab or as actual .mount units.
(One workaround would be to set up a new systemd unit that acts
before filesystems are unmounted and runs a 'umount -t nfs',
because that doesn't try to do all of the unmounts at once. Getting
the ordering right may be a little bit tricky.)
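As a sketch of what "one unmount at a time" could look like if done by hand instead of through umount's own type filtering, here is a small Python version. It is an illustration under assumptions, not something I have run in production: the helper names nfs_mount_points and unmount_sequentially are hypothetical, "nfs" and "nfs4" are assumed to be the only types of interest, and it does nothing about the systemd ordering problem.

```python
import subprocess

# Filesystem types treated as NFS here; this set is an assumption.
NFS_TYPES = {"nfs", "nfs4"}


def nfs_mount_points(mounts_text):
    """Return NFS mount points from /proc/mounts-style text, newest first.

    Note: /proc/mounts escapes spaces in paths (as \\040); this sketch
    does not decode such escapes.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NFS_TYPES:
            points.append(fields[1])
    # Reverse mount order, so anything mounted later goes away first.
    points.reverse()
    return points


def unmount_sequentially(points, run=subprocess.run):
    """Run one umount at a time; return the mount points that failed."""
    failed = []
    for point in points:
        # subprocess.run waits for the command to finish before returning,
        # so at most one umount (and umount.nfs) process exists at a time.
        result = run(["umount", point])
        if result.returncode != 0:
            failed.append(point)
    return failed
```

The run parameter exists only so the loop can be exercised with a stand-in instead of really unmounting things. In real use this would be called with the contents of /proc/mounts, as root, from whatever unit ends up running before systemd starts its own unmounts.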