vm.admin_reserve_kbytes sysctl is both not big enough and not sufficient
We enable Linux's strict overcommit on some
of our servers (mostly compute servers). Every so often people run
big enough programs that they run the machine out of memory, and
some of the time when this happens we get various plaintive reports
from cron and other things that periodic system processes have
failed with out-of-memory errors. The Linux kernel has a sysctl
that's supposed to help with this, vm.admin_reserve_kbytes
(documented in vm.txt), but
in practice we've found two issues.
The first is that the default value of admin_reserve_kbytes is set for systems that aren't in strict overcommit mode, and in any case the value dates from 2013. The kernel's own documentation suggests raising it to 128 MB for strict overcommit, but I suspect that's no longer sufficient for modern programs (a brief check suggests that the total virtual size of sshd, bash, and top is at least 190 MB or so on 64-bit x86 Ubuntu 18.04, and their combined RSS is over 16 MB). Perhaps 256 MB would be enough in strict overcommit mode. In any case, we need to raise the value, and it's hard to know how far to raise it so that cron jobs keep running without taking too much memory away from people, especially on machines with modest amounts of memory.
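To repeat that sort of brief check, a rough sketch along the following lines can total the virtual size and RSS of some named processes from /proc/<pid>/status. The helper names are made up for illustration, it counts every running instance of each program rather than one of each, and virtual size is only a proxy for what the kernel actually charges against the commit limit.

```python
import os

def parse_status_kb(text):
    """Return (name, VmSize in kB, VmRSS in kB) from the text of a /proc/<pid>/status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()

    def kb(key):
        # Values look like "12345 kB"; kernel threads have no Vm* lines at all.
        return int(fields.get(key, "0 kB").split()[0])

    return fields.get("Name", ""), kb("VmSize"), kb("VmRSS")

def total_for(names, proc="/proc"):
    """Sum VmSize and VmRSS (in kB) over every process whose name is in names."""
    vsz = rss = 0
    try:
        entries = os.listdir(proc)
    except OSError:
        return 0, 0  # no /proc here, so nothing to report
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc, entry, "status"), errors="replace") as f:
                name, size_kb, rss_kb = parse_status_kb(f.read())
        except OSError:
            continue  # the process exited while we were looking
        if name in names:
            vsz += size_kb
            rss += rss_kb
    return vsz, rss

if __name__ == "__main__":
    vsz, rss = total_for({"sshd", "bash", "top"})
    print(f"total VmSize: {vsz / 1024:.1f} MB, total VmRSS: {rss / 1024:.1f} MB")
```

Run on an otherwise idle machine with one login session and top running, this gives a floor for what an administrative login needs, not a recommended setting.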
(If we were serious about this, we should look into collecting some sort of memory usage information from cron jobs on at least a test machine. As it is, this is a sufficiently infrequent issue that we don't care enough to do that work.)
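If we did want that information, one low-effort approach would be a wrapper that runs a cron job and reports its peak memory use. This is only a sketch: the wrapper and its function name are hypothetical, and peak RSS understates what strict overcommit actually counts, which is committed address space.

```python
import resource
import subprocess
import sys

def run_and_measure(argv):
    """Run argv and return (exit status, peak RSS in kB of waited-for children)."""
    proc = subprocess.run(argv)
    # RUSAGE_CHILDREN covers children we have waited for (and the descendants
    # they waited for). On Linux, ru_maxrss is the largest peak RSS among
    # them, reported in kilobytes.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss

if __name__ == "__main__" and len(sys.argv) > 1:
    status, peak_kb = run_and_measure(sys.argv[1:])
    # Report on stderr so cron mails it along with the job's own output.
    print(f"{sys.argv[1]}: peak RSS {peak_kb} kB, exit status {status}",
          file=sys.stderr)
    sys.exit(status)
```

In a crontab entry on a test machine, the real command would simply be prefixed with this wrapper.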
The second is that often, no setting of admin_reserve_kbytes
will let you log in to a server that has exhausted its memory under
strict overcommit, because of what I'd call the DBus daemon problem.
Specifically, during login, parts of the SSH server run as
non-privileged users. Because these are deliberately unprivileged
UIDs, memory allocations made by these processes are not covered by
admin_reserve_kbytes. If ordinary users can't allocate memory,
you're almost certainly not going to be able to ssh in, even as root.
If you could get the SSH daemon to authenticate you, your eventual
bash processes would be covered by
admin_reserve_kbytes, but sadly you need that authentication
to happen before you get there.
(Turning off sshd's privilege separation is a cure far worse than the disease.)
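To make the asymmetry concrete, here is a deliberately simplified model of the strict overcommit check as I understand it. The function and parameter names are mine, not the kernel's, "privileged" stands in for the kernel's capability check, and the model ignores the separate per-process user reserve (vm.user_reserve_kbytes).

```python
def commit_allowed(committed_kb, request_kb, commit_limit_kb,
                   admin_reserve_kb, privileged):
    """Simplified model: may this allocation proceed under strict overcommit?"""
    allowed = commit_limit_kb
    if not privileged:
        # Unprivileged processes may not dip into the admin reserve.
        allowed -= admin_reserve_kb
    return committed_kb + request_kb < allowed

MB = 1024  # kB per MB

# Hypothetical machine: 16 GB commit limit, 128 MB reserve, 100 MB uncommitted.
limit = 16 * 1024 * MB
committed = limit - 100 * MB
reserve = 128 * MB

# A root cron job asking for 10 MB still succeeds...
print(commit_allowed(committed, 10 * MB, limit, reserve, privileged=True))
# ...but sshd's unprivileged pre-authentication process is refused.
print(commit_allowed(committed, 10 * MB, limit, reserve, privileged=False))
```

In this model, raising the reserve only makes the unprivileged case fail sooner; it never helps the unprivileged parts of sshd get memory.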
The second issue lowers my motivation to try to fix the first problem by finding a setting of admin_reserve_kbytes that lets our administrative cron jobs reliably keep working. If a machine runs out of memory and stays there, we may not be able to get in to deal with whatever the problem is, and things other than cron jobs may run into issues (we've seen the DBus daemon have problems in the past). Plus, our machines almost never run out of memory to the extent that we get cron email complaints about it.
PS: Someday our Ubuntu LTS machines may run systemd-oomd, which will undoubtedly need its own configuration and tuning. This might even show up in the future Ubuntu 22.04 LTS, which is not all that far away.