2019-09-27
Some field notes on imposing memory resource limits on users on Ubuntu 18.04
As I mentioned in my entry on how we implement per-user CPU and memory limits, we have a number of shared general use servers where we've decided we need to impose limits on everyone all of the time so no one person can blow up the machine. Over the course of doing this, we've built up some practical experience and discovered a surprise or two.
As discussed, we impose our memory limits by setting systemd's MemoryLimit. In theory perhaps we should use MemoryMax instead, but there are two issues. First, our scripts still need to work on Ubuntu 16.04, where MemoryMax isn't supported. Second, it's not clear if systemd makes this work if you're not using unified cgroups (cgroup v2), and the Ubuntu 18.04 default is to use the original v1 cgroups instead of the new cgroups. Since my impression is that there are still assorted issues with v2 cgroups, we're not inclined to switch away from the Ubuntu default here.
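As a rough illustration of what setting MemoryLimit on a user can look like, here is a minimal sketch that builds a `systemctl set-property` command for a user's slice. It assumes the limit is applied to the `user-<UID>.slice` unit and only lasts until reboot (`--runtime`); the function name, the UID and the 8 GiB figure are made up for the example and are not our actual scripts.

```python
def memory_limit_command(uid, limit_bytes, runtime=True):
    """Build (but do not run) a systemctl command that sets MemoryLimit
    on the systemd slice for the given UID."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    cmd = ["systemctl"]
    if runtime:
        # --runtime makes the setting temporary (gone after a reboot).
        cmd.append("--runtime")
    cmd += ["set-property", f"user-{uid}.slice", f"MemoryLimit={limit_bytes}"]
    return cmd


# Hypothetical example: an 8 GiB limit for UID 1000.
print(" ".join(memory_limit_command(1000, 8 * 1024**3)))
```

Building the argument list separately from running it makes it easy to log or dry-run the command before handing it to something like `subprocess.run()`.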
As documented, systemd's MemoryLimit sets the memory.limit_in_bytes cgroup attribute, which is sort of documented in the kernel's memory.txt. The important thing to know, which is only implicitly discussed in memory.txt, is that this only limits the amount of RAM that you can use, not the amount of swap space. In the Ubuntu 18.04 configuration of cgroup v1, there is simply no way to limit swap space usage, and on top of that systemd doesn't expose the property that you'd need.
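One way to see this on a live system is to look at the cgroup v1 attribute files directly. The sketch below reads the RAM limit and usage from a memory cgroup directory and reports whether a RAM-plus-swap limit file is present at all. The directory layout (something like `/sys/fs/cgroup/memory/user.slice/user-1000.slice`) and the `memory.memsw.limit_in_bytes` name are my understanding of cgroup v1, so treat both as assumptions to check on your own machine.

```python
from pathlib import Path


def read_cgroup_value(cgroup_dir, name):
    """Return the integer stored in cgroup_dir/name, or None if that
    attribute file does not exist."""
    try:
        return int((Path(cgroup_dir) / name).read_text().strip())
    except FileNotFoundError:
        return None


def describe_limits(cgroup_dir):
    """Summarize the v1 memory attributes relevant to MemoryLimit."""
    return {
        # What systemd's MemoryLimit sets: RAM only.
        "ram_limit": read_cgroup_value(cgroup_dir, "memory.limit_in_bytes"),
        "ram_usage": read_cgroup_value(cgroup_dir, "memory.usage_in_bytes"),
        # RAM+swap limit; None here means there is nothing to set.
        "ram_plus_swap_limit": read_cgroup_value(
            cgroup_dir, "memory.memsw.limit_in_bytes"
        ),
    }
```

If `ram_plus_swap_limit` comes back as `None`, the kernel isn't offering a swap limit for that cgroup, which matches the situation described above.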
Our experience is that this doesn't seem to matter for processes that use a lot of memory very rapidly; they run into their user's MemoryLimit almost immediately without causing swap thrashing and get killed by the cgroups OOM killer. However, processes that slowly grow in memory usage over time will wind up pushing things out to swap, themselves included, and as a result their actual memory usage can significantly exceed your MemoryLimit setting if you have enough swap. So far, we haven't experienced swap thrashing as a result of this, but I suspect that it's possible. Obviously, how much swap space you have strongly affects how much total memory a user can use before the cgroups OOM killer triggers. All of this can make your memory limit much more generous than you expect.
(We normally don't configure much swap on our servers, but a few have several gigabytes of it for various reasons. And even with only one GB of swap, that might be close to a GB more of 'memory' usage than you may have expected.)
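To put a number on this, a crude upper bound on what a user can consume is their MemoryLimit plus all of the machine's swap. The sketch below parses `SwapTotal` out of `/proc/meminfo`-style text (reported in kB) and adds it to a limit. The function names are hypothetical, and this is only an upper bound, since other things will usually be occupying some of the swap too.

```python
def swap_total_bytes(meminfo_text):
    """Extract SwapTotal from /proc/meminfo text and return it in bytes.
    Returns 0 if there is no SwapTotal line."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Format is 'SwapTotal:   1048576 kB'; kB here means 1024 bytes.
            return int(line.split()[1]) * 1024
    return 0


def worst_case_usage(memory_limit_bytes, swap_bytes):
    """Upper bound on RAM plus swap a user could occupy under a
    RAM-only limit: the limit itself plus every byte of swap."""
    return memory_limit_bytes + swap_bytes


# On a real machine you would feed in the actual file:
#   with open("/proc/meminfo") as f:
#       swap = swap_total_bytes(f.read())
```

With an 8 GiB MemoryLimit and 1 GiB of swap, `worst_case_usage()` gives 9 GiB, which is the 'close to a GB more' effect.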
PS: I was going to say that fast-growing processes don't seem to swap much, but our Prometheus system stats suggest that that's wrong and we do see significant and rapid swap usage. Since much of our swap is on SSDs these days, I suppose that I shouldn't be too impressed with how fast our systems can write it out; a GB or three over a minute is not all that fast in today's world, and SSDs are very good at random IO.
Sidebar: What I expect us to set with systemd v2 cgroups
If Ubuntu switches to v2 cgroups by default, I currently think we'd set a per-user MemorySwapMax that was at most a GB or half our swap space, whichever was smaller, make our current MemoryLimit be MemoryMax, and set MemoryHigh to a value a GB or so lower than MemoryMax. The thing I'm least certain about is what we'd want to set the swap limit to.
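The arithmetic behind that plan can be sketched as a small function. This is only an illustration of the rules as stated (swap limit of one GB or half of swap, whichever is smaller; MemoryHigh one GB under MemoryMax); the function name is hypothetical, and refusing limits of a GB or less is my own choice, since nothing here says what to do in that case.

```python
GIB = 1024**3


def planned_v2_limits(current_memory_limit, swap_total):
    """Compute hypothetical per-user cgroup v2 settings, in bytes, from
    the current MemoryLimit and the machine's total swap."""
    if current_memory_limit <= GIB:
        # MemoryHigh would be zero or negative; this case isn't covered.
        raise ValueError("limit too small to leave a 1 GiB MemoryHigh gap")
    return {
        # The current MemoryLimit value carries over unchanged.
        "MemoryMax": current_memory_limit,
        # Reclaim pressure starts about a GB before the hard limit.
        "MemoryHigh": current_memory_limit - GIB,
        # At most 1 GiB or half of swap, whichever is smaller.
        "MemorySwapMax": min(GIB, swap_total // 2),
    }
```

For an 8 GiB limit on a machine with 4 GiB of swap this yields a MemoryMax of 8 GiB, a MemoryHigh of 7 GiB and a MemorySwapMax of 1 GiB; with only 1 GiB of swap, the swap limit drops to 512 MiB.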
A file permissions and general deployment annoyance with Certbot
The more we use Certbot, the more I become convinced that it isn't written by people who actually operate it in anything like the kind of environment that we do (and perhaps not at all, although I hope that the EFF uses it for their own web serving). I say this because while Certbot works, there are all sorts of little awkward bits around the edges in practical operation. Today's particular issue is a two-part issue concerning file permissions on TLS certificates and keys (and this can turn into a general deployment issue).
Certbot stores all of your TLS certificate information under /etc/letsencrypt/live, which is normally owned by root and is root-only (Unix mode 0700). Well, actually, that's false, because normally the contents of that directory hierarchy are only symlinks to /etc/letsencrypt/archive, which is also owned by root and root-only. This works fine for daemons that read TLS certificate material as root, but not all daemons do; in particular, Exim reads them as the Exim user and group.
The first issue is that Certbot adds an extra level of permissions to TLS private keys. As covered by Certbot's documentation, from Certbot version 0.29.0, private keys for certificates are specifically root-only. This means that you can't give Exim access to the TLS keys it needs just by chgrp'ing /etc/letsencrypt/live and /etc/letsencrypt/archive to the Exim group and then making them mode 0750; you must also specifically chgrp and chmod the private key files. This can be automated with a deploy hook script, which will be run when certificates are renewed.
(Documentation for deploy hooks is hidden away in the discussion of renewing certificates.)
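A deploy hook for the private key part might look something like the following sketch. It relies on the `RENEWED_LINEAGE` environment variable that Certbot documents for deploy hooks, and on `privkey.pem` in the live directory being a symlink into the archive directory. The group name `Debian-exim` and the mode 0640 are assumptions for illustration, and the one-time chgrp and chmod of the two directories is not handled here.

```python
import grp
import os

# Assumed group that Exim runs as; adjust for your system.
EXIM_GROUP = "Debian-exim"


def open_key_to_group(lineage_dir, gid, mode=0o640):
    """Give group `gid` read access to the private key behind
    lineage_dir/privkey.pem. Returns the real key file's path."""
    # Follow the live/ symlink to the actual file under archive/.
    key_file = os.path.realpath(os.path.join(lineage_dir, "privkey.pem"))
    os.chown(key_file, -1, gid)  # -1 leaves the owning user unchanged
    os.chmod(key_file, mode)
    return key_file


# Only act when Certbot has invoked us as a deploy hook.
if __name__ == "__main__" and "RENEWED_LINEAGE" in os.environ:
    exim_gid = grp.getgrnam(EXIM_GROUP).gr_gid
    open_key_to_group(os.environ["RENEWED_LINEAGE"], exim_gid)
```

Resolving the symlink first matters because it's the file under `archive` that carries the root-only permissions, and each renewal creates a new one.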
The second issue is that deploy hooks do exactly and only what they're documented to do, which means that deploy hooks do not run the first time you get a certificate. After all, the first time is not a renewal, and Certbot said specifically that deploy hooks run on renewal, not 'any time a certificate is issued'. This means that all of your deployment automation, including changing TLS private key permissions so that your daemons can access the keys, won't happen when you get your initial certificate. You get to do it all by hand.
(You can't easily do it by running your deployment script by hand, because your deployment script is probably counting on various environment variables that Certbot sets.)
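If a hook only depends on the two variables that Certbot documents for deploy hooks, `RENEWED_LINEAGE` and `RENEWED_DOMAINS`, one possible workaround is to fake them when running the hook by hand. The sketch below builds such an environment; the function names and the hook path in the usage note are hypothetical, and a script that counts on anything else from Certbot would need more than this.

```python
import os
import subprocess


def hook_environment(cert_name, domains,
                     live_root="/etc/letsencrypt/live", base_env=None):
    """Return a copy of the environment with the deploy hook variables
    filled in for the named certificate."""
    env = dict(os.environ if base_env is None else base_env)
    # Path to the certificate's directory under live/.
    env["RENEWED_LINEAGE"] = os.path.join(live_root, cert_name)
    # Certbot passes the domains as a space-separated list.
    env["RENEWED_DOMAINS"] = " ".join(domains)
    return env


def run_hook(hook_path, cert_name, domains):
    """Run a deploy hook script by hand with the faked environment."""
    subprocess.run([hook_path],
                   env=hook_environment(cert_name, domains), check=True)
```

Something like `run_hook("/path/to/your-hook", "example.org", ["example.org", "www.example.org"])` would then stand in for the hook run that the initial issuance skips.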
We currently get out of this by doing the chgrp and chmod by hand when we get our initial TLS certificates; this adds an extra manual step to initial host setup and conversions to Certbot, which is annoying. If we had more intricate deployment, I think we would have to force an immediate renewal after the TLS certificate had been issued, and to avoid potentially running into rate limits we might want to make our first TLS certificate be a test certificate. Conveniently, there are already other reasons to do this.