2017-08-03
Imposing temporary CPU and memory resource limits on a user on Ubuntu 16.04
Suppose, not entirely hypothetically, that you sometimes have users
on your primary login server who accidentally run big CPU-consuming
and memory-eating compute jobs that will adversely impact the
machine. You could kill their process or their entire login session,
but that's both a drastic impact and potentially not a sure cure,
and life gets complicated if they're running something involving
multiple processes. In an ideal world you would probably want to
configure this shared login server so that all users are confined
with reasonable per-user resource limits. Unfortunately systemd
cannot do that today; you need to put limits on the user-${UID}.slice unit that systemd creates for each user, but you can't template this unit to add your own settings.
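As an aside, you can see these per-user slices and the login session scopes inside them with systemd-cgls, or ask about a specific one directly; here 915 stands in for a real UID:

systemd-cgls /user.slice
systemctl status user-915.slice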
Without always-on per-user resource limits, what you'd like to do
is impose per-user resource limits on your runaway user on the fly,
so that they can't use more than, say, half the CPUs or three
quarters of the memory or the like (pick your own reasonable limits).
Systemd can do this, in a way similar to using systemd-run to limit something's RAM consumption, but on Ubuntu 16.04 this requires a little bit more work than you would expect.
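As a reminder, the systemd-run version of this looks something like the following, with ./some-big-job standing in for whatever command you want to confine:

systemd-run --scope -p MemoryLimit=1G ./some-big-job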
The basic approach is to set limits on the user's user-${UID}.slice
slice unit:
systemctl --runtime set-property user-915.slice CPUQuota=200% MemoryLimit=8G
With --runtime, these limits will not persist over the next reboot (although that may be quite some time in the future, depending on how you manage your machines; ours tend to be up for quite a while).
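You can check that the settings actually took by asking systemd for the properties back:

systemctl show user-915.slice -p CPUQuotaPerSecUSec -p MemoryLimit

(CPUQuotaPerSecUSec is how CPUQuota surfaces in 'systemctl show'; a 200% quota comes back as 2s, and MemoryLimit comes back in bytes.)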
In theory this should be all that you need to do. In practice, on Ubuntu 16.04 the first problem is that this will limit new login sessions for the user but not existing ones. Of course existing ones are the ones that you care about right now, because the user is already logged on and already running those CPU-eaters. The problem appears to be that just setting these properties does not turn on CPUAccounting and MemoryAccounting for existing sessions, so nothing is actually enforcing those limits.
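One way to see this is to ask about the accounting switches directly:

systemctl show user-915.slice -p CPUAccounting -p MemoryAccounting

If they report 'no', then per the behavior above nothing is enforcing your limits for existing sessions.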
The obvious fix here is to explicitly turn these on for the user-${UID}.slice unit we already manipulated. Sadly this has no effect. Instead the magic fix appears to be to find one of the user's scopes (use 'systemctl status <PID>' for one of the CPU-eating processes) and then turn accounting on for that scope:
systemctl --runtime set-property session-c178012.scope CPUAccounting=true MemoryAccounting=true
In my testing, the moment that I turned these on for any current scope, all of the user's current login sessions were affected. If I sort of understand what systemd is doing with cgroups, this is probably because setting these on a single scope causes (or forces) systemd to ripple them up to parent units. Taken from the systemd.resource-control manpage:
Note that turning on CPU accounting for one unit will also implicitly turn it on for all units contained in the same slice and for all its parent slices and the units contained therein.
It's possible that this will turn on global per-user fair share scheduling all by itself. This is probably not such a bad thing on the kind of shared login server where we'd want to do this.
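Once accounting is on, systemd-cgtop gives you a live view of per-cgroup CPU and memory usage, so you can watch whether your limits are actually doing anything:

systemd-cgtop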
If you think you're going to need to add these on-the-fly limits, an obvious thing to do is to pre-enable CPU and memory accounting, so that all user slices and login scopes will be created ready for you to add limits. The basic idea works, but several ways to achieve it do not, despite looking like they should. What appears to be the requirement in Ubuntu 16.04 is that you force systemd to adjust its current in-memory configuration. The most straightforward way is this:
systemctl --runtime set-property user.slice CPUAccounting=true MemoryAccounting=true
Doing this works, but it definitely has the side effect that it turns on per-user fair share CPU scheduling. Hopefully this is a feature for you (it probably is for us).
The following two methods don't work, or at least they don't persist over reboots (they may initially appear to work because they're also causing systemd to adjust its current in-memory configuration):

- Enabling DefaultCPUAccounting and DefaultMemoryAccounting in user.conf via a file in /etc/systemd/user.conf.d, contrary to how I thought you'd set up per-user fair share scheduling; a sketch of such a file follows this list. There is no obvious reason why this shouldn't work and it's even documented as working, it just doesn't in the Ubuntu 16.04 version of systemd (nominally version 229). If you do 'systemctl daemon-reload' they may initially appear to work, but if you reboot they will quietly do nothing.

- Permanently enabling CPUAccounting and MemoryAccounting on user.slice with, for example, 'systemctl set-property user.slice CPUAccounting=true MemoryAccounting=true'. This will create some files in /etc/systemd/system/user.slice.d, but much like the user.conf change, they will do nothing after a reboot.
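For the record, the drop-in file for the first method is just a couple of lines (the filename is arbitrary):

/etc/systemd/user.conf.d/accounting.conf:
[Manager]
DefaultCPUAccounting=yes
DefaultMemoryAccounting=yes

The set-property approach in the second method writes similar snippets under /etc/systemd/system/user.slice.d, as far as I've seen as 50-CPUAccounting.conf and so on, each with a [Slice] section instead of [Manager].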
I can only assume that this is a systemd bug, but I don't expect it to ever be fixed in Ubuntu 16.04's version and I have no idea if it's fixed in upstream systemd (and I have little capability to report a bug, given the version number issue covered here).
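If you still want the effect of pre-enabled accounting to survive reboots, the brute force workaround that follows from all of this is to just re-apply the runtime setting at every boot, for instance with a little oneshot unit (a sketch; I haven't run this in production and the unit name is made up):

/etc/systemd/system/force-user-accounting.service:
[Unit]
Description=Force per-user CPU and memory accounting on

[Service]
Type=oneshot
ExecStart=/bin/systemctl --runtime set-property user.slice CPUAccounting=true MemoryAccounting=true

[Install]
WantedBy=multi-user.target

You'd then 'systemctl enable force-user-accounting.service'; a cron @reboot entry would do just as well.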
(There is presumably some sign in /sys/fs/cgroup/*
to show whether
per-user fair share CPU scheduling is on or off, but I have no idea
what it might be. Alternately, if the presence of user-${UID}.slice
directories in /sys/fs/cgroup/cpu,cpuacct/user.slice
means that
per-user fair share scheduling is on, it's somehow wound up being
turned on on quite a few of our machines.)
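If you want to poke around yourself, the cgroup hierarchy is at least inspectable from the shell; for example, with 915 again standing in for a real UID:

ls -d /sys/fs/cgroup/cpu,cpuacct/user.slice/user-*.slice
cat /sys/fs/cgroup/cpu,cpuacct/user.slice/user-915.slice/cpu.shares

cpu.shares is the relative CPU weight the kernel uses between sibling cgroups for fair share scheduling; the default is 1024.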
In general I've wound up feeling that this area of systemd is badly underdocumented. All of the manpage stuff appears to be written for people who already understand everything that systemd is doing internally to manage resource limits, or at least for people who understand much more of how systemd operates here than I do.