2019-07-09
Systemd services that always restart should probably set a restart delay too
Ubuntu 18.04's package of the Prometheus host agent comes with a systemd
.service unit that is set with 'Restart=always'
(something that comes from the Debian package).
This is a perfectly sensible setting for the host agent for a metrics
and monitoring system, because if you have it set to run at all,
you almost always want it to be running all the time if at all
possible. When we set up a local version of the host agent, I started
with the Ubuntu .service file and kept this setting.
In practice, pretty much the only reason the Prometheus host agent
aborts and exits on our machines is that the machine has run out
of memory and everything is failing. When this happens with 'Restart=always'
and the default systemd settings, systemd will wait its default of
100 milliseconds (the normal DefaultRestartSec value) and then try
to restart the host agent again. Since the out of memory condition
has probably not gone away in 100 ms, this restart is almost certain
to fail. Systemd will repeat this until the restart has failed five
times in ten seconds, and then, well, let me quote the documentation:
[...] Note that units which are configured for Restart= and which reach the start limit are not attempted to be restarted anymore; [...]
With the default restart interval, this takes approximately half a second. Our systems do not clear up out of memory situations in half a second, and so the net result was that when machines ran out of memory sufficiently badly that the host agent died, it was dead until we restarted it manually.
(I can't blame systemd for this, because it's doing exactly what we told it to do. It is just that what we told it to do isn't the right thing under the circumstances.)
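Spelled out explicitly, the settings in effect in this situation look roughly like the following unit fragment. This is a sketch that writes out systemd's documented defaults as described above; only 'Restart=always' is actually set in the packaged unit, and the other lines are normally implicit.

```ini
[Unit]
# Default start rate limit: at most 5 starts within any 10 second window.
StartLimitIntervalSec=10s
StartLimitBurst=5

[Service]
Restart=always
# Default delay before each restart attempt (DefaultRestartSec).
RestartSec=100ms
```

With a 100 ms delay, five failed attempts fit comfortably inside the ten second window, which is why the limit is reached in about half a second.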
The ideal thing to do would be to try restarting once or twice very
rapidly, just in case the host agent died due to an internal error,
and then to back off to much slower restarts, say once every 30 to
60 seconds, as we wait out the out of memory situation that is the
most likely cause of problems. Unfortunately systemd only offers a
single restart delay, so the necessary setting is the slower one;
in the unlikely event that we trigger an internal error, we'll
accept that the host agent has a delay before it comes back. As a
result we've now revised our .service file to have 'RestartSec=50s'
as well as 'Restart=always'.
(We don't need to disable StartLimitBurst's rate limiting, because systemd will never try to restart the host agent more than once in any ten second period.)
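As a minimal sketch, the relevant part of such a .service file would look like this. The rest of the unit (ExecStart and so on) is omitted, and the comments are illustrative additions rather than lines from our actual file.

```ini
[Service]
# Always restart the host agent when it exits, for whatever reason.
Restart=always
# Wait long enough between attempts for an out of memory situation
# to have a chance of clearing up.
RestartSec=50s
```

If you would rather not edit a packaged unit directly, the same two lines can go in a drop-in file under the unit's '.d' directory, followed by a 'systemctl daemon-reload'.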
There are probably situations where the dominant reason for a service
failing and needing to be restarted is an internal error, in which
case an almost immediate restart minimizes downtime and is the right
thing to do. But if that's not the case, then you definitely want
to have enough of a delay to let the overall situation change.
Otherwise, you might as well not set a 'Restart=' at all, because
it's probably not going to work and will just run you into the
(re)start limit.
My personal feeling is that most of the time, your services are not
going to be falling over because of their own bugs, and as a result
you should almost always set a RestartSec delay and consider what
sort of (extended) restart limit you want to set, if any.
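One possible shape for an extended restart limit is sketched below. The specific numbers are assumptions picked for illustration, not settings we actually use; the point is only that the interval has to be long relative to RestartSec for the limit to ever trigger.

```ini
[Unit]
# Illustrative values: give up after 20 starts within half an hour.
# With a 50 second restart delay that is roughly 17 minutes of
# continuous failures.
StartLimitIntervalSec=30min
StartLimitBurst=20

[Service]
Restart=always
RestartSec=50s
```

If you never want systemd to give up, setting 'StartLimitIntervalSec=0' disables the start rate limiting entirely.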
Sidebar: The other hazard of always restarting with a low delay
The other big reason for a service to fail to start is if you have an error in a configuration file or the command line (eg a bad argument or option). In this case, restarting in general does you no good (since the situation will only be cleared up with manual attention and changes), and immediately restarting will flood the system with futile restart attempts until systemd hits the rate limits and shuts things off.
It would be handy to be able to tell systemd that it should not restart
the service if it immediately fails during a 'systemctl start', or at
least to tell it that the failure of an ExecStartPre program should
not trigger the restarting, only a failure of the main ExecStart
program (since ExecStartPre is sometimes used to check configuration
files and so on). Possibly systemd already behaves this way, but if so
files and so on). Possibly systemd already behaves this way, but if so
it's not documented.
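A partial workaround exists if the daemon exits with a distinctive status on configuration or command line errors: systemd's RestartPreventExitStatus= setting suppresses automatic restarts for the listed exit statuses of the main service process, regardless of the Restart= setting. The sketch below assumes a hypothetical daemon that exits with status 2 on such errors; whether any particular program (including the Prometheus host agent) does this is something you would have to check.

```ini
[Service]
Restart=always
RestartSec=50s
# Hypothetical: assumes the daemon exits with status 2 on a bad
# configuration file or bad command line arguments. That exit status
# will then not trigger an automatic restart.
RestartPreventExitStatus=2
```

This only looks at the exit status of the main process, so it does not by itself answer the question of how ExecStartPre failures are treated.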