Systemd services that always restart should probably set a restart delay too

July 9, 2019

Ubuntu 18.04's package of the Prometheus host agent comes with a systemd .service unit that is set with 'Restart=always' (something that comes from the Debian package). This is a perfectly sensible setting for the host agent of a metrics and monitoring system, because if you have it set to run at all, you almost always want it to be running all the time if at all possible. When we set up a local version of the host agent, I started with the Ubuntu .service file and kept this setting.

In practice, pretty much the only reason the Prometheus host agent aborts and exits on our machines is that the machine has run out of memory and everything is failing. When this happens with 'Restart=always' and the default systemd settings, systemd will wait its default of 100 milliseconds (the normal DefaultRestartSec value) and then try to restart the host agent. Since the out of memory condition has probably not gone away in 100 ms, this restart is almost certain to fail. Systemd will repeat this until it hits the default start limit of five starts in ten seconds, and then, well, let me quote the documentation:

[...] Note that units which are configured for Restart= and which reach the start limit are not attempted to be restarted anymore; [...]

With the default restart interval, this takes approximately half a second. Our systems do not clear up out of memory situations in half a second, and so the net result was that when machines ran out of memory sufficiently badly that the host agent died, it was dead until we restarted it manually.
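To make the failure mode concrete, here is a sketch of a unit with the relevant defaults written out explicitly. The unit description and the ExecStart path are hypothetical stand-ins, not the contents of the Ubuntu package; the three numeric values are the stock systemd defaults described above.

```ini
[Unit]
Description=Prometheus host agent (illustrative sketch)
# Stock defaults, written out: at most 5 starts in any 10 second window.
StartLimitIntervalSec=10s
StartLimitBurst=5

[Service]
# Hypothetical path, for illustration only.
ExecStart=/usr/local/bin/host-agent
Restart=always
# Stock default restart delay (DefaultRestartSec).
RestartSec=100ms
```

With these values, five failed starts spaced about 100 ms apart fit comfortably inside the ten second window, so the start limit is reached in roughly half a second and systemd stops trying.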

(I can't blame systemd for this, because it's doing exactly what we told it to do. It is just that what we told it to do isn't the right thing under the circumstances.)

The ideal thing to do would be to try restarting once or twice very rapidly, just in case the host agent died due to an internal error, and then to back off to much slower restarts, say once every 30 to 60 seconds, as we wait out the out of memory situation that is the most likely cause of problems. Unfortunately systemd only offers a single restart delay, so the necessary setting is the slower one; in the unlikely event that we trigger an internal error, we'll accept that the host agent has a delay before it comes back. As a result we've now revised our .service file to have 'RestartSec=50s' as well as 'Restart=always'.
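As a minimal sketch, the relevant part of such a revised .service file might look like the following. Only the 'Restart=' and 'RestartSec=' lines come from the text above; the description and the ExecStart path are hypothetical.

```ini
[Unit]
Description=Prometheus host agent (illustrative sketch)

[Service]
# Hypothetical path, for illustration only.
ExecStart=/usr/local/bin/host-agent
Restart=always
# Wait long enough for an out of memory situation to have a chance to clear.
RestartSec=50s
```

If you are using the packaged unit instead of a local copy, the same 'RestartSec=' line could go in a drop-in override (for example one created with 'systemctl edit') that contains only the [Service] section.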

(We don't need to disable StartLimitBurst's rate limiting, because systemd will never try to restart the host agent more than once in any ten second period.)

There are probably situations where the dominant reason for a service failing and needing to be restarted is an internal error, in which case an almost immediate restart minimizes downtime and is the right thing to do. But if that's not the case, then you definitely want to have enough of a delay to let the overall situation change. Otherwise, you might as well not set a 'Restart=' at all, because it's probably not going to work and will just run you into the (re)start limit.
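On much newer versions of systemd than the one in Ubuntu 18.04, this trade-off can be softened. As far as I know, systemd 254 and later have 'RestartSteps=' and 'RestartMaxDelaySec=', which grow the restart delay from 'RestartSec=' up to a maximum over a number of restarts. The following is a sketch under that assumption; the specific values and the ExecStart path are illustrative, not something we run.

```ini
[Unit]
Description=Prometheus host agent (illustrative sketch)
# Assumption: turn off the start rate limit so that the early, closely
# spaced restarts cannot trip it.
StartLimitIntervalSec=0

[Service]
# Hypothetical path, for illustration only.
ExecStart=/usr/local/bin/host-agent
Restart=always
# First restart comes quickly, in case of an internal error.
RestartSec=1s
# Requires systemd 254 or later: increase the delay over 5 steps,
# up to a maximum of 60 seconds between restarts.
RestartSteps=5
RestartMaxDelaySec=60s
```

On older systemd versions these two settings are not available and the single 'RestartSec=' delay is all you get.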

My personal feeling is that most of the time, your services are not going to be falling over because of their own bugs, and as a result you should almost always set a RestartSec delay and consider what sort of (extended) restart limit you want to set, if any.
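As an illustration of what an extended restart limit could look like alongside a long delay, here is a sketch. The numbers and the ExecStart path are assumptions picked for the example, not recommendations.

```ini
[Unit]
Description=Some long-running service (illustrative sketch)
# Allow up to 30 starts in any one hour window. With a 50 second restart
# delay, continuous failure uses these up in roughly 25 minutes, after
# which systemd stops restarting the service.
StartLimitIntervalSec=1h
StartLimitBurst=30
# Alternative: disable the start limit entirely and keep retrying forever.
#StartLimitIntervalSec=0

[Service]
# Hypothetical path, for illustration only.
ExecStart=/usr/local/bin/some-daemon
Restart=always
RestartSec=50s
```

Which of the two you want depends on whether you would rather have a persistently failing service eventually show up as failed or have it keep trying for as long as the machine is up.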

Sidebar: The other hazard of always restarting with a low delay

The other big reason for a service to fail to start is if you have an error in a configuration file or the command line (eg a bad argument or option). In this case, restarting in general does you no good (since the situation will only be cleared up with manual attention and changes), and immediately restarting will flood the system with futile restart attempts until systemd hits the rate limits and shuts things off.

It would be handy to be able to tell systemd that it should not restart the service if it immediately fails during a 'systemctl start', or at least to tell it that the failure of an ExecStartPre program should not trigger the restarting, only a failure of the main ExecStart program (since ExecStartPre is sometimes used to check configuration files and so on). Possibly systemd already behaves this way, but if so it's not documented.
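One partial approach that systemd does document is 'RestartPreventExitStatus=', which lists exit statuses that should not trigger an automatic restart. The sketch below assumes a daemon that exits with a distinctive status on configuration errors; the status value 78 and the ExecStart path are hypothetical. The setting is documented in terms of the main service process, so it may not help with a failing ExecStartPre check.

```ini
[Service]
# Hypothetical path, for illustration only.
ExecStart=/usr/local/bin/some-daemon
Restart=always
RestartSec=50s
# Assumption: this daemon exits with status 78 when its configuration
# is bad. Restarting will not fix that, so do not try.
RestartPreventExitStatus=78
```

This only works if the program reliably distinguishes configuration errors from other failures through its exit status, which many programs do not.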

Comments on this page:

By dozzie at 2019-07-10 05:09:18:

When I was writing a daemon supervisor (for my toolkit for building monitoring systems), I called this a "restart strategy" and it was a sequence of delays, e.g. 0s, 5s, 15s, 30s, 60s, with the last one being repeated indefinitely. There's a little more there in the restarts code, like when to go back to the first delay, all of which is heuristics, but it's for the exact same reason as you described here: immediate (or near immediate, 100ms) restarts will flood the logs and don't help for the usual problems when a daemon crashes.

Backoff for service restarts (exponential or otherwise) isn’t exactly a new insight… in fact good ol’ SysV init has it, if memory serves? So does systemd just not have it, at all…?

Yeah we do exponential backoff in our service startup by simply running a script right before the command to exec. We did it with upstart but I suspect doing it with systemd wouldn't be too hard:

Last modified: Tue Jul 9 23:45:39 2019