Sorting out my systemd mistake with a script-based service unit
Back in November I wrote about a systemd mistake I made with a script-based service unit, where I left out some service options and got a surprise when my service didn't work. A commentator recently made me realize that I didn't really understand what was going on and what had happened; instead I was working by superstition. So I've now done some experiments and read the systemd.service manpage again, and here's what I know.
The basic situation was that I wrote a .service file that had
just this, where ExecStart and ExecStop are scripts that just
run briefly and then exit:
[Service]
WorkingDirectory=/var/local/wireguard
ExecStart=/var/local/wireguard/startup
ExecStop=/var/local/wireguard/stop
Environment=LANG=C
(In this situation, systemd's defaults are that there is an implicit
Type=simple and the default RemainAfterExit=no.)
If you don't have
RemainAfterExit and your
ExecStart exits with
status 0, your service becomes inactive (as opposed to failing to
start). If you have an
ExecStop, systemd will then run it, even
though you haven't explicitly asked for a 'stop' operation; in my
situation this mysteriously reversed the effects of my start script.
That this happens is unfortunately not clearly documented anywhere
that I could see, although it makes a certain amount of sense if
you consider ExecStop to be for cleanup actions, where you often
want the cleanup actions to happen if the service started successfully
and then stops, regardless of just why the service stopped.
(Looking through the stock Fedora 27 systemd .service files,
quite a lot of the ExecStop actions appear to be this sort of
cleanup, not 'signal the service to shut down' actions.)
It's easy to see this with a test service that just runs some
scripts. You'll get output from 'systemctl status yourtest.service'
that looks like this:
Active: inactive (dead) since Mon 2018-04-02 15:37:19 EDT; 41s ago
Process: 973 ExecStop=/root/stop-script (code=exited, status=0/SUCCESS)
Process: 964 ExecStart=/root/start-script (code=exited, status=0/SUCCESS)
Main PID: 964 (code=exited, status=0/SUCCESS)
Here the ExecStart script ran, was considered the main PID, exited
with status 0, and then shortly afterward the ExecStop script was
run (since it has a PID only a bit higher, and the start script ran
a couple of commands).
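For anyone who wants to reproduce this, a minimal test unit along these lines should be enough. This is only a sketch: the script paths are taken from the status output above, while the unit file location and the idea that both scripts are short executable scripts that exit with status 0 are assumptions.

```ini
# /etc/systemd/system/yourtest.service (assumed location)
[Service]
# No Type= and no RemainAfterExit= here, so the implicit defaults
# (Type=simple, RemainAfterExit=no) apply.
ExecStart=/root/start-script
ExecStop=/root/stop-script
```

After a 'systemctl daemon-reload' and a 'systemctl start yourtest.service', 'systemctl status yourtest.service' should report the unit as inactive, with both scripts having been run.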
Contrary to what I thought in my first entry, the choice of
Type=simple versus Type=oneshot doesn't affect
this as such. As my commentator noted, what
Type=simple really affects is when other units will get started.
If you have test.service with the implicit Type=simple and
another service says 'After=test.service', your other service
will get started the moment that systemd has started running
test.service's ExecStart. This is often not what you want;
instead you want things that depend on test.service to only start
after its ExecStart has finished preparing things and exited.
This is what Type=oneshot enforces, by making it so that test.service
is only considered 'started' when your ExecStart program or script
exits. Systemd does more or less document this, at the end of
its description of Type=:
If set to simple [...], as systemd will immediately proceed starting follow-up units.
oneshot is similar to simple; however, it is expected that the process has to exit before systemd starts follow-up units. [...]
(This is not particularly clearly written, unfortunately. Energetic people can propose a documentation patch in the master repo.)
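As an illustration of the ordering difference, here is a sketch of a pair of units, shown together but meant as two separate files. The name other.service and the /usr/local/sbin/other-daemon path are hypothetical; test.service reuses the start script from the status output earlier.

```ini
# test.service
[Service]
# With Type=oneshot, the unit only counts as started once
# ExecStart has exited. With the implicit Type=simple it would
# count as started as soon as systemd had run the script.
Type=oneshot
ExecStart=/root/start-script

# other.service (hypothetical)
[Unit]
# Ordering only: start this after test.service when both
# units are being started together.
After=test.service

[Service]
ExecStart=/usr/local/sbin/other-daemon
```

With the Type=oneshot line removed from test.service, other-daemon could be started while start-script is still running.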
As the documentation notes,
Type=oneshot probably mostly requires
that you use
RemainAfterExit=yes, because otherwise the service
won't be considered to be active. Certainly things will be much
less confusing if you use it, because then all of the units involved
will stay 'active' and you won't ever have the experience of wondering
why something is up and running despite a dependency having failed.
(After= doesn't actually create a dependency, of course, just an
ordering. But that's another entry.)
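Putting this together, a version of the unit from the start of this entry with both settings spelled out might look like the following. This is a sketch rather than a definitive fix: the paths are the ones from the original unit, and the two added lines are simply the options discussed here.

```ini
[Service]
# Only consider the unit started once the start script has exited.
Type=oneshot
# Keep the unit 'active' after the start script exits, so that
# ExecStop is not run until the unit is actually stopped.
RemainAfterExit=yes
WorkingDirectory=/var/local/wireguard
ExecStart=/var/local/wireguard/startup
ExecStop=/var/local/wireguard/stop
Environment=LANG=C
```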