Even systemd services and dependencies are not self-documenting

October 10, 2018

I tweeted:

I'm sure that past-me had a good reason for configuring my Wireguard tunnel to only start during boot after the VMWare modules had been loaded. I just wish he'd written it down for present-me.

Systemd units are really easy to write, straightforward to read, and quite easy to hack on and modify. But, just like everything else in system administration, they aren't really self documenting. Systemd units will generally tell you clearly what they're doing, but they won't (and can't) tell you why you set them up that way, and one of the places where this can be very acute is in what their dependencies are. Sometimes those dependencies are entirely obvious, and sometimes they are sort of obvious and also sort of obviously superstitious. But sometimes, as in this case, they are outright mysterious, and then your future self (if no one else) is going to have a problem.

(Systemd dependencies are often superstitious because systemd still generally still lacks clear documentation for standard dependencies and 'depend on this if you want to be started only when <X> is ready'. Admittedly, some of this is because the systemd people disagree with everyone else about how to handle certain sorts of issues, like services that want to activate only when networking is nicely set up and the machine has all its configured static IP addresses or has acquired its IP address via DHCP.)

Dependencies are also dangerous for this because it is so easy to add another one. If you're in a hurry and you're slapping dependencies on in an attempt to get something to work right, this means that adding a comment to explain yourself adds proportionally much more work than it would if you already had to do a fair bit of work to add the dependency itself. Since it's so much extra work, it's that much more tempting to not write a comment explaining it, especially if you're in a hurry or can talk yourself into believing that it's obvious (or both). I'm going to have to be on the watch for this, and in general I should take more care to document my systemd dependency additions and other modifications in the future.

(This is one of the thing that version controlled configuration files are good for. Sooner or later you'll have to write a commit message for your change, and when you do hopefully you'll get pushed to explain it.)

As for this particular case, I believe that what happened is that I added the VMWare dependency back when I was having mysteries Wireguard issues on boot because, it eventually turned out, I had forgotten to set some important .service options. When I was working on the issue, one of my theories was that Wireguard was setting up its networking, then VMWare's own networking stuff was starting up and taking Wireguard's interface down because the VMWare code didn't recognize this 'wireguard' type interface. So I set a dependency so that Wireguard would start after VMWare, then when I found the real problem I never went back to remove the spurious dependency.

(I uncovered this issue today as part of trying to make my machine boot faster, which is partially achieved now.)

Written on 10 October 2018.
« Something systemd is missing for diagnosing oddly slow boots
Some notes on Prometheus's Blackbox exporter »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Oct 10 01:37:27 2018
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.