The systemd dependency problem

March 10, 2013

When I wrote about what systemd got right I also mentioned in passing that it wasn't without flaws. It's time (and really past time) to start doing some elaboration on that, and I'm going to start with systemd's problem with documenting dependencies.

Systemd is fundamentally a dependency-based init mechanism; it starts things and orders startup based on what service needs what other service (among other things this determines what can start in parallel). All of this is well and good but it means that in order to write good systemd units you need documentation on how startup is structured, which is to say the standard dependencies involved. Your unit almost certainly can't run at arbitrary times, so in order to make it run at the right time (neither too early nor unnecessarily late) you need to know what to make it depend on. You may also need to know how to tell systemd that your unit provides a particular generic service like, for example, DNS lookups.
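
(To make the ordering idea concrete, here is a minimal Python sketch. It is not how systemd itself is implemented; it only shows how "start after" relationships decide what can start in parallel. All unit names and dependency sets are made up for illustration.)

```python
def start_waves(after):
    """Group units into waves; everything in one wave could start in parallel.

    `after` maps a unit name to the set of units it must start after.
    This is a plain level-by-level topological sort, not systemd's real logic.
    """
    units = set(after)
    for deps in after.values():
        units |= set(deps)
    remaining = {u: set(after.get(u, ())) for u in units}
    waves = []
    while remaining:
        # A unit is ready once everything it waits for has already started.
        ready = sorted(u for u, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("ordering cycle among: %s" % sorted(remaining))
        waves.append(ready)
        for u in ready:
            del remaining[u]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# Hypothetical units with complete ordering information.
AFTER = {
    "network.service": set(),
    "syslog.service": set(),
    "unbound.service": {"network.service"},
    "nss-lookup.target": {"unbound.service"},
    "myapp.service": {"network.service", "nss-lookup.target"},
}

# The same units, but myapp.service forgot to say it needs DNS lookups.
AFTER_MISSING = dict(AFTER)
AFTER_MISSING["myapp.service"] = {"network.service"}

WAVES = start_waves(AFTER)
WAVES_MISSING = start_waves(AFTER_MISSING)

if __name__ == "__main__":
    for label, waves in (("complete", WAVES), ("missing", WAVES_MISSING)):
        print(label)
        for n, wave in enumerate(waves):
            print("  wave %d: %s" % (n, ", ".join(wave)))
```

With the complete information, myapp.service lands in the last wave, after the DNS target. With the dependency left out, it lands in the same wave as unbound.service, so whether DNS is actually ready when it starts is a matter of luck.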

I'm going to be blunt: systemd in Fedora is falling down on this today. Part of this is systemd's fault and part of this is Fedora's fault (since at least part of how systemd units are ordered is up to the distribution and thus up to the distribution to document). Today it is fairly hard for a sysadmin who wants to write a good unit that works right to find out what they are supposed to depend on under what circumstances (and I speak from experience). Often it isn't obvious that you're missing something and that as a result your unit is only working through coincidence on your particular system (and may stop working right if something changes how systemd orders things).

(This problem is in fact so difficult that some Fedora-supplied packages get it wrong. For example, Fedora's package for unbound does not declare that it provides DNS lookups, which means that even if other units properly say that they want DNS lookups, they may well start before unbound does because systemd doesn't know any better.)

Systemd provides some documentation for this in the systemd.special(7) manpage but it's relatively sketchy and incomplete (from my perspective); it lists the special units but doesn't provide much guidance on how you want to structure your units and what you want to depend on in practice. Also, the section on depending on the network is actively enraging for many server administrators. It's very nice for systemd to tell developers that they should make their programs be nice and flexible in the face of networks appearing and disappearing, but networks generally don't appear and disappear on servers, and system administrators have to deal with many, many programs that have not been updated to behave the way systemd wants them to. Worse, sometimes there is simply no rational way to do this sort of update.

(A related lack is that the systemd documentation does not clearly spell out how to tell it that your unit implements a particular target. Apparently the way to do this is to specify both Before=something.target and Wants=something.target; this is charmingly indirect, to put it one way.)
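
(As a rough illustration of that pattern, here is a small Python sketch that checks whether a unit file's text both orders itself before a target and pulls it in. The unit texts are hypothetical, not Fedora's actual files, and I'm assuming that nss-lookup.target from systemd.special(7) is the right target for DNS lookups. The parser is deliberately simplistic; it only reads the [Unit] section and ignores drop-in files and line continuations.)

```python
def parse_unit_section(text):
    """Collect [Unit] settings as {key: [values]}, splitting space-separated lists."""
    settings = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Unit" and "=" in line:
            key, _, value = line.partition("=")
            # Repeated keys accumulate, so collect them all in one list.
            settings.setdefault(key.strip(), []).extend(value.split())
    return settings


def provides_target(text, target):
    """True if the unit has both Before=target and Wants=target."""
    unit = parse_unit_section(text)
    return target in unit.get("Before", []) and target in unit.get("Wants", [])


def waits_for_target(text, target):
    """True if the unit orders itself After=target."""
    return target in parse_unit_section(text).get("After", [])


# Hypothetical resolver unit that announces it provides DNS lookups.
RESOLVER_UNIT = """\
[Unit]
Description=Example caching resolver
After=network.target
Before=nss-lookup.target
Wants=nss-lookup.target

[Service]
ExecStart=/usr/sbin/example-resolver
"""

# The same unit with the two lines left out.
INCOMPLETE_UNIT = """\
[Unit]
Description=Example caching resolver
After=network.target

[Service]
ExecStart=/usr/sbin/example-resolver
"""

# Hypothetical consumer that wants DNS to be ready before it starts.
CLIENT_UNIT = """\
[Unit]
Description=Example service that needs DNS
After=network.target nss-lookup.target
"""

if __name__ == "__main__":
    for name, text in (("resolver", RESOLVER_UNIT), ("incomplete", INCOMPLETE_UNIT)):
        print(name, "provides nss-lookup.target:",
              provides_target(text, "nss-lookup.target"))
```

The consumer's After= line only does any good if some resolver unit has the matching Before= and Wants= lines; without them the target carries no ordering information.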

Related to this lack of documentation is a lack of tools for determining service dependency relations; systemd provides neither a 'what requires this' nor a 'what is required by this' query operation. Both are important in practice, especially if you're trying to audit your system to ensure that its behavior is predictable and correct (i.e., that you have the dependencies right so that everything deterministically starts when it should). Note that looking at the actual boot order is not sufficient for this because you don't know if the boot order is a product of actual dependencies or just how systemd decided to do things this time around.
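
(The two queries themselves are simple once you have the raw data. A minimal Python sketch, assuming you have already pulled each unit's declared requirements out of its unit file by some other means; the unit names and the REQUIRES table here are invented for illustration.)

```python
def invert(requires):
    """Turn a 'unit -> what it requires' map into 'unit -> what requires it'."""
    required_by = {}
    for unit, deps in requires.items():
        required_by.setdefault(unit, set())
        for dep in deps:
            required_by.setdefault(dep, set()).add(unit)
    return required_by


def closure(graph, start):
    """Everything reachable from `start`, following edges transitively."""
    seen = set()
    stack = [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical declared requirements, one entry per unit.
REQUIRES = {
    "myapp.service": {"database.service", "nss-lookup.target"},
    "database.service": {"network.service"},
    "unbound.service": {"network.service"},
}
REQUIRED_BY = invert(REQUIRES)

if __name__ == "__main__":
    # 'What is required by this': everything myapp.service needs, directly or not.
    print(sorted(closure(REQUIRES, "myapp.service")))
    # 'What requires this': everything affected if network.service is missing.
    print(sorted(closure(REQUIRED_BY, "network.service")))
```

An audit would compare these declared relationships against what each service actually needs, which is exactly the information that the observed boot order cannot give you.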

(This collection of issues bit me in my recent upgrade to Fedora 18. Units that had been starting perfectly fine in Fedora 17 suddenly started not working; it turned out that they were missing dependencies and had only been working in Fedora 17 by coincidence. Trying to properly depend on DNS lookups being ready to go led me to discover the issue with unbound's own ordering.)

Related to this is the issue of missing dependencies. Systemd's selection of standard dependency targets and things that implement them is relatively sparse. Systemd provides neither tools nor good documentation for adding more, including targets that server administrators would like in practice. This is especially striking in the case of networking targets; if systemd is going to throw us to the wolves (as it does today), I would like it to provide some tools to at least help us implement our own meaningful targets (and yes, we're going to need them in practice). Even more useful would be standard fine-grained targets that systemd automatically notices and advertises (for example, 'IP address X has been assigned to an interface').

(I suspect that the best way to do this would be for systemd to support dependencies on DBus information, since I believe that such information is already broadcast across DBus for interested parties.)

Written on 10 March 2013.