Systemd needs official documentation on best practices

November 6, 2019

Systemd is reasonably well documented on the whole, although there are areas that are less well covered than others (some of them probably deliberately). For example, as far as I know everything you can put in a unit file is covered somewhere in the manpages. However, as was noted in the comments on my entry on how timer units can hide errors, much of this information is split across multiple places (eg, systemd.unit, systemd.service, systemd.exec, systemd.resource-control, and systemd.kill). This split is okay at one level, because the systemd manpages are explicitly reference documentation and the split makes perfect sense there; things that are common to all units are in systemd.unit, things that are common to running programs (wherever from) are in systemd.exec, and so on and so forth. Systemd even gives us an index, in systemd.directives, which is more than some documentation does.
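To make the split concrete, here is a minimal sketch of a single service unit whose settings are each documented in a different manpage. The unit itself is made up for illustration; the program path, user name, and limit values are all hypothetical, and only the directive-to-manpage mapping is the point.

```ini
# example-job.service (hypothetical unit, for illustration only)

[Unit]
# Documented in systemd.unit: settings common to all unit types.
Description=Example job showing where each directive is documented

[Service]
# Documented in systemd.service: settings specific to service units.
Type=oneshot
ExecStart=/usr/local/bin/example-job

# Documented in systemd.exec: the execution environment of the program.
User=exampleuser

# Documented in systemd.resource-control: resource limits.
MemoryMax=256M

# Documented in systemd.kill: how the unit's processes are terminated.
KillMode=mixed
```

Fully understanding even this short unit means consulting five manpages, which is fine for looking things up but tells you nothing about whether these settings are a sensible combination.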

But having reference documentation alone is not enough. Reference documentation tells you what you can do, but it doesn't tell you what you should do (and how you should do it). Systemd is a complex system with many interactions between its various options, and there are many ways to write systemd units that are bad ideas or that hide subtle (or not so subtle) catches and gotchas. We saw one of them yesterday, with using timer units to replace /etc/cron.d jobs. There is nothing in the current systemd documentation that will point out the potential drawbacks of doing this (although there is third-party documentation, if you stumble over it).

This is why I say that systemd needs official documentation on best practices and how to do things. This would (or should) cover what you should and should not do when creating units, what subtle issues you might not think about, what common mistakes people make in systemd units, and what you should consider before replacing traditional things like cron.d jobs with systemd-specific things like timer units. Not having anything on best practices invites people to do things like the Certbot packagers have done, where on systemd systems errors from automatic Certbot renewal attempts mostly vanish instead of actually being clearly communicated to the administrator.

(You cannot expect people to carefully read all of the way through all of the systemd reference documentation and assemble a perfect picture of how their units will operate and what the implications of that are. That is simply too complex for people to keep full track of, and anyway people don't work that way outside of very rare circumstances.)

Written on 06 November 2019.