Systemd timer units don't have much appeal for us (over crontab entries)

November 7, 2021

I recently wrote about when our crontab entries run, and in a comment Joseph asked if we'd considered switching to systemd timer units instead. The short answer is no; the longer answer is that unlike Joseph, we don't think we'd find timer units to be a significant improvement over the old, well understood crontab approach.

So let's start with the good things about systemd timer units as compared to /etc/cron.d, which as I see it are two-fold. First, systemd timer units provide an easy way to run the actual command from your 'crontab' entry on demand; since a timer requires an accompanying service unit to activate, you can just 'systemctl start <unit>' and be done. This beats looking at your /etc/cron.d entry to pull out the command, and as a bonus it automatically gets the user right if you're running the command as a non-root user. Second, systemd timer units provide more flexible timing for periodic activities than crontab entries do (and have built-in randomization of start times, which is important across a fleet). Crontab schedules need to divide relatively evenly into time units, and they don't go to finer grained units than minutes. If you want something activated every 90 seconds or every 13 minutes (for two examples), using cron is going to be painful.
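As a sketch of what that flexibility looks like, here is a hypothetical timer and service pair for an every-90-seconds job with randomized start times (all unit and script names here are invented for illustration):

```ini
# /etc/systemd/system/example-sync.timer (hypothetical)
[Unit]
Description=Run example-sync every 90 seconds

[Timer]
# Fire 90 seconds after the service last started; cron cannot
# express sub-minute (or every-13-minutes) intervals like this.
OnBootSec=90s
OnUnitActiveSec=90s
# Spread actual start times across a fleet by up to 30 seconds.
RandomizedDelaySec=30s

[Install]
WantedBy=timers.target

# /etc/systemd/system/example-sync.service (the accompanying unit)
[Unit]
Description=Example sync job

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/example-sync
```

With this in place, 'systemctl start example-sync.service' runs the job by hand, which is the on-demand convenience mentioned above.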

The great advantage of crontab entries in a modern environment that supports /etc/cron.d is that they're much simpler to deploy and use. With crontab, you need one simple file that you put into a single directory and then everything is deployed and active, and you will automatically get mailed any (error) output so you know about problems. With systemd timer units, you need at least two more complex files (a .timer and a .service file), you need to run an additional command on deployment (to enable the timer), and getting email when the timer's service fails is more complicated. In addition, both of the systemd files have to go into an increasingly cluttered directory that's used for all sorts of assorted purposes (i.e., /etc/systemd/system), while crontab entries go into a directory that's specifically for this purpose (/etc/cron.d) and so is easier to keep track of.

A single crontab file can also have multiple related activities in it, such as 'everything we do frequently on our ZFS fileservers'. With systemd timers, each timer unit can only control a single command, because it triggers exactly one other systemd unit. This can increase the number of files and deployment steps you need, and also opens the door to mistakes like a partially enabled set of timers (or, on the flipside, a partially disabled set of them; with cron.d, you can 'disable' everything in a group by removing the file).
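As a concrete sketch of such a grouped file, here is what a hypothetical /etc/cron.d file for several related ZFS jobs might look like (script names and the MAILTO address are invented; any output from the commands gets mailed automatically):

```crontab
# /etc/cron.d/zfs-frequent -- hypothetical grouped entries
MAILTO=sysadmins@example.org

# min  hour dom mon dow  user  command
*/5  * * * *  root  /usr/local/sbin/zfs-check-pools
2,32 * * * *  root  /usr/local/sbin/zfs-snapshot-frequent
15   3 * * *  root  /usr/local/sbin/zfs-report-usage
```

Removing (or restoring) this single file disables or enables the whole group in one step.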

For us, the systemd timer advantages aren't important. We rarely want to run crontab commands by hand, and we don't have anything we want to run that doesn't fit within the constraints of crontab scheduling. By contrast, we do get value from the simpler deployment for crontab entries in an /etc/cron.d environment (and also from /etc/cron.hourly and /etc/cron.daily).

In theory, /etc/cron.d is somewhat more universal across Linux environments. In practice, systemd has won these days, so timer units are just as available on any Linux distribution we're likely to use. They're certainly available on Ubuntu LTS, and in practice we are essentially an Ubuntu LTS shop these days.

(We still have some CentOS 7 systems, but since IBM took CentOS 8 out behind the barn and shot it, they will be replaced with Ubuntu machines in the future.)


Comments on this page:

By Joseph at 2021-11-08 06:40:28:

That’s quite interesting, and for your use cases it's quite sensible.

My use case is a bit different and thus there are some systemd features that are particularly important to me. I primarily use systemd timers to do full and incremental backups of various database systems. These systems are business critical and they have enough data that a backup might take 24 hours, and we want daily backups.

1. Logging - journald provides structured log output that’s timestamped and we can easily filter for that service.

2. Trivial to automate getting the status of a backup and its logs, and reporting these to metrics and logging systems

3. Locking - if the service is already running systemd timer will not trigger the service again.

There are certainly ways you can achieve all of these things with cron, but they tend to be bespoke solutions and likely less reliable. With systemd timers, I get these things for free and they all behave the same way on any Linux box, regardless of distribution.

By Vincent Bernat at 2021-11-09 13:21:15:

There is systemd-cron, which provides a systemd generator to replace your crontabs. This is a drop-in replacement, but you get logging in the journal for free. It also emails you in case of failures.

By cks at 2021-11-09 16:14:11:

One thing that probably helps make our use of cron feel easy is that we already have well evolved solutions around things like locking, because we started working with cron well before systemd and timer units were an option. If we were starting from scratch, we might not want to build all that infrastructure when timers provided at least some of it for us.
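[As one illustration of the kind of locking solution cks mentions, a common shape for it is a flock(1) wrapper right in the crontab entry; the lock path and script name below are invented:]

```crontab
# Hypothetical /etc/cron.d entry: 'flock -n' exits immediately instead
# of starting a second copy if the previous run still holds the lock.
*/5 * * * *  root  flock -n /run/lock/zfs-frequent /usr/local/sbin/zfs-frequent
```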

By Walex at 2021-11-09 17:28:44:

Some of the 'systemd' debate is based on a fundamental misunderstanding of 'systemd' and its deeply flawed design, and the CRON aspect of it is strongly related:

  • 'systemd's primary goal was quick boot (to the point that originally, and still in large part, it does not handle shutdown that well).
  • To achieve quick boot it was designed for maximum parallelism of boot tasks.
  • To achieve maximum parallelism of boot tasks it was designed to handle dependencies among tasks rather than states and related events.
  • Because it is fundamentally based on dependencies among tasks and not events, it works best by supervising all tasks itself, including logins (rather than 'getty'), mounts (rather than AMD or 'autofs') and periodic actions (rather than CRON or 'at').
  • Because of the above it has a tendency to absorb everything else, including login, mount, periodic action functionality.

Note: in general process supervisors can be classified as "task" or "event" oriented, and "push" or "pull" based in logic. Note: a fatal defect of essentially all current process supervisors is that since there is no agreed model of service start/stop (as opposed to process fork/exit), even event based ones confuse process states with service states; the only significant unintentional exception is socket-based services (supervised by 'inetd'/'xinetd' etc.), where socket state implies service state.

By Walex at 2021-11-09 17:32:27:

«even event based ones confuse process states with service states»

But this is just the usual/default case: event based supervisors could work on dependencies involving service states, if there was an agreed model of service states, and thus related events.

By Vincent Bernat at 2021-11-10 02:26:19:

« 'systemd's primary goal was quick boot »

This is not the case. Speed was a side-effect of the design. See http://0pointer.de/blog/projects/the-biggest-myths.html. Also, correct ordering at shutdown is a more complex task as an incorrect ordering cannot be fixed by just retrying (and debugging is obviously more difficult).

By Walex at 2021-11-10 02:56:03:

“'systemd's primary goal was quick boot” «This is not the case. Speed was a side-effect of the design. See http://0pointer.de/blog/projects/the-biggest-myths.html.»

Actually that page, even if it could be considered retconning, just says that systemd has not been made as efficient as possible, as in “we never really sat down and optimized the last tiny bit of performance out of systemd”, without reference to shorter boot times. But the next item shows how important those short boot times are: “Myth: systemd's fast boot-up is irrelevant for servers”, even if the primary goal was shorter boot times for desktops. Otherwise the principle of maximum parallelism that pervades 'systemd' would make little sense.

Indeed the statement I made was too narrow: most of the alternative 'init' designs had as primary goal shorter boot times, because at one point Microsoft targeted short boot times on the desktop, to the point of starting the logon window before the system was ready to accept logons.

Regardless of the design intentions, the adoption of parallel booting was at least driven by the pursuit of the shortest boot times, as seen in the several tools and competitions devoted to achieving them.

Note: one of the design defects of most new 'init' systems is the confusion between 'init' which has minimal process supervision duties (basically reaping) and service supervision functionality, which could well be separated with a minimal 'init' starting the traditional shell and that starting an independent service supervisor; but since the motivator was shorter boot times, it must have seemed natural to the authors of the newer 'init' systems to make that confusion.

«*correct ordering at shutdown is a more complex task as an incorrect ordering cannot be fixed by just retrying (and debugging is obviously more difficult)»

To me those are excuses: if the author, who is undoubtedly very smart, had given the slightest thought to actually doing things right, instead of just shorter boot times, he would have realized that the first task was to define a proper service state model and API, and a related service state machine, and that would have made doing the right thing at shutdown possible at least in principle. But the difference between process state and service state is smaller for startup than for shutdown in most cases; it matters less for shorter boot times and more for shutdown, and desktop users (and Microsoft) care a lot less about shorter, or even just correct, shutdown.

By Walex at 2021-11-10 03:52:59:

“'systemd's primary goal was quick boot”

«This is not the case. Speed was a side-effect of the design. See http://0pointer.de/blog/projects/the-biggest-myths.html.»

Just to be really sure:

http://0pointer.de/blog/projects/systemd.html

“Posted on Fr 30 April 2010, Rethinking PID 1”: “As mentioned, the central responsibility of an init system is to bring up userspace. And a good init system does that fast. Unfortunately, the traditional SysV init system was not particularly fast. For a fast and efficient boot-up two things are crucial:

  • To start less.
  • And to start more in parallel.

What does that mean? Starting less means starting fewer services or deferring the starting of services until they are actually needed.”

Note: the “the central responsibility of an init system is to bring up userspace” premise is already so wrong. Plus the absolute emphasis on “starting” as opposed to supervising, which includes the shutdown phase. Plus the pervasive confusion in that post between service and process supervision.

By cks at 2021-11-10 09:26:33:

I believe that before Linux had sufficient cgroup support, really good service supervision more or less required coordination with PID 1 due to PID 1 inheriting abandoned processes. If your service supervision is going to closely coordinate with PID 1 with a regular flow of messages between them, it might as well be PID 1.

(Service supervision without this visibility is rather less compelling. And if you want to supervise services that have not been rewritten into your new supervision world, you need this capability. There are hacks, but they lead to inefficiencies.)
