Some thoughts on switching daemons to be socket activated via systemd

March 31, 2024

Socket activation is a systemd feature for network daemons where systemd is responsible for opening and monitoring a daemon's Internet or local (Unix) socket, and only starts the actual daemon when a client connects. This behavior mimics the venerable inetd, but with rather more sophistication and features. A number of Linux distributions are a little bit in love with switching various daemons over from the traditional approach, where the daemon itself handles listening for connections (along with the sockets involved), to being socket activated this way. Sometimes this goes well, and sometimes it doesn't.

There are a number of advantages to having a service (a daemon) activated by systemd through socket activation instead of running all the time:

  • Services can simplify their startup ordering because their socket is ready (and other services can start trying to talk to it) before the daemon itself is ready. In fact, systemd can reliably know when a socket is 'ready' instead of having to guess when a service has gotten that far in its startup.

  • Heavy-weight daemons don't have to be started until they're actually needed. As a consequence, these daemons and their possibly slow startup don't delay the startup of the overall system.

  • The service (daemon) responsible for handling a particular socket can often be restarted or swapped around without clients having to care.

  • The daemon responsible for the service can shut down after a while if there's no activity, reducing resource usage on the host; since systemd still has the socket active, the service will just get restarted if there's a new client that wants to talk to it.
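
As a concrete sketch, a socket-activated service is typically a pair of units along these lines (the unit names, port, and daemon path here are hypothetical, and real units will want more than this minimal skeleton):

```ini
# example.socket -- systemd opens and watches this socket
[Unit]
Description=Example daemon socket

[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target

# example.service -- only started when the first client connects
[Unit]
Description=Example daemon

[Service]
ExecStart=/usr/sbin/exampled
```

After 'systemctl enable --now example.socket', systemd listens on port 8080 itself and starts example.service on the first connection, handing the daemon the already-open socket.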

Socket activated daemons never have to time out and exit on their own; they can hang around until restarted or explicitly stopped if they want to. But it's common to make them exit after a period of idleness, since this is seen as a general benefit, and it's often genuinely convenient on typical systems. For example, I believe many libvirt daemons exit if they're unused; on my Fedora workstations, this means they're often not running (I'm usually not running VMs on my desktops).

Apart from requiring another systemd unit and a deeper involvement between the daemon and systemd, the downside of socket activation is that your daemon isn't immediately started and sometimes it may not be running at all. The advantage of daemons starting immediately on boot is that you know right away whether or not they could start, and if they're always running you don't have to worry about whether they'll restart under the system's current conditions (and perhaps some updated configuration settings). If the daemon has an expensive startup process, socket activation means you have to wait for that on the first connection (or the first connection after things go idle), as systemd starts the daemon to handle your connection and the daemon goes through its startup.

Similarly, having the theoretical possibility for a daemon to exit if it's unused for long enough doesn't matter if it will never be unused for that long once it starts. For example, if a daemon has a deactivation timeout of two minutes of idleness and your system monitoring connects to it for a health check every 59 seconds, it's never going to time out (and it's going to be started very soon after the system boots, when the first post-boot health check happens).

PS: If you want to see all currently enabled systemd socket activations on your machine, you want 'systemctl list-sockets'. Most of them will be local (Unix) sockets.


Comments on this page:

By Alexis at 2024-03-31 23:08:58:

s6 takes a different approach to 'socket activation'. That said, someone has demonstrated some code that doesn't require pulling in libsystemd in the case of simple readiness notification (which the developer of s6 has also written about).

By Graham at 2024-04-01 12:48:43:

The advantage of daemons immediately starting on boot is that you know right away whether or not they could start

Another advantage is that the daemons may be more likely to start successfully—particularly if, like me, one avoids over-commit on one's systems. At system startup, resources such as RAM will almost certainly be available; whereas if socket activation triggers after days of uptime, who knows?

As a consequence, these daemons and their possibly slow startup don't delay the startup of the overall system.

I don't really agree it's "a consequence". Or that it avoids the delay; I guess that depends how one defines "startup", because in some sense this means the "startup" could complete some months after power-up, or not at all. But, alternately, a properly-sequenced and "nice"d immediate startup could give an equally useful result. It's generally more work to set up, but in principle could be automated by combining it with socket activation—e.g., the service manager opens HTTP and SSH sockets, and if a connection comes in during startup, the service being contacted gets moved earlier in the startup order.

The s6 project mentioned by Alexis looks interesting, and I'll need to look over it in more detail. For now, it's unclear to me why it suggests running s6-svscan as process 1, particularly given that the pages frequently criticize excess complexity in PID 1. Why should PID 1 be accepting commands and forking services? It seems like something any child-process could do just as well (though it might need PID 1 to tell it about dead children).

From 193.219.181.219 at 2024-04-01 12:51:04:

That said, someone has demonstrated some code that doesn't require pulling in libsystemd in the case of simple readiness notification

I don't know why this is described like a new discovery. "With great effort, our scientists have reverse-engineered the library and discovered how it works so that we may replace it..." Code for doing that has been around since the beginning, in many forms.

Before monolithic libsystemd, there was libsystemd-daemon et al.; and before libsystemd-daemon, there was a sd-daemon.c that developers were asked to copy and paste out of the systemd repo into their projects – or to re-implement in their own terms if they wanted. So the library was never required for this functionality.

(And the protocol has always been documented – in fact, there's a reason the environment variables used by sd_listen and sd_notify don't have "SYSTEMD_" in their name; it was specifically meant to have other implementations.)
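
The protocol really is that small. As an illustration (a minimal sketch, not systemd's own code), a workable sd_notify is just an environment variable lookup and one Unix datagram:

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Minimal re-implementation of sd_notify(3): send a state string
    such as "READY=1" as a datagram to the Unix socket named in
    $NOTIFY_SOCKET. Returns False if no notification socket is set,
    i.e. we aren't running under a notify-aware service manager."""
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    # A leading '@' means an abstract-namespace Unix socket.
    if addr.startswith("@"):
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as s:
        s.connect(addr)
        s.sendall(state.encode())
    return True
```

Note how nothing in the protocol itself is systemd-specific; any supervisor that sets NOTIFY_SOCKET and reads the datagrams can implement the other side.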

By Graham at 2024-04-01 17:21:29:

"man 3 sd_notify" and "man 3 sd_listen_fds" document the protocols quite well, and they're almost entirely reasonable (the only real problem I have is that the failure of an FDSTORE=1 message can't be detected). For example, it's also really easy to receive the FD from socket notification: if the daemon hasn't stored any descriptors or been configured in some "interesting" way, it'll be FD 3 on daemon startup. It's probably best to check the environment variables, but one could get away with just using FD 3.

Honestly, re-implementing the code (the parts a daemon actually needs) isn't much harder than linking libsystemd or copying its source files into the daemon.
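
The receiving side Graham describes is equally small. A sketch of the sd_listen_fds(3) logic (a hypothetical helper, assuming only the documented LISTEN_PID/LISTEN_FDS environment variables):

```python
import os

# Per sd_listen_fds(3), passed descriptors start at fd 3,
# right after stdin, stdout and stderr.
SD_LISTEN_FDS_START = 3


def listen_fds() -> list[int]:
    """Minimal re-implementation of sd_listen_fds(3): return the file
    descriptors passed by the service manager, or an empty list if we
    were not socket activated."""
    # LISTEN_PID guards against env vars inherited from a parent process.
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        return []
    nfds = int(os.environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + nfds))
```

A daemon with a single socket unit then just wraps `listen_fds()[0]` in a socket object and calls accept() on it, exactly as if it had bound the socket itself.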

By Tom at 2024-04-01 17:55:49:

Note that systemd providing the socket for a service to listen on is a distinct thing from the service being socket activated (though the latter depends on the former). You get this behaviour by enabling the service unit in addition to (or I think perhaps also instead of) the socket unit.

This can be used for a number of things:

  1. listening on ports below 1024 without privileges.
  2. listening on a unix socket at a path the service doesn't have write access to, or with permissions the service can't set (e.g. making the socket only available to www-data for reverse-proxying, without the service having access to that user/group)
  3. as a more extreme example of the above, listening to a socket in a different network or mount namespace than the process (e.g. socket activation of podman containers)
  4. sandboxing a service so that it doesn't have permission to create a socket or connect to the network, but can still accept incoming connections.

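
Point 2 can be expressed directly in a .socket unit. For example (unit name and paths hypothetical), a Unix socket made available only to www-data, while the service itself runs as some other unprivileged user:

```ini
# example.socket -- systemd creates the socket with these permissions
[Socket]
ListenStream=/run/example/example.sock
SocketUser=www-data
SocketGroup=www-data
SocketMode=0660
```

The daemon simply inherits the already-open file descriptor, so it never needs the rights to create, own, or chmod the socket path itself.
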
By Alexis at 2024-04-01 21:46:53:

I don't know why this is described like a new discovery.

i certainly wasn't intending to describe it as a new discovery, and i didn't interpret the blog post that way either. i read it as more: "Look, people, you don't actually have to bring in all of libsystemd for this, it's actually pretty simple to write some suitable code."

That said, as someone who spends a lot of time writing and maintaining documentation, i know that people apparently don't like reading the documentation. :-)

By Alexis at 2024-04-01 22:14:18:

@Graham:

The s6 project mentioned by Alexis looks interesting, and I'll need to look over it in more detail. For now, it's unclear to me why it suggests running s6-svscan as process 1, particularly given that the pages frequently criticize excess complexity in PID 1. Why should PID 1 be accepting commands and forking services? It seems like something any child-process could do just as well (though it might need PID 1 to tell it about dead children).

You might find this page and this page helpful? They're part of the documentation of the s6-linux-init package. And also the documentation for s6-supervise - s6-svscan sets up a tree of s6-supervise processes that run the actual services as their direct children (an approach pioneered by Bernstein's daemontools), thus avoiding the PID files kludge.

But certainly s6-svscan doesn't need to run as PID 1; i'm slowly setting up s6+s6-rc to provide user service management on my OpenRC-based Gentoo box (because OpenRC doesn't yet provide user services, though that's gradually being worked on), where it definitely won't be running as PID 1.

When i was on Void, a few years back, i was using s6+66 for system init, service management, and service supervision, instead of runit - it worked well.

For a concrete example of use of s6 "in the wild", you might be interested in the s6-overlay project:

s6-overlay is an easy-to-install (just extract a tarball or two!) set of scripts and utilities allowing you to use existing Docker images while using s6 as a pid 1 for your container and process supervisor for your services.
