Why systemd is winning the init wars and other things aren't

February 11, 2014

Recently, an article by Rich Felker called Broken by design: systemd has been making the rounds. I have a number of things to say about this article but today I want to talk about one specific issue it brings up, which is systemd's novelty (or lack thereof) and why it is succeeding. To start with, here is the relevant quote from Felker's article:

None of the things systemd "does right" are at all revolutionary. They've been done many times before. DJB's daemontools, runit, and Supervisor, among others, have solved the "legacy init is broken" problem over and over again (though each with some of their own flaws). Their failure to displace legacy sysvinit in major distributions had nothing to do with whether they solved the problem, and everything to do with marketing. [...]

This is wrong on several levels. To start with and as usual, social problems are the real problems. In specific, none of these alternate init systems did the hard work to actually become a replacement init system for anything much. Anyone can write an init system, especially a partial one (I did once, long ago). Getting it adopted by people is the hard part and none of these alternatives tackled that effectively (if they did so at all, and some of them certainly didn't). And as Felker admits, each of these theoretical alternatives has flaws of its own.

(Note that this is not a criticism of those alternate init systems. I don't think any of them have really been developed with replacing SysV init in Linux distributions or elsewhere as a goal. DJB daemontools certainly wasn't; I believe that DJB's attitude towards it, as towards more or less everything he's developed, can be summed up as 'I showed you the way, what you do with it is up to you'.)

The reason systemd has succeeded in becoming a SysV init replacement is simple: it did the work. Not only did it put together a lot of good ideas regardless of their novelty or lack thereof but its developers put in the time and effort to convince people that it was a good idea, the right answer, a good solution to problems and so on. Then they dealt with lots and lots of practical concerns, backwards compatibility, corner cases, endless arguments, and so on and so forth. I want to specifically mention here that one of the things the systemd people did was write extensive documentation on systemd's design, how to configure and operate it, and what sorts of neat things you can do with it. While this documentation is not perfect, most init systems are an order of magnitude less well documented.

(I am sure that in some quarters it's popular to believe that Lennart Poettering bulldozed the Fedora technical people into adopting his new thing. I do not think that the Fedora technical people are that easily overrun (or that impressed by Poettering, especially after PulseAudio), and for that matter at least some of the Debian technical people feel that systemd is the best option despite having looked deeply at the alternatives (cf).)

You can call this marketing if you want, although I don't think that that's a useful label for what is really happening. I call this 'trying' versus 'not trying'. If you don't try hard and work hard to become a replacement init system, it should be no surprise when you don't.

(In particular, note that SysV init is not a particularly bad init system so it should be no surprise when it is not particularly easy to displace.)

Beyond that I have some degree of experience with one of these alternate init systems, specifically DJB daemontools, and I've looked at the documentation for the other two. Speaking as a system administrator, systemd solves my problems better. The authors of systemd have looked at problems that are not solved by SysV init and come up with real solutions to them. Many of these problems are not solved by any of the alternatives that Felker put forward. In specific, often the alternatives assume (or require) cooperative daemon processes in order to fully realize their benefits; systemd is deliberately designed so that it does not and can fully manage even existing obstreperous Unix daemons with their willful backgrounding and other inconvenient behaviors.

(I don't know the field of Linux and Unix init-like systems well enough to say whether or not features like socket activation and clever use of control groups are genuinely novel in systemd or simply the first time I've become aware of them. They do feel novel.)

Since that may not be clear, let me be plain: systemd is a better init system than the alternatives. It does more to solve real problems and it does it better. That alone is a good reason for it to win in the practical world, the one where people care about getting stuff done. That systemd is not necessarily novel or the first to come up with the ideas that it embodies is irrelevant to this. Implementation matters more than ideas.

(Arguably it's an advantage that systemd feels no urge to reinvent different wheels when perfectly decent ones exist.)

PS: Please note that the reason that Unix itself succeeded is not its ideas alone; it is that Unix implemented them very well. A number of Unix's ideas are both great and novel, but a bad implementation would have doomed the whole enterprise. The fate of good ideas with a bad implementation is to be reimplemented elsewhere, cf the Xerox Alto and for that matter the Apple Lisa.

PPS: Also note that the one serious competitor to systemd is Upstart, which is also the product of a great deal of work and polishing.


Comments on this page:

By anon at 2014-02-11 18:36:25:

SMF

By cks at 2014-02-11 19:30:11:

Among many other things, SMF is not an option on Linux. I will leave it at that for now.

(Oh, sure, someone might port it and then go to all of the work to try to make it a viable replacement on some Linux distribution. But right now they aren't. See all of what I wrote about people not doing that for other init systems. SMF is not a candidate to replace init systems in general because no one is doing the work to make it one.)

My main concern with systemd is the tight coupling it needs: cgroups && autofs4 && tmpfs && fanotify && SELinux && (k)DBus. Oh, and let's introduce a new logging infrastructure while we're at it. Let's also export stuff in JSON and have an embedded HTTP server (for syncing)! While we're at it take stuff that was independent (e.g., udev) and merge it in as well! Also, replace inetd, acpid, and watchdog! And QR codes!

Really? All that to start the system? I've yet to use it, but from all the talk it sounds like it may julienne as well.

Perhaps I don't get out enough, but I've never really found large problems with SysV init scripts as found in (say) Debian 6+, which has parallel start-up capabilities. As does the BSD rc.d system.

I've been around in IT for "only" about ten years and have run Solaris 6-10 (x86 and SPARC), FreeBSD 4-9, Linux (RH, Deb/Ub, Slackware), etc., and I'm not seeing it. Currently in a Debian/Ubuntu shop, so perhaps when Debian 8 rolls around with the New World Order things will be clearer to me.

David Magda: "just to start the system" is a fundamental misconception, and you're doomed to misunderstand any complex init system (inc. systemd and upstart and, hell, modern SysV) if that's what you think they're doing.

If all you want to do is start the system, a series of bash scripts will do the job. That's not what init daemons do any more, though, and hasn't been since, er, probably before I was born (1982).

systemd does a vast array of rather cool things, but it's enough to explain quite a bit of systemd's and upstart's complexity to note that they don't just start processes, they manage processes - or rather, they manage groups of processes. Your init system doesn't just start httpd, it stops it and restarts it too.

This sounds pathetically simple, but in fact just that much is the point at which SysV starts creaking. Service management has always been a glued-on layer in sysv right from the start, and it's never been a particularly good one. sysv init systems barely really know anything at all about the 'services' they're managing - to them a service is a bash script, more or less. It's the bash script's job to know what processes are a part of the service and keep track of all the possible states they might wind up in for the purposes of starting, stopping and restarting the service. Unsurprisingly, they generally aren't terribly good at this. If your 'service' is one daemon which can be a) not running or b) running it's just about manageable, but once it's sixteen processes with complex dependencies on seven other services, as is not at all uncommon in just about any modern *nix system doing anything at all, your bash script house of cards starts going distinctly wobbly.

It's pretty hard to argue that it makes a lot of sense to do all this work:

a) in bash scripts at all
b) in one bash script per service (good lord)
c) really, anywhere but the init daemon itself

One trivial example is restarting server processes automatically if they crash. Lennart has pointed out that there are probably several thousand implementations of this out there in the wild somewhere, including several attempts to do it 'generically' and far, far, far, far more that are wired into some particular daemon's init script (maybe just for a single deployment of said daemon!) and written and maintained - probably not terribly well - by some overworked sysadmin. This is not a recipe for consistent and reliable behaviour.
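For a sense of what these hand-rolled restart wrappers tend to look like, here is a minimal Python sketch. It is not taken from any real init script; the `supervise` function, its `max_restarts` and `delay` parameters, and the deliberately crashing command are all made up for illustration.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, delay=0.0):
    """Run argv, restarting it while it exits non-zero.

    Returns the list of exit codes seen, one per run. This is a
    hypothetical helper, not part of any init system.
    """
    exit_codes = []
    while True:
        code = subprocess.call(argv)
        exit_codes.append(code)
        # Stop on a clean exit, or once the restart budget is used up.
        if code == 0 or len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay)


if __name__ == "__main__":
    # A stand-in "daemon" that always crashes with exit status 3.
    crashing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(crashing, max_restarts=2))  # prints [3, 3, 3]
```

Even this toy version has to decide what counts as a crash, how many times to retry and how long to wait between attempts. Every local copy of the idea answers those questions a little differently, which is the consistency problem being described.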

So, systemd can restart a service (unit) if it crashes, if you like. If you want a systemd unit you have deployed to do this, you just stick a line of configuration in the unit file, and you're done.

That seems a hell of a lot better, doesn't it? It may make the init daemon 'more complex', but if you take a sufficiently wide view, it actually reduces the complexity of The Ecosystem as a whole by a rather significant value.

A lot - a whole damn lot - of systemd's 'complexity' boils down to similar things, because a lot of the complexity that's been added to systemd is a direct result of feedback from sysadmins on the wobbly messes they've been forced to build on top of sysvinit. No-one really likes writing or maintaining those wobbly messes, and they're generally rather happy to have them outsourced to something more capable...

By Perry Lorier at 2014-02-11 22:54:37:

There are a lot of moving parts in SystemD that are all tied together into the one privileged process on the entire machine.

I tend to spend a lot of time experimenting with new features, playing around with cgroups, namespaces, capabilities and weird networking things. I'm concerned that I'm either going to have to throw out all of systemd because its philosophy doesn't see what I'm trying to achieve as being a useful feature (eg playing with some of the weird edge cases in cgroups, or doing bizarre things with namespaces), or have to upgrade systemd every few weeks to a buggy, unstable version to get support for a feature I'm wanting to experiment with.

I think that sysvinit is old, and needs revitalisation. But the unix way is a lot of individual components, doing one thing well, and having a well defined interface (usually a command line) between them.

I fear that systemd's integration also is going to push people towards having one model of maintaining a system (eg systemd-login sets up parameters it likes for each login, what if I want to change them? There's no standard "hook" to say this is what I want to do, it's either an option in the config file, or it's impossible without replacing systemd-login. And since all the cgroups stuff has to go via systemd, if systemd doesn't support it, you need to replace all the systemd components). I feel at this early stage of cgroups development this is very dangerous, and needlessly ignoring many possible useful approaches. This sounds like an advantage as it makes it easier for developers to have only one simple, confined model to code towards, but since systemd isn't portable to any other kernel, you're going to have to support other systems anyway.

Systemd is opinionated, which is a good thing in a distro. You want all the niggly decisions made for you. It's bad in a piece of software, especially in a piece of software that is so fundamental to the entire machine. It provides no place for a distro (or sysadmin) to experiment, and tweak its environment to find more innovative solutions to problems.

I fear that we're going to get a massive short-term boost from systemd, then spend the next 3-4 years replacing bits of it with other, more flexible components, meanwhile distros becoming less innovative, and more same-y. In 5 years time we're going to be telling new sysadmins about dark difficult to understand corners of the machine left over from some now obsolete part of systemd that is preventing our machine from booting.

By Richard at 2014-02-12 01:20:37:

In what sense do you think that "Unix implemented [its ideas] very well"? I seem to recall that prior to GNU, Unix was pretty poorly implemented. It had memory protection a couple decades before PC operating systems, because it needed it: its typical response to any problem (such as "I didn't expect 81 characters on that line") was to segfault.

I think Gabriel was mostly right ("Worse is Better"). Unix was eventually implemented very well, because its core ideas were so simple, and so the folks at Berkeley, Stallman at MIT, Torvalds in Europe, etc., all chose (of all the possible operating systems in the world) to clone it. That can't be about quality of the original implementation, because they made their own implementations.

"There are a lot of moving parts in SystemD that are all tied together into the one privileged process on the entire machine."

This just isn't true. systemd is now a codebase containing rather a lot of different processes. pid 1 does not do anywhere near all of systemd's work.

By Jude at 2014-02-12 03:06:41:

"This just isn't true. systemd is now a codebase containing rather a lot of different processes. pid 1 does not do anywhere near all of systemd's work."

Just because systemd is comprised of many daemons and tools, doesn't mean that they're easily separable. For example, I can't run logind without also running systemd as PID 1. This is the concern the parent was getting at--I either have to run a lot of systemd components at once, or go without any of them. The fact that the components happen to be implemented as separate daemons is irrelevant to this--tight coupling between components makes systemd an inherently monolithic system.

I personally cannot use systemd because I need to be able to muck around with cgroups and chroots on a daily basis. If only systemd's components were loosely-coupled, I could simply disable the parts that get in my way.

People never mention which problems systemd solves and how.

Supervision does not assume the process behaves in a certain way, systemd unit and its logger on the other hand has quite strong assumptions.

cgroups are nice tools and all, but they aren't a requirement for supervision, and people might not be using the latest and greatest kernel and glibc.

Maybe they even use a different libc.

Yet systemd tries to shoehorn itself by being a compulsory dependency for fringe features. (hi logind and gnome)

By hibbelig at 2014-02-12 06:14:02:

I think that launchd deserves to be mentioned here -- I think it does pretty much the same thing for BSD.

By Tom Gundersen at 2014-02-12 06:36:59:

"Just because systemd is comprised of many daemons and tools, doesn't mean that they're easily separable. For example, I can't run logind without also running systemd as PID 1."

I believe the important point is that, while some (many?) components of systemd may depend on systemd being PID1 (and this is something we don't see as a problem), PID1 itself does not depend on many of the other systemd components being used. Last I checked it only requires udev and the journal. Logind, for instance, is optional.

Adam Williamson wrote:

systemd does a vast array of rather cool things, but it's enough to explain quite a bit of systemd's and upstart's complexity to note that they don't just start processes, they manage processes - or rather, they manage groups of processes. Your init system doesn't just start httpd, it stops it and restarts it too.

Again, perhaps I'm missing something or have not been exposed enough to things in the last decade, but I've never found starting, stopping, and restarting processes a problem with either SysV init or BSD scripts.

one trivial example is restarting server processes automatically if they crash.

I've always been of the opinion that I want to know if it dies via my monitoring system, and if the service is critical, I'll have an HA set up of some kind. If a process crashed, I have never thought "I wish my init system could have restarted it". I have always thought "why did it crash? let's look in the logs before restarting".

That seems a hell of a lot better, doesn't it?

I'm sure all these things are the Greatest Thing Since Sliced Bread(tm), but I have never needed them AFAICT—or never known that I needed them ("an unknown unknown" per Rumsfeld). Perhaps I'm the simple folk of the sysadmin village, with modest needs. And if other people find them handy I don't want to stand in the way of Progress(tm). But a lot of this seems Goldberg-esque in some ways.

The tight coupling also casts a shadow on my mind. SELinux for example is something I have had no end of problems with over the years (BSD's Capsicum looks much saner), and while I used ACLs over NFSv4 (to get around the 16 group limit) I wouldn't know what to do with it in the base OS.

Again, I haven't used SystemD (yet?), and so this could all be a misconception on my part.

By Andrew at 2014-02-12 07:47:47:

"one trivial example is restarting server processes automatically if they crash."

Great, let's take really bad practices and amplify them. You should never EVER automatically restart services when they fail, only a hack would support that trash.

What's next, rebooting until it works?

By D.F. at 2014-02-12 09:18:32:

I think you're discounting the marketing argument a bit too much. On the surface, of course Poettering has the ability to bulldoze the fedora technical people -- he works at red hat and despite any assurances I've ever heard, we all know who runs that ship. The adoption of systemd has been accelerated, to say the least. Most FDO stuff seemed to get adopted into distros well before any of it was documented or well tested, and that seems only really possible when you have the 800lb gorilla of the linux world standing behind you.

I'm still a bit ambivalent on systemd. On the one hand, like you said, it solves some problems. But on the other hand, it seems like it demonstrates some philosophical/social models that a lot of us in the freenix world don't seem to like. Poettering seems to really want to emulate the folks in redmond by promoting a linux monoculture and making linux more like windows. Systemd also seems to embody the worst aspects of the free software world -- "Understanding this old code is hard! Let's throw it out and write something from scratch!" -- there have been other people that explain much better than I ever will why that's a bad idea.

By James Watson at 2014-02-12 11:33:11:

The other (more) important reason than the social one is the financial one: the dev wasn't working for free, he's paid by Red Hat, and Red Hat explicitly favours systemd and its policy and interest is obviously in systemd becoming widespread. In other words, it was his job and he had (probably) the most influential Linux company backing him up. Now imagine the dev is a "nobody", like a student working for free, not backed by a company, not to mention by one such as Red Hat. Would he win over Upstart? I doubt it.

By cks at 2014-02-12 12:33:04:

I think that people are both under-rating the people on the Fedora Engineering Steering Committee and misreading Red Hat's core motivations here. First off, note that Fedora moved to Upstart before they moved to systemd, despite it being from Canonical and having various other pragmatic issues. Fedora and FESCO were clearly looking for a better init system regardless of who it came from. And FESCO itself has not given systemd a clean ride; for example they deferred it from Fedora 14 to Fedora 15 at the last minute because they didn't feel comfortable with it (via @mjg59).

My view of Red Hat's core motivation is that they use Fedora as a proving ground for things that will be going into Red Hat Enterprise Linux. This means that RH doesn't want their stuff so much as good stuff. To put it bluntly, Red Hat is not well served by shoving crap into Fedora and then shipping that crap in RHEL; in fact that would be a fairly bad disaster. If systemd is a bad idea or badly executed, they very much do not want to force it into Fedora and then RHEL just because it was written by someone they employ. I personally believe that Red Hat's engineering management is smart enough to understand this.

(On a purely commercial level Red Hat cares about Fedora not having disasters because if Fedora has disasters people abandon it and then it stops being a good testing ground for RHEL. I personally believe that Red Hat cares about Fedora beyond that purely commercial level, but you may not.)

By Jude at 2014-02-12 13:38:44:

"I believe the important point is that, while some (many?) components of systemd may depend on systemd being PID1 (and this is something we don't see as a problem), PID1 itself does not depend on many of the other systemd components being used. Last I checked it only requires udev and the journal. Logind, for instance, is optional."

The problem with this becomes obvious when layers above systemd depend on one or more of its daemons. GNOME depends on logind, and logind depends on systemd PID 1. What I want is to run GNOME without having to run systemd. I could do this if logind didn't depend on systemd PID 1.

I cannot have systemd running as PID 1. I need to have chroots and direct cgroups control. Thanks to the systemd developers, I now get to choose between doing my job and having GNOME. Claiming that "GNOME made the bad decision to depend on logind" isn't helpful, especially since (1) session management and system initialization are separate concerns that should NOT be tightly coupled, and (2) Lennart himself has advocated for this dependency on the GNOME mailing lists[1].

Although it's plausibly deniable, these actions make me suspect that systemd developers want to force adoption by making it damn nigh impossible to avoid.

[1] https://mail.gnome.org/archives/desktop-devel-list/2011-May/msg00427.html

It's actually funny to read all the stuff about red hat, as if we were some kind of borg. I wish I could post the archives of the red hat internal mailing lists from when Lennart first started pushing systemd internally. Do you really think everyone said 'oh, he's @redhat.com so it must be fine'? Heh, no. Red hat does not work that way. Lennart also doesn't work by getting some senior management type on board and then having them ram his idea down everyone else's throat; he works by going into the trenches and convincing people he's right. You think a tech company the size of red hat doesn't have a curmudgeonly sysadmin contingent? All the battles that are being fought now with Debian were fought before with Arch, and before that with fedora, and before THAT within red hat. I have like five strata of the same old 'sysv should be good enough for anyone' stuff in my archives of various mailing lists, and every time, once people actually went and looked with an open mind at what systemd does and why, the majority got behind it.

By Tom Gundersen at 2014-02-12 17:52:49:

"I need to have chroots and direct cgroups control. Thanks to the systemd developers, I now get to choose between doing my job and having GNOME."

systemd does not prevent you from messing around with cgroups. One day, the kernel will probably only allow one userspace process to be in charge of your cgroups, and on systemd systems, this would be systemd. However, we are not there yet, and once we are, avoiding systemd will not avoid that restriction...

@lu: "cgroups are nice tools and all, but they aren't a requirement for supervision"

Oh yes they are.

One example may suffice: Try to stop or restart a sufficiently complex Apache installation. Multiple sub-websites start their own FastCGI handlers, and some of these habitually do cute things like needlessly daemonizing themselves and whatnot.

With systemd it's all in one cgroup. Stop the service, poof, the whole mess is gone.
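The underlying problem, a supervisor losing track of a process that daemonizes itself, can be reproduced in a few lines of Python on a Unix system. This is only a sketch of the failure mode, not of how systemd or cgroups work: `DAEMONIZER`, `launch_and_lose` and `is_alive` are hypothetical names, and the "daemon" here simply sleeps for a few seconds.

```python
import os
import subprocess
import sys

# Stand-in for a daemon that backgrounds itself: it forks, the process the
# supervisor started exits at once, and the survivor reports its own PID.
DAEMONIZER = r"""
import os, sys, time
if os.fork() > 0:
    os._exit(0)            # the process the supervisor knows about is gone
os.setsid()                # detach into a new session
sys.stdout.write(str(os.getpid()) + "\n")
sys.stdout.flush()
time.sleep(5)              # pretend to do some work
"""


def launch_and_lose():
    """Start the self-backgrounding process; return (tracked_pid, escaped_pid)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", DAEMONIZER],
        stdout=subprocess.PIPE,
        text=True,
    )
    tracked_pid = proc.pid
    proc.wait()                                # returns almost immediately
    escaped_pid = int(proc.stdout.readline())  # PID of the surviving process
    proc.stdout.close()
    return tracked_pid, escaped_pid


def is_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)        # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True
```

After `launch_and_lose()` returns, the PID the supervisor started no longer exists while the escaped PID is still running, so a supervisor that only remembers the PID it launched has nothing left to signal. Tracking membership in a group, as the comment above describes for cgroups, does not depend on who forked whom.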

By Jude at 2014-02-13 03:52:14:

"One day, the kernel will probably only allow one userspace process to be in charge of your cgroups, and on systemd systems, this would be systemd. However, we are not there yet, and once we are, avoiding systemd will not avoid that restriction..."

<rant> But that's exactly my problem! While having multiple cgroup writers is supported for now, it's slated to be deprecated and removed. However, I will still need to be able to control cgroups, and systemd will be in the way. I will have to fight systemd in order to keep the things that I was able to do before! Do you see how unacceptable this is? Tight coupling between systemd and the rest of userspace breaks working code. That is a HUGE turn-off, and a BIG no-no in software engineering.

There's a reason Linus chews out kernel developers for breaking userspace. Lennart and company do not seem to have this level of discipline. </rant>

Look, I get that cgroups is great for managing services. My team and I do this every day. We've been doing it with vservers since before cgroups was added. Containerizing services is a great idea, and I'm glad the broader Linux community is starting to apply it to managing daemons.

What the systemd proponents don't seem to realize is that while systemd is great if you can do things the systemd way, it utterly sucks if you can't. Thanks to systemd's cross-layer dependencies and tight coupling between components (both of which could have been avoided, I might add), the option of "don't use it if you don't like it" is increasingly untenable. Trust me, I would love to avoid using it, at least until it stops getting in my way.

@Matthias Urlichs: cgroups are not required for this kind of situation. Containers can achieve the same goal in a different way, this is one of the obvious use cases of Docker for example.

-- Arnaud

By D.F. at 2014-02-14 12:50:29:

Adam,

Your post points out the problem.  All this discussion happens internally to RH.  Lennart does his convincing there and then the 800lb gorilla of the linux world takes over.

From the outside looking in, where's the discussion leading into this? We end up just seeing the gorilla charging in one direction.

In the end, it doesn't matter to me...as a sysadmin, I have to embrace the suck either way. Usually every time someone decides a complete rewrite is in order.

I wrote:

My main concern with systemd is the tight coupling it needs: cgroups && autofs4 && tmpfs && fanotify && SELinux && (k)DBus. Oh, and let's introduce a new logging infrastructure while we're at it. Let's also export stuff in JSON and have an embedded HTTP server (for syncing)! While we're at it take stuff that was independent (e.g., udev) and merge it in as well! Also, replace inetd, acpid, and watchdog! And QR codes!

Oh, look: as of v209, there is now a "networkd" in the systemd.git repository:

A new component "systemd-networkd" has been added that can be used to configure local network interfaces statically or via DHCP. It is capable of bringing up bridges, VLANs, and bonding. […]

http://cgit.freedesktop.org/systemd/systemd/tree/NEWS

To add to my pile-on, per 211

  • systemd-gpt-auto-generator is now able to discover /srv and root partitions in addition to /home and swap partitions. […] This allows booting without /etc/fstab and without root= on the kernel command line on appropriately prepared systems.

Can someone please tell me what problem/s SystemD is trying to solve besides "all" of them?

By cks at 2014-03-13 17:13:13:

I don't particularly see serious problems with systemd solving these problems, especially in non-mandatory components. Network setup has long been a non-standard mess, for example. It's especially useful to integrate it into systemd for startup ordering dependencies, so that systemd can have a native way to say 'only do <X> after <Y> network is available'.
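To make the ordering point concrete, here is a small Python sketch of 'only start <X> after <Y>' expressed as a dependency graph. It is an illustration of the general idea only, not systemd's actual algorithm; the unit names and the `start_order` helper are invented, and it assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def start_order(requires):
    """Return a start order in which every unit comes after its prerequisites.

    `requires` maps each unit name to the set of units that must be up first.
    """
    return list(TopologicalSorter(requires).static_order())


# Hypothetical units: both later services wait for the network.
units = {
    "network-configured": set(),
    "nfs-mounts": {"network-configured"},
    "web-server": {"network-configured", "nfs-mounts"},
}

if __name__ == "__main__":
    print(start_order(units))
    # prints ['network-configured', 'nfs-mounts', 'web-server']
```

The point of having network setup known to the init system is that "network-configured" becomes a node the init system can actually order against, rather than something each service's start script has to poll for.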

By CWilson at 2014-03-16 00:52:53:

cks said, "I don't particularly see serious problems with systemd solving these problems..."

I do and you should too... It's a recipe for catastrophic failure that has been warned of again and again by people who have been there and done that. Later, they tried to teach the rest of the world not to make those same mistakes, a lesson that seems to be increasingly lost in today's software designs.

I've got a stack of PDFs and web-links written by PhDs and insightful others about the dangers of complicated and monolithic systems. Your device management problem is but one small issue in what I think will eventually reveal itself to be a whole host of problems.

Same PhDs that advocated microkernels over monolithic Linux, no doubt.

As much as I hate to admit it, this blog entry is absolutely right. It's not too late, but if we (the systemd adversaries) want to make our voice heard, the work needs to be done. A real alternative must be written, and then promoted, to address all the real issues that systemd is addressing. systemd is a horrible solution to those issues, but it is the only piece of software that acknowledges them; we need to provide a technically and politically better piece of software that scratches the same itch, and then make it known, documented and available to mainstream distribution.

Until this work is done, no amount of bitching and moaning is going to change anything.

By Bill at 2015-02-09 15:52:57:

It really helps to give a real world example to understand what problems one wants solved, and why it is complex. Let's take what seems like a trivial example. Let's say I want to change the default language on my system from en_CA to en_US. With a SYS V system traditionally what would happen is you would update the /etc/profile file. Then however, all your running daemons and such would still be using the en_CA. The only way to update everything would be a reboot.

Ideally what you want to happen when you change a setting like this is for a notification to go out to all the daemons. So depending on the way the daemon is written it can update on the fly with the new value, or automatically be restarted. If you have an older process that say rereads configuration data when receiving a special signal, then you want to send that signal to the process. So you need some sort of mapping, what happens to my daemon when these sets of settings are changed, with the ability to tailor that action to what is needed.

With systemd you'll find you are far less likely to need to reboot to change some trivial setting than you would with a SYS V system. However, you'll also find this is part of what makes systemd far more complex to use.

Written on 11 February 2014.

Last modified: Tue Feb 11 17:36:32 2014