I wish systemd would get over its thing about syslog

April 10, 2015

Anyone who works with systemd soon comes to realize that systemd just doesn't like syslog very much. In fact systemd is so unhappy with syslog that it invented its own logging mechanism (in the form of journald). This is not news. What people who don't have to look deeply into the situation often don't realize is that systemd's dislike is sufficiently deep that systemd just doesn't interact very well with syslog.

I won't say that bugs and glitches 'abound', because I've only run into two issues so far (although both issues are relatively severe). One was that systemd mis-filed kernel messages under the syslog 'user' facility instead of the 'kernel' one; this bug made it past testing and into RHEL 7 / CentOS 7. The other is that sometimes on boot, randomly, systemd will barf up a significant chunk of old journal messages (sometimes very old) and re-send them to syslog. If you don't scroll back far enough while watching syslog logs, this can lead you to believe that something really bad and weird has happened.

(This has actually happened to me several times.)

This is stupid and wrongheaded on systemd's part. Yes, systemd doesn't like syslog. But syslog is extremely well established and extremely useful, especially in the server space. Part of that is historical practice, part of that is that syslog is basically the only cross-platform logging technology we have, and partly it's because you can do things like forward syslog to other machines, aggregate logs from multiple machines on one, and so on (and do so in a cross-platform way). And a good part of it is because syslog is simple text and it's always been easy to do a lot of powerful ad-hoc stuff with text. That systemd continually allows itself to ignore and interact badly with syslog makes everyone's life worse (except perhaps the systemd authors). Syslog is not going away just because the systemd authors would like it to and it is high time that systemd actually accepted that and started not just sort of working with syslog but working well with it.

One of systemd's strengths until now has been that it played relatively well (sometimes extremely well) with existing systems, warts and all. It saddens me to see systemd increasingly throw that away here.

(And I'll be frank, it genuinely angers me that systemd may feel that it can get away with this, that systemd is now so powerful that it doesn't have to play well with other systems and with existing practices. This sort of arrogance steps on real people; it's the same arrogance that leads people to break ABIs and APIs and then tell others 'well, that's your problem, keep up'.)

PS: If systemd people feel that systemd really does care about syslog and does its best to work well with it, well, you have two problems. The first is that your development process isn't managing to actually achieve this, and the second is that you have a perception problem among systemd users.


Comments on this page:

The reported issues looks like plain bugs (the user/kernel even seems to be trivial), nothing really fundamental.

Are those issues been reported on bugs.freedesktop.org?

By cks at 2015-04-11 15:37:21:

The reported issues are symptoms of an underlying problem. 'Simple bugs' of this magnitude do not just happen and then make it into releases; they are allowed to happen and make it into releases (and then stay there) because people do not care enough about the area that they are dealing with.

The easy to reproduce bug was reported to my distribution for vague reasons; my effort on this bug report has not yet been of any use to us (ie, the bug remains unfixed in the environment we really care about). The one I have no reproduction for was not, because I have better things to do than bang my head against the wall (or spend time writing careful reports that will ultimately be ignored).

(In addition I do not, at this point, percieve systemd upstream as being particularly welcoming to vague and fuzzy bug reports of 'this happens some time on the non-current version of systemd shipped by ...' for issues in systemd's syslog interface. I have better things to do than file upstream bug reports that will irritate people and go nowhere. If systemd is actually welcoming of these bug reports, well, they have a perception problem again.)

By Lennart at 2015-04-12 08:07:18:

I think there are some misconceptions in this blog story.

Look, with journald we tried very careful to provide the best compatibility with syslog we can. Thus, we'll forward all log data we get to a running, additional syslog daemon if there is one. This of course includes the data we get from libc syslog() clients via /dev/log (where we even go as far as faking the SCM_CREDS data if we can, to adjust it to the original sender), but also stuff we get via our native journal protocol, or via stdout/stderr of the various daemons. Thus, when you run your classic syslog daemon on a systemd/journald system it actually is at least as useful as it would be on a sysv daemon, if not more useful, since you'll get stdout/stderr output of the daemons which you previously didn't. I spent quite some time on this, to carefully make this work neatly, and comprehensively.

Now, as it turns out modern rsyslog actually doesn't use this forwarding functionality, anymore! The rsyslog developers changed their code to no longer accept the forwarding, but instead read the journal data from the journal files on disk directly. They did this for two reasons: firstly, this opens up to rsyslog all the metadata the journal collects in addition to the traditional syslog message. Secondly, it closes the "logging gap" during shutdowns and early boot: with traditional syslog logging would only be available as long as the syslog daemon was running but neither in the initrd + early boot, nor during shutdown -- even though the logging then is often the most interesting. The journald captures all that log data, it is already available in the initrd, and stay around very late at shutdown. By reading the data from the journal files rsyslog could take benefit of this, and suddenly cover pretty much the entire lifecycle of the system!

Effectively this change moved things from a "push" model (where systemd converted its data into syslog compatible format and send it to rsyslog), into "pull" model (where rsyslog actively reads it in the journal native format and converts it on its own). This of course means that this logic and conversion algorithm is outside of systemd's/journald's control, it's implemented in rsyslog's codebase and not journald's now.

Regarding the bugs you point out: the mis-filing under user instead of kernel, is probably a conversion problem, since we get the kernel messages without any facility, and we won't add any facility to it then, and the conversion code that needed a facility id then made up "USER" rathern than "KERNEL", because it naively used that as fallback. This would be a bug in the conversion logic, and thus most likely not in journald, but rsyslog (see above...).

The issue with the old messages ending up in rsyslog: the way pulling data out the journal works is that you pull everything from the position you last read from. To make returning to the same position easy we introduced a concept we call "cursor" which is a stringified encoding of a log position. When reading messages from the journal, rsyslog should store this cursor away, so that when rsyslog eventually returns, it can continue from the last cursor it saw. To make this stable, it needs to store this cursor away safely, sync it to disk, and so on. If it loses the cursor, it needs to read the full journal again from the beginning. Now, I have no idea what the bug you point out is really about. It could be that rsyslog didn't store the cursor away properly, or it could be a problem with seeking in journald (we had some issues there, which should be fixed now, but they mostly resulted in log messages being skipped erroneously, but not repeatedly seeked to). Either way, it's really not just a simple case of blaming that journald wouldn't care about syslog: such a bug will affect all reading tools, including our own journalctl...

In general I believe that syslog is a pretty awful protocol. It's very badly standardized, it's has no security provisions, it requires complex normaliuation, it is lossy, it is not stable regarding ordering, it doesn't convey time zone information, or structured data, or trust and so on. But also, sure, it's pretty trivial to implement and understand, and it supported in a variety of devices. We hence support it too: we read it from /dev/log, and we forward it to a local syslog daemon if this is enabled (see above), we even support it's fields pretty comprehensively (see the various SYSLOG_xyz fields we export), our default output of "journalctl" is syslog compatible even. However, we do not support it directly over the network, simply because it's still managable for local logging, but opening up for the network would mean normalization, would mean dealing with timezone issues, ordering, lossiness and so on. Because of all of this, we decided to just go a different route there: accept that all these issues are not trivial, allow people to use rsyslog for this, and integrate nicely with this. Also, support a new HTTP-based native protocol for transferring the native journal data. And that format is actually really really obvious: it's HTTP GET and POST for downloading and uploading logs, and the payload can either be in syslog-compatibel log format or in json. We hence didn't really invent a new protocol here, we just use what is already there in HTTP and JSON, and thus gain solutions for much of the issues pointed out above for free, via HTTP, HTTPS and so on. And all that so easily accessible, that you can pull the data of the journal with any stupid programming language that supports HTTP and JSON...

Anway, I think you are a bit unfair here, and I am pretty sure we did the right thing on this one: let's summarize:

- with journald in the mix you get at least as good if not better syslog behaviour then with classic sysv

- the conversion to syslog now sits in rsyslog, no longer in journald

- syslog/udp is didn't age well, http+json is probably the better alternative...

Lennart

By Deran at 2015-04-13 06:22:27:

I have to admit that my eyebrows shot up at your mention of being angry that systemd might think it can "get away with this." The past year has been one long stream of frustrated users making that exact complaint as more and more of Linux became swallowed by systemd, especially when it was less functional or stable than something it replaced.

So yes, systemd (or rather its creators/proponents) /knows/ it can get away with it. Its developers have more than enough power within the big-name distros to have proven "systemd" can do/replace whatever it wants and anybody with objections can either use a fringe distro or abandon Linux altogether.

By cks at 2015-04-13 15:10:32:

On the mis-filing bug: Red Hat accepted this as a systemd bug in RHEL 7, bz #1113215. There is no upstream bug linked there, but the problem was present in Fedora and is now fixed as well. Looking at systemd sources this appears to have been fixed in July of 2014 in commit 36dd072cdf03 (which is a journald change).

On the repeated old messages bug: well, what do you know. This does appear to be an issue with rsyslogd not updating /var/lib/rsyslog/imjournal.state. Hopefully rsyslog actually is getting shut down by systemd before / (and thus /var/lib) either disappears or becomes read-only. (journalctl claims that this happened at the most recent reboot, but who knows.)

Written on 10 April 2015.
« My Firefox 37 extensions and addons (sort of)
Spam victims don't care what business unit is responsible for the spam »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Fri Apr 10 23:42:47 2015
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.