Why logging to syslog is different than logging to standard error

April 4, 2011

In the previous entry I wrote that logger is not a good solution for converting a program that logs to standard error into a system that logs to syslog. In fact, the issue is more general than just logger; logging to standard error is significantly different from logging to syslog that it is hard to turn the former into the latter.

The issue is that syslog handles a bunch of important things for you that you have to handle yourself in logging to standard error. Syslog automatically attaches a timestamp, the program name, and the process ID (if you ask it to). Since all of these are important pieces of information, when you log to standard error you should be supplying them yourself in order to provide complete logging. If something then simply echoes your full log lines to syslog, you get at least duplicate information; the logs will have timestamps from both syslog and your own program.

(As far as I know there is no way to tell syslog to use your timestamps instead of its own.)

There are a number of solutions, all of them not very great:

  • you can have a logger equivalent that knows about the format of your messages and so strips off the redundant information before feeding it to syslog. It's clearly specific to your program.

  • your program can skip putting timestamps (and perhaps other information) on log messages, leaving that to another program to do if necessary. You now intrinsically require a second program in order to get decent logs, even if you are only logging to a file. Better supply at least a pointer to such a program in your documentation, and expect sysadmins to be unhappy about your system no longer being a self-contained thing.

  • you can just have your program stutter, putting redundant information into its syslog messages. This makes syslog a second class citizen in your logging world.

The net result of all of this is that it is much easier to start with a program designed to log to syslog and extend it to log to standard error than to take a program designed to log to standard error and bolt on an extension to divert the straight log messages to syslog.

(If you can change the program itself, you can do better. But the kind of people who have their program log to standard error and nothing else generally aren't interested in taking patches to add syslog logging. If you're lucky they might take patches to add an option to turn off timestamps, although now the main program configuration becomes dependent on how you're doing logging.)

Sidebar: the problem abstracted

What this boils down to is that log lines are not simply plain text. Instead they are data (the log message) plus some metadata (the time, the program name, the process ID, etc). Many variant logging systems will want to do custom things with various pieces of the metadata; for example, if you're pushing logs into a SQL database you probably want to put the bits of metadata into separate database fields.

The response to this view is that there is more useful metadata than this in typical log messages, so much metadata (sometimes) that there is no sensible way for the programs generating messages to split them up for you. Since you need some custom parsing no matter what, you might as well bite the bullet and do custom parsing of plain text to extract all of the metadata, even generic metadata like the timestamp.

I don't really like this response, but I think it's probably true in practice; I suspect that everyone who is routing log messages from standard programs into custom monitoring solutions has probably written their own parsing and analysis systems that are specific to the exact programs they're dealing with.

(Well, specific overall. I'm sure people write generic components such as a generic log message parsing program that is driven by pattern matching, but it uses program specific patterns that you write.)


Comments on this page:

From 87.79.236.202 at 2011-04-05 00:06:52:

Solution 1b, which seems to me to solve the problem at the right level in one fell swoop: add some simple regex support to logger.

No?

Aristotle Pagaltzis

By cks at 2011-04-05 10:12:14:

It's half the solution. The other half is to understand the problem from the sysadmin perspective: we want your program to come with a complete solution for logging to wherever. So if logger has regex support, your program needs to come with a script with your already-worked-out regexps for your program's messages.

Written on 04 April 2011.
« Logs are not just streams
The story of leaving N% of your filesystem unused for performance »

Page tools: View Source, View Normal.
Search:
Login: Password:

Last modified: Mon Apr 4 17:00:10 2011
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.