Thinking about how to split logging up in multiple categories et al

April 21, 2014

I've used programs that do logging (both well and badly) and I've also written programs that did logging (also both reasonably well and badly) and the whole experience has given me some views on how I like logging split up to make it more controllable.

It's tempting to say that controlling logging is only for exceptional cases, like debugging programs. This is not quite true. Certainly this is the dominant case, but there are times when people have different interests about what to log even in routine circumstances. For example, on this blog I log detailed information about conditional GETs for syndication feeds because I like tracking down why (or why not) feed fetchers succeed at this. However this information isn't necessarily of interest to someone else running a DWiki instance so it shouldn't be part of the always-on mandatory logging; you should be able to control it.

The basic breakdown of full featured logging in a large system is to give all messages both a category and a level. The category is generally going to be the subsystem that they involve, while the level is the general sort of information that they have (informational, warnings, progress information, debugging details, whatever). You should be able to control the two together and separately, to say that you want only progress reports from all systems or almost everything from only one system and all the way through.

My personal view is that this breakdown is not quite sufficient by itself and there are a bunch of cases where you'll also want a verbosity level. Even if verbosity could in theory be represented by adding more categories and levels, in practice it's much easier for people to crank up the verbosity (or crank it down) rather than try to do more complex manipulations of categories and levels. As part of making life easier on people, I'd also have a single option that means 'turn on all logging options and log all debugging information (and possibly everything)'; this gives people a simple big stick to hit a problem with when they're desperate.

If your language and programming environment doesn't already have a log system that makes at least the category plus level breakdown easy to do, I wouldn't worry about this for relatively small programs. It's only fairly large and complex programs with a lot of subsystems where you start to really want this sort of control.

Sidebar: the two purposes of specific control

There are two reasons to offer people specific control over logging. The first is what I mentioned: sometimes not all information is interesting to a particular setup. I may want information on syndication feed conditional GETs while you may want 'time taken' information for all requests. Control over logging allows the program to support both of us (and the person who doesn't care about either) without cluttering up logs with stuff that we don't want. This is log control for routine logs, stuff that you're going to use during normal program operation.

The second reason is that a big system can produce too much information at full logging flow when you're trying to troubleshoot it, so much that useful problem indicators are lost in the overall noise. Here categorization and levels are a way of cutting down on the log volume so that people can see the important things. This is log control for debugging messages.

(There is an overlap between these two categories. You might log all SQL queries that a system does and the time they take for routine metrics, even though this was originally designed for debugging purposes.)


Comments on this page:

By Perry Lorier at 2014-04-21 06:55:15:

Google's logging library (https://code.google.com/p/google-glog) has levels (in decreasing severity) FATAL, ERROR, WARNING, INFO, 0, 1, 2. By default ERROR and above are logged to stderr, and INFO and above are logged to a file (including ERROR and FATAL).

You generally set the logging level globally (for log files or stderr independently). You can set up logging per module with --vmodule=gfs*=1,mapreduce=2 for instance. These are globbed against the file that they're logged from.

Where the real usefulness comes from is that every Google binary all uses the same logging infrastructure, so you can assume log files are in the same place, they have the same format, and that the same command line flag will enable/disable logging.

By James (trs80) at 2014-04-21 11:45:37:

Splunk (unsurprisingly given their product is all about logging) have some thoughts on logging and best practices.

Written on 21 April 2014.
« A heresy about memorable passwords
The question of language longevity for new languages »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Mon Apr 21 02:39:55 2014
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.