An important note about multi-line log message formats

April 25, 2011

I was vaguely planning to write a blog entry using some current stats from our anti-spam system (either 'how many connections were on the Spamhaus Zen' or 'how correlated is the results of the Spamhaus Zen and our commercial filtering software'). Then I went and looked at the format of the logs from our mail system, which it turned out were not set up to make this at all easy.

In our configuration, Exim logs a warning message when a connection from a DNSBL-listed IP address first reaches the RCPT TO: phase of the SMTP conversation. It also logs a line when it accepts a message. Unfortunately, there is nothing to uniquely and simply tie these two lines together, such as a process ID.

Which leads me to the important note:

Any time you log multiple lines for a single thing, you should make it easy to associate all the lines together.

(Also, you should make this happen by default in your logging configuration.)

Generally this means that all of the log messages should include some unique key (for bonus points, have them all include it in the same spot in the line). What is a suitable unique key depends on how your system works; in many situations the process ID is the right key (and besides, sysadmins like to know it in general because other log messages often mention the PID). In some circumstances you may have to invent your own unique key somehow.

A related issue for multi-line logging is that you should also make it clear when a sequence of messages has ended. Sometimes this is already inherent in what you're logging, but sometimes this needs a new 'end of activity' message. This message may seem redundant and pointless, but it has an important purpose; it lets log processing software know that it can now discard all of the tracking information it was keeping for that 'session'. Without an end message, log processing software either has to resort to heuristics for when it can throw away tracking information or allow its memory usage to grow endlessly. (Often software effectively uses both at once by having very conservative heuristics.)

(Exim does have an 'end of processing' log message for individual mail messages.)

Written on 25 April 2011.
« Notes to myself on the priorities of Linux routing policy rules
A quick look at some spam filtering stats from our system »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Mon Apr 25 01:21:22 2011
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.