A log message format mistake that I've made

December 5, 2010

As an illustrated example of how not to do log messages, here is a mistake of mine that I recently had my nose rubbed in. Suppose that you have a service that authenticates requests in various ways; sometimes the request has enough information to grant approval right away, but other times you need to do a callback of some sort. Of course you want to log the success or failure of these requests so that when something is going wrong you can rule out the authentication system.

When I was doing this, I wrote two log messages:

checked X result 0 (succeeded)
allowed X without callback check

(The former also had the obvious variants if the callback check failed.)

These are perfectly good messages individually; they are clear and tell you what happened. They even have the X in the same place in the message. The problem with them is that I need to use two grep patterns in order to see all successful request authentications, which is both more annoying and something extra to remember (especially if one of the two variants is much less common than the other).

As much as possible you want people to be able to use a single regular expression (ideally a simple one) to match every variant of something, even if it makes each individual message uglier. Of course, part of the tricky bit of this is deciding what is just a variant of something else and what is sufficiently different that it should have an entirely new message.

It may help to ask yourself how you'd track things down if there seemed to be a problem with the system, but don't be satisfied with handwaving answers; actually write down the commands you'd use and make sure that they're correct, simple, and would get everything that you're interested in.

(Had I done this when I was writing the messages for this program, I could have saved myself some annoyance later on.)

Sidebar: why less-common variants are extra annoying

Unless you work with a particular system and its logs a lot, you generally don't remember all of the messages that it uses. The usual sysadmin approach is to go look back at a chunk of the logs (often the most recent ones) and then base your grep pattern on what you see. This approach can make it easy to miss less-common variants; they may just not have come up in the logs you looked at to work out what sort of messages the system logs.

(Even when you remember that there is another message format variant, now you have to actually find a message using it.)

Written on 05 December 2010.
« Asking users questions never increases security
What performance anomalies mean »

Page tools: View Source, Add Comment.
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Sun Dec 5 01:47:43 2010
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.