2014-12-04
Log retention versus log analysis, or really logs versus log analysis
In a comment on my entry on keeping your logs longer, oz wrote:
yea but. keeping logs longer is not particularly interesting if you have no heavy duty tools to chew through them. [...]
Unsurprisingly, I disagree with this.
Certainly in an ideal world we would have good log analysis tools
that we use to process raw logs into monitoring, metrics data, and
other ongoing uses of the raw pieces we're gathering and retaining.
However, ongoing processing of logs is far from the only reason to
have them. Another important use is to go back through the data you
already have in order to answer (new) questions, and this can be
done without having to process the logs through heavy duty tools.
Many questions can be answered with basic Unix tools such as grep
and awk, and these can be very important post-facto ad-hoc
questions.
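As a rough illustration of the kind of ad-hoc question I mean, here is a minimal sketch in Python of what would otherwise be a grep | awk | sort | uniq -c pipeline: counting failed SSH logins per source address. The log line format, the regular expression, the count_failed_sources name, and the sample lines are all assumptions made up for this example; your own logs will need their own pattern.

```python
import re
from collections import Counter

# Hypothetical pattern for an sshd-style "Failed password" line.
# This is an assumption about the log format; adjust it to your own logs.
FAILED = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)")

def count_failed_sources(lines):
    """Count failed-login lines per source address."""
    counts = Counter()
    for line in lines:
        m = FAILED.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Made-up sample lines standing in for a retained raw log file.
sample = [
    "Dec  1 03:12:01 host sshd[811]: Failed password for root from 192.0.2.10 port 4242 ssh2",
    "Dec  1 03:12:05 host sshd[811]: Failed password for invalid user admin from 192.0.2.10 port 4243 ssh2",
    "Dec  1 03:15:40 host sshd[902]: Accepted publickey for cks from 198.51.100.7 port 5151 ssh2",
    "Dec  2 11:01:13 host sshd[1200]: Failed password for root from 203.0.113.5 port 6000 ssh2",
]

# Print addresses from most to least failures.
for addr, n in count_failed_sources(sample).most_common():
    print(n, addr)
```

Since the function only iterates over lines, you can hand it an open file instead of the sample list and it will read the log one line at a time, which is usually good enough even for fairly large raw logs.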
A lack of good tools may limit the sophistication of the questions you can ask (at least with moderate effort) and the volume of questions you can deal with, but it doesn't make logs totally useless. Far from it, in fact. In addition, since logs are the raw starting point, you can always keep logs now and build processing for them later, either on an 'as you have time' or an 'as you have the need' basis. As a result, I feel that this is a 'the perfect is the enemy of the good' situation unless your log volume is so big that you can't just keep raw logs and do anything with them.
(And on modern machines you can get quite far with plain text, Unix tools, and some patience, even with quite large log files.)
If you want the really short version: having information is almost always better than not having information, even if you're not doing anything with it right now.