You should keep your system logs for longer than you probably are

November 30, 2014

One simple thing you can do to improve your life is to make your machines keep their logs for longer than they currently do. Most systems ship with relatively short log retention defaults that basically date from the days when systems had what are now very small disks and sysadmins got really grumpy about logs eating up lots of scarce disk space. Those days are over now for most systems; for example our new servers come with 500 GB HDs as the default. A 500 GB disk will hold really quite a lot of logs. SSDs change this a bit, but even small SSDs these days are in the 64 to 80 GB range and you usually have to work hard to get a system install to use more than a few GB. Even on SSDs we wind up with tens of GB free.

(Of course this goes well with having a central syslog server, because usually you can easily give the central syslog server a lot of disk space to store lots of logs. This isn't true in big environments where you have a lot of log traffic in the aggregate, but most sysadmins are not in such environments.)
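As a rough illustration of how far spare disk space goes, here is a minimal sketch that estimates how many days of logs fit in a given amount of free space. The function name, the 20% safety reserve, and the example figures (30 GB free, 50 MB of logs a day) are all assumptions for illustration, not measurements from any real system.

```python
GB = 10**9
MB = 10**6


def retention_days(free_bytes: int, daily_log_bytes: int, reserve_percent: int = 20) -> int:
    """Return the whole number of days of logs that fit in free_bytes.

    reserve_percent of the free space is held back as a safety margin
    (a hypothetical default, not a recommendation from the article).
    """
    if daily_log_bytes <= 0:
        raise ValueError("daily_log_bytes must be positive")
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in the range 0..99")
    # Integer arithmetic avoids floating point rounding surprises.
    usable = free_bytes * (100 - reserve_percent) // 100
    return usable // daily_log_bytes


if __name__ == "__main__":
    # Hypothetical: a small SSD with 30 GB free and 50 MB of logs per day.
    print(retention_days(30 * GB, 50 * MB))
```

With these made-up numbers the answer is well over a year of logs, before any compression of rotated files.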

The core reason to keep logs for a relatively long time is that you don't always find out right away about things that you want to look into. The longer you keep logs, the further back you can look. The obvious case where this is really important is if you ever experience a system compromise or security problem that you didn't detect immediately. But you may also want to look back to see how frequent something is, or even do long term historical analysis of how things have changed over time.

Having said that, there are some concerns to think about before you do this, chiefly information sensitivity and anti-retention policies. We're lucky enough to be in a situation where neither is a real concern. For information sensitivity, we don't have any really sensitive logs that we have to closely safeguard and we consider all of our machines about as secure as each other.

(Of course I am a big fan of not logging sensitive information you don't need.)

Once you've made the decision to keep your logs for a relatively long time, there are of course a bunch of things you can do to improve the situation even more. The obvious ones are centralizing your logs and setting up a long-term archival system for them so that if need be you can go back really extended periods of time. If you do periodic archival system backups, for example, you can make sure that your log storage is captured in the backups and that you keep enough logs to cover at least the full time interval between those archival backups.
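The rule about covering the interval between archival backups is easy to check mechanically. The following is a minimal sketch under stated assumptions: the function name, the one-week safety margin, and the example intervals are hypothetical and only illustrate the comparison.

```python
from datetime import timedelta


def retention_covers_backups(
    retention: timedelta,
    backup_interval: timedelta,
    margin: timedelta = timedelta(days=7),
) -> bool:
    """Return True if logs are kept at least as long as the gap between
    archival backups plus a safety margin, so that every log entry is
    captured by some backup before it is rotated away.

    The default margin is an assumed value, not taken from the article.
    """
    return retention >= backup_interval + margin


if __name__ == "__main__":
    quarterly = timedelta(days=90)  # hypothetical archival backup interval
    print(retention_covers_backups(timedelta(days=60), quarterly))   # too short
    print(retention_covers_backups(timedelta(days=120), quarterly))  # long enough
```

The margin is there because backups do not always run exactly on schedule; how much slack you want depends on your own environment.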

(This elaborates on a tweet of mine.)


Comments on this page:

yea but. keeping logs longer is not particularly interesting if you have no heavy duty tools to chew through them. splunk is very, very expensive. as far as I can tell, there is nothing even remotely competitive. there are bits and pieces with e.g. hadoop but really, mostly mediocre stuff.

By MikeP at 2014-12-23 14:11:43:

oz, the Cool Kids nowadays seem to be using ELK (Elasticsearch, Logstash, Kibana). There are log aggregators / searchers cheaper than Splunk, although they do tend to still be spendy. Our current approach is QRadar (one such example) for Logs That Matter, and we're standing up an ELK environment for the rest.

