Graphs are not enough (for your monitoring system)

September 14, 2011

There are a lot of monitoring systems out there that will accumulate historical data and draw you pretty graphs of it, and certainly it's useful to use one of them. However, these graphs are not enough by themselves and you should not settle for a closed monitoring system that only does graphs.

Graphs are good for getting a quick overview, and looking at them can show you things that you hadn't noticed before. But there are a lot of questions that you cannot answer just from looking at graphs, and many things that are hard or impossible to see on them. Thus, what you really need is for your monitoring system to give you the raw historical data in a documented format so that you can do your own data analysis with whatever stats tools you like. As a bonus, this allows you to graph whatever you want (and on whatever scale you want, and in whatever form of graph you like), instead of having to rely on whatever graphs the monitoring system is willing to give you.

(Note that one reason to use a stats package instead of just reading graphs is that reading (or closely estimating) numbers off graphs is hard. Graphs are designed for overviews and for quickly visualizing things, not for answering questions about specifics. Specifics require either the raw data or direct numbers from doing an analysis of the raw data.)
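
As a rough illustration of the kind of specific question that is awkward to answer from a graph but easy with raw data, here is a minimal sketch in Python. It assumes the monitoring system can export samples as CSV with `timestamp` and `latency_ms` columns; that format, the sample values, and the names `load_samples` and `summarize` are all made up for this example and are not any particular system's export.

```python
import csv
import io
import math
import statistics

# Hypothetical export: one "timestamp,latency_ms" row per sample.
# The format and the numbers are invented for illustration.
SAMPLE_EXPORT = """timestamp,latency_ms
2011-09-14T00:00:00,12
2011-09-14T00:01:00,15
2011-09-14T00:02:00,11
2011-09-14T00:03:00,14
2011-09-14T00:04:00,250
2011-09-14T00:05:00,13
2011-09-14T00:06:00,12
2011-09-14T00:07:00,16
2011-09-14T00:08:00,11
2011-09-14T00:09:00,14
"""


def load_samples(text):
    """Parse the assumed CSV export into a list of float values."""
    reader = csv.DictReader(io.StringIO(text))
    return [float(row["latency_ms"]) for row in reader]


def summarize(values, threshold):
    """Answer a few specific questions that a graph only hints at."""
    ordered = sorted(values)
    # Nearest-rank 95th percentile: the smallest sample with at least
    # 95% of all samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
        "over_threshold": sum(1 for v in ordered if v > threshold),
    }


summary = summarize(load_samples(SAMPLE_EXPORT), threshold=100)
for name, value in summary.items():
    print(f"{name}: {value}")
```

On a graph, the single 250 ms sample is a spike of uncertain height. With the raw numbers you can say exactly how large it was, how many samples exceeded the threshold, and what the median looked like without it.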

You might be tempted to say that the monitoring system should do this analysis for you, and that the lack of a graph you need shows that the monitoring system is incomplete. The problem is that any system will always be incomplete for someone, because different places need different statistics.

(The corollary to this is that the best way to view your monitoring system's graphs is as something that covers the easy or obvious cases, not as a comprehensive solution.)


Comments on this page:

From 120.16.195.212 at 2011-09-14 04:52:53:

Also, your monitoring system shouldn't throw out old data and keep only averages of it. It's Just Not Worth It these days.
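
(To illustrate what averaging old data can cost, here is a minimal sketch in Python. The `consolidate` function and the sample values are invented for this example; it mimics a generic "average every N samples" roll-up and is not the consolidation code of any specific tool.)

```python
def consolidate(values, bucket_size):
    """Average consecutive groups of bucket_size samples, as a roll-up would."""
    averaged = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]
        averaged.append(sum(bucket) / len(bucket))
    return averaged


# Twelve hypothetical one-minute samples with a single large spike.
raw = [10, 10, 10, 10, 10, 500, 10, 10, 10, 10, 10, 10]

# Roll the data up into six-minute averages.
averaged = consolidate(raw, 6)

print("max of raw samples:      ", max(raw))
print("max of averaged samples: ", round(max(averaged), 1))
```

The raw data shows a peak of 500, while the averaged data peaks at roughly 92. Once the raw samples are gone, there is no way to recover how bad the spike actually was.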

From 71.8.245.7 at 2011-09-14 11:28:42:

So you're ruling out any RRD-based logging packages here. What monitoring packages keep all samples like that? The only one I'm aware of offhand is Zabbix.

From 150.101.192.193 at 2011-09-14 18:44:23:

You can tell rrdtool to keep full resolution performance data for as long as you want.

Written on 14 September 2011.

Last modified: Wed Sep 14 02:11:00 2011