Chris's Wiki :: blog/sysadmin/StatsGatheringGoals Commentshttps://utcc.utoronto.ca/~cks/space/blog/sysadmin/StatsGatheringGoals?atomcommentsDWiki2014-04-10T16:27:58ZRecent comments in Chris's Wiki :: blog/sysadmin/StatsGatheringGoals.By Chris Siebenmann on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:ef0a6cd7b20f0fb0df58b23c66469d8396c47269Chris Siebenmann<div class="wikitext"><p>My understanding is that graphite is simply a backend that handles
receiving, storing, and displaying timeseries data, so if you can
generate things that are 'timestamp, metric name, value' you can send
this into graphite. Normally you'll run a single graphite (logical)
instance to accept these timeseries from all of your machines and other
stats generation sources, instead of one per machine. Since this is
close to the fundamental data that PCP deals with, you can presumably
get the raw-ish data out of PCP somehow and feed it to graphite with
sufficient hacks.</p>
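<p>(For illustration, here is a minimal Python sketch of feeding one such data point into graphite over carbon's plaintext protocol, where each line is "metric-name value timestamp"; the host and metric names below are made up:)</p>

```python
import socket
import time

def send_metric(host, port, name, value, timestamp=None):
    """Send a single 'name value timestamp' line to carbon's plaintext port."""
    if timestamp is None:
        timestamp = int(time.time())
    line = "%s %s %d\n" % (name, value, timestamp)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode("ascii"))
    return line

# Hypothetical usage; carbon's plaintext listener defaults to port 2003:
# send_metric("graphite.example.com", 2003, "servers.web1.load", 0.25)
```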
<p>There are actually OmniOS packages for relatively current versions of
collectd, which I was pleased to see. Otherwise you get to install
it from source.</p>
</div>2014-04-10T16:27:58ZBy Josef "Jeff" Sipek on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:6f24d5d86d35ff48ba4627e44496deeefa3500d8Josef "Jeff" Sipekhttp://blahg.josefsipek.net<div class="wikitext"><p>I realize I'm commenting a bit too late. (Since Chris already decided to use graphite.)</p>
<p>PCP is very easy to set up. You just install the packages and you have a base system all set up: <a href="http://blahg.josefsipek.net/test/?p=437">http://blahg.josefsipek.net/test/?p=437</a> & <a href="http://blahg.josefsipek.net/test/?p=438">http://blahg.josefsipek.net/test/?p=438</a></p>
<p>I'm perplexed by the implication that well documented software is harder to set up. Had you not discovered the documentation, would you have thought that PCP was easier to set up?</p>
<p>Yes, I have to admit that feeding in new metrics (that no one has implemented a PMDA for) is more complicated than in graphite - you have to write a PMDA. Nowadays you have a choice of Perl, C, or Python (IIRC the Python support is stable now), and since there are existing PMDAs, examples, and documentation, you can do so very easily. (Once upon a time, I wrote a gpsd PMDA because I wanted to use PCP to log my coordinates during a road trip.)</p>
<p>I don't know how easy it is to install collectd. Well, I'd expect installing it on Linux to be trivial. How about on Solaris 10 or OmniOS? PCP just works on those :)</p>
<p>I think it is a shame that PCP isn't better known. I've used it for long-term system monitoring (5-minute logging interval) as well as for debugging performance issues (1-millisecond logging interval). The logs of course grow faster the more often you log.</p>
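<p>(For reference, the logging interval is set in pmlogger's configuration file; a sketch along these lines, with example metric names, would log a couple of kernel metrics every 5 minutes:)</p>

```
# pmlogger configuration sketch -- metric names are just examples
log mandatory on every 5 minutes {
    kernel.all.load
    kernel.all.cpu.user
}
```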
<p>I'm not really familiar with graphite, but I wonder how hard it'd be to combine the cross-platform logging awesomeness of PCP with the graphing ability of graphite.</p>
</div>2014-04-10T12:51:46ZBy choffee on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:b1249cb4d453236a6adf68613eec6123cf137f23choffeehttp://choffee.co.uk/<div class="wikitext"><p>Hi Chris,</p>
<p>I second the use of graphite and statsd. It's a good central tool for gathering and displaying stats from all sorts of services.</p>
<p>One setup that seems to work pretty well for us is running collectd on the servers, but rather than keeping the stats locally in RRD files, having it send them all back to a central graphite server.</p>
<p>You can then add in metrics from other systems too, even just ad-hoc scripts on a box sending data back to graphite using netcat.</p>
<p>The interface is a little quirky but very powerful, and we use the dashboards quite a bit for storing groups of graphs. There are a load of other dashboards that you can hook up to it, or you can just dump the data that would make up a graph and process it yourself.</p>
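<p>(On the "dump the data" option: graphite-web's render endpoint can return the numbers behind a graph instead of a PNG if you ask for a non-image format; something like the following URL, with a made-up host and metric name:)</p>

```
http://graphite.example.com/render?target=servers.web1.load&from=-1h&format=json
```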
</div>2014-04-09T14:23:23ZBy erlogan on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:3fcef7b11e8894264decca5d149b0717a348318aerlogan<div class="wikitext"><p>The venerable Cacti has been my tool of choice for this type of thing in the past. Cacti probably works best from SNMP queries, but it's fairly straightforward to get it to track and plot arbitrary data generated from a script. I used NRPE to run reporting scripts remotely (since I was already using nagios), but you could just as easily use something like xinetd.</p>
</div>2014-04-08T18:12:26ZBy dozzie on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:a9705ea605f17c2e6d8024034f80a858c9b29d6cdozzie<div class="wikitext"><p>...and how much trouble is it to set up PCP? As I remember, PCP comes with a
scary 160-page manual.</p>
<p>Chris, you could start with Graphite. It shouldn't be difficult to install[*],
it's <a href="http://graphite.wikidot.com/getting-your-data-into-graphite">ridiculously easy</a>
to start collecting data (just open a socket to graphite:2003 and send a line
"foo.bar.baz 10 1396968142\n" (the metric name, its value, and a Unix
timestamp)), and graphite-web gives an easy start for plotting the graphs.</p>
<p>All you need running is:</p>
<ul><li>carbon-cache</li>
<li>graphite-web (quite typical Django application)</li>
<li>some script to collect stats and to send them to carbon-cache</li>
</ul>
<p>[*] I have just spent about 30 minutes backporting <em>graphite-web</em>,
<em>graphite-carbon</em> and <em>python-whisper</em> packages from Debian unstable to Debian
oldstable. I took some drastic shortcuts, like assuming Django 1.2 would make
up for Django 1.6+ (it mostly did) or ignoring the <em>libjs-jquery-flot</em> package
altogether (it bit me in the Graphlot part, but it doesn't seem required just to
use graphite-web), but the whole thing seems to just work. On Ubuntu Saucy it
should be even easier, since all the packages are in the universe repository.</p>
<p>What do you think, Chris? Can you spare half an hour to set up Graphite? You
can even skip running a regular WWW server if you use uWSGI with a config like
this:</p>
<pre>
; graphite-web.ini
[uwsgi]
uid = _graphite
gid = _graphite
plugins = python,http
http = 0.0.0.0:8180
mount = /=/usr/share/graphite-web/graphite.wsgi
</pre>
</div>2014-04-08T15:20:02ZBy Josef "Jeff" Sipek on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:6935f85da850f8b8341c87f4a49a51809be566bcJosef "Jeff" Sipekhttp://blahg.josefsipek.net<div class="wikitext"><p>I have to mention SGI's Performance Co-Pilot again. (I mentioned it in a comment a few months ago.) In your case, I'd set it up to collect data every 5 minutes (or whatever interval you want) and rotate the archives every month. You can log either locally or via TCP. The logs don't lose resolution over time (the way rrdtool's do), so you always have the same resolution. The logs compress rather well, so you could get away with storing years of logs on a compressed ZFS dataset.</p>
<p>I don't know if the graphing front has changed in the past couple of years, but that used to be the worst part of PCP. It has pmchart, which is an interactive GUI app that lets you explore logs, but nothing to generate (pretty) static images from archive data. There have been a number of changes since then that may make dashboard-making easier; I just don't know.</p>
<p>You can easily run it on both Linux and Solaris/OmniOS and get at hundreds of metrics without any effort.</p>
</div>2014-04-08T13:37:05ZBy Anonymous on /blog/sysadmin/StatsGatheringGoalstag:CSpace:blog/sysadmin/StatsGatheringGoals:aba366dea15845f47a1c0fdaff119cc09710b57aAnonymous<div class="wikitext"><p>Have you come across Performance Co-Pilot? It would allow inspecting both historical and live data, and it seems to have some level of NFS support, but I haven't tested the NFS side myself yet.</p>
<p><a href="http://www.performancecopilot.org/">http://www.performancecopilot.org/</a></p>
</div>2014-04-08T06:47:14Z