Some notes on adding exposed statistics to a (Go) program
As a slow little project for the past while, I have been adding some accessible statistics to my sinkhole SMTP server, using Go's expvar package. This has resulted in me learning lessons both about expvar in particular and about the process of adding statistics in general.
My big learning experience is going to sound fairly obvious and trite: I only really figured out what statistics I wanted to expose through experimentation. I started out with the idea that counting some obvious things would be interesting (and to a certain extent they were), but I created many of my stats by looking at the current set and realizing that there was information I wanted to know, or questions I wanted answered, that were not covered by the things I was already exposing. Sometimes trying to use the initial version of a statistic showed me that it was too broad or needed some additional information in order to be useful.
The corollary to this is that what statistics you'll want depends in large part on what questions are interesting and informative for you, which depends on how you're using the program. A lot of my stats are focused on anti-spam related issues, because that's how I'm using my sinkhole SMTP server. Someone using it to collect email from a collection of nodes and tests might well want a significantly different set of statistics. This does make adding stats to a theoretically general program a somewhat tricky thing; I have no good answers to this currently.
(I have not tried to be particularly general in my current set of stats. Since this has been an experiment to play around with the idea, I've focused on making them interesting to me.)
Just exporting statistics from a program is less general than pushing events and metrics into a full time series based metrics system, but the expvar package and a few other tools like jq make it much easier to do the former (for a start, I don't need a metrics system). Exporting statistics is also not as comprehensive as having an event log or the like. Since I do sort of have an event log, I've chosen to view my expvar stats as being an on-demand summary of it, one that I can look at without having to actively parse the log to count things up.
And on another obvious note, putting counters and so on in a hierarchical namespace is quite helpful for keeping things comprehensible and organized. To some extent a good hierarchy can make up for not being able to come up with great names for individual statistics. And sometimes you have data with unpredictable names that has to be confined to a namespace.
(For instance, I track DNS blocklist hit counts. The names of DNS blocklists are essentially arbitrary, so I put the whole set of stats into a dnsbl_hits namespace. And because the expvar package automatically publishes some general Go stats on things like your program's memory usage, I put all of my stats under a top-level name so it's easy to pick them out.)