One reason why having metrics is important

November 29, 2012

Put simply, one reason that having metrics is important is that they give you a backup check on changes that you think are harmless.

In many environments you can't exhaustively check the effects of every change that you make. Doing performance checks all of the time is both too time consuming and too much like monkey work (and people tune out of monkey work). Sooner or later you'll start deciding that some changes are safe enough that you can skip some or all of your checks, or you'll just let the checks slip because something else seems more urgent right now, but you'll get to them later (honest).

(The more work your checks are to do and the more genuinely harmless changes you make, the sooner this happens. Humans have a quite strong drive to avoid useless and pointless work.)

The advantage of automatic, constant collection of metrics is that all of this is handled for you, whether or not you think you need it and whether or not you remember (and can be bothered). It even happens without you having to do anything. This is in a sense not as good as explicit performance checks (which may give you more information and which you're going to look at right away, not maybe later), but it's a lot better than nothing, and often that is the real choice.
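To make this concrete, here is a minimal sketch of the idea in Python. Everything in it is an illustrative assumption and not any particular metrics system: the function names, the in-memory list of samples, and the 20% tolerance are all made up for the example, and it assumes a metric where higher is worse (such as latency or load).

```python
import time
from statistics import mean


def collect(samples, read_metric, now=time.time):
    """Append one (timestamp, value) sample.

    read_metric is a hypothetical callable that returns the current
    value of whatever you care about (latency, load, queue length).
    """
    samples.append((now(), read_metric()))


def split_at_change(samples, change_time):
    """Split samples into values seen before and after a change."""
    before = [v for t, v in samples if t < change_time]
    after = [v for t, v in samples if t >= change_time]
    return before, after


def looks_worse(samples, change_time, tolerance=0.20):
    """Return True if the average after the change exceeds the
    average before it by more than the (assumed) tolerance."""
    before, after = split_at_change(samples, change_time)
    if not before or not after:
        return False  # not enough data to say anything either way
    return mean(after) > mean(before) * (1 + tolerance)
```

The point is that collect() gets run all the time by something like cron or a small daemon, whether or not anyone thinks a given change needs checking. Then looks_worse() (or simply looking at a graph) can be applied after the fact to any change, including the ones you were sure were harmless.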

I know that I'm late to the party on this, but sometimes it takes a while for things to sink through my skull.

(Perry Lorier made effectively this point in a comment on yesterday's entry.)

As an obvious side note: this is of course closely related to the benefits of automatic (and fast-running) unit and other tests for programmers. In both cases we're trying to make something automatic and cheap instead of manual and expensive so that it will get done all the time no matter what instead of being at the mercy of people's whims. What sysadmins do is less amenable to unit tests but more amenable to constant live monitoring, so we can get the same effects (especially if it's combined with alerting, as noted by Perry Lorier).

