Wandering Thoughts archives

2012-10-03

DTrace: notes on quantization and aggregates

Following on the same theme as yesterday, here are some things about DTrace's aggregates and quantization that I want to remember. As a bonus I'm going to throw in random bits and pieces about using printa().

(I should mention that all of this is on Solaris 10 update 8. Yes, this is almost fossilized by now. No, we're not interested in upgrading.)

As I mentioned yesterday, I've come around to the view that you should never use the anonymous aggregate @[...]. Using a named aggregate is just as easy and it avoids all sorts of really annoying errors when you later try to reuse @[...] for a second quick aggregation in another section of the script with different keys or quantization. (I've done that.)

The standard D style is to use an aggregate in only one place or for only one thing. However, this is not actually required. As far as I can tell, you can stuff anything in an aggregate provided that it uses the same number and type of keys and the same aggregation function; for lquantize(), the sizing parameters must also match. What this turns aggregates into is a way of grouping things together (or, if you deliberately use different aggregate names, a way of splitting things apart).

(It's useful to know that things like probefunc are considered strings, so literal strings as keys are compatible with them.)
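
A minimal sketch of this grouping, assuming nothing beyond the rules above (the @ops name, the choice of probes, and the "NFS v3 op" label are made up for illustration): two unrelated clauses feed one aggregate, which should be fine because both use a single string key and count().

syscall::read:entry
{ @ops[probefunc] = count(); }      /* key is the string "read" */

fbt::rfs3_*:entry
{ @ops["NFS v3 op"] = count(); }    /* literal string key, same type */

Giving the second clause its own aggregate name instead would split the two sets of counts apart again.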

In particular, if you're aggregating and quantizing times (durations, latencies, whatever) it's perfectly fine to have different things in different time units. You might measure some things in milliseconds, some things in microseconds, and some things in quarter milliseconds for an intermediate resolution. They can all go in the same aggregate, even. You'll want to make sure that the key includes some indication of the (time) unit scale or you'll probably get very confused.
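
As a rough sketch of mixing units in one aggregate (the specific function names rfs3_read and rfs3_getattr and the @lat name are assumptions picked for illustration, and timestamp is in nanoseconds), something like this should work:

fbt::rfs3_read:entry,
fbt::rfs3_getattr:entry
{ self->ts = timestamp; }

fbt::rfs3_read:return
/ self->ts > 0 /
{ @lat[probefunc, "ms"] = quantize((timestamp - self->ts) / 1000000);
  self->ts = 0;
}

fbt::rfs3_getattr:return
/ self->ts > 0 /
{ @lat[probefunc, "us"] = quantize((timestamp - self->ts) / 1000);
  self->ts = 0;
}

Both clauses use two string keys and quantize(), so they are compatible, and the second key says which scale each distribution is in.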

Similarly it's perfectly fine to repeatedly quantize the same data in different ways in different aggregates, so you're taking both a general quantize() overview and then using lquantize() to examine specific sections in detail. The one drawback to doing this the straightforward way is that all of the aggregates will include all of the data. This can make DTrace's ASCII bar graphs kind of useless because your actual data will be crushed by the out-of-range values on either side. Getting around this means recording only selected data in the aggregate, which means using a separate clause so you can make it conditional so it only applies to values inside your range of interest.

To give you an example:

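fbt::rfs3_*:entry
{ self->ts = timestamp;           /* start time, in nanoseconds */
  @counts[probefunc] = count();
}

/* Convert the elapsed time to milliseconds once; the later clauses
   for this same probe reuse this->delta. */
fbt::rfs3_*:return
/ self->ts > 0 /
{ this->delta = (timestamp-self->ts)/1000000; }

/* Detailed linear breakdown, recording only the slow operations. */
fbt::rfs3_*:return
/ self->ts > 0 && this->delta >= 100 /
{ @slow[probefunc, "ms"] = lquantize(this->delta, 100, 500, 50); }

/* Overall power-of-two distribution of everything. */
fbt::rfs3_*:return
/ self->ts > 0 /
{ @dist[probefunc, "ms"] = quantize(this->delta);
  self->ts = 0;
}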

(I'm using what I suspect is an obscure feature of this-> variables to save some space and verbosity in this example: they keep their values across successive clauses for the same probe firing. It's not necessarily good style. Also notice how I've embedded the time unit in the aggregation key so that it'll normally show up when the aggregate is printed.)

Here we're both getting a general distribution over all NFS v3 server operations (more or less) and also getting a look at roughly how those between 100 msec and 500 msec break down. If we included all data in the lquantize() version it would be very hard to see any pattern in the 100-500 msec breakdown because almost all of the data points would be under 100 msec (well, we hope).
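
If the values at or above 500 msec also turn out to crowd the graph, the same approach can bound the range on both sides. This is only a sketch of that variation (the @mid name and the upper bound in the predicate are additions for illustration, not part of the script above):

fbt::rfs3_*:entry
{ self->ts = timestamp; }

fbt::rfs3_*:return
/ self->ts > 0 /
{ this->delta = (timestamp - self->ts) / 1000000; }

/* Record only values inside the lquantize() range itself. */
fbt::rfs3_*:return
/ self->ts > 0 && this->delta >= 100 && this->delta < 500 /
{ @mid[probefunc, "ms"] = lquantize(this->delta, 100, 500, 50); }

fbt::rfs3_*:return
/ self->ts > 0 /
{ self->ts = 0; }

With this predicate the underflow and overflow buckets of the lquantize() should stay empty.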

(PS: that this actually reports results on our systems is one of the reasons that I am now digging into this stuff.)

Finally, even with the command line switch or pragma to quiet dtrace down, every aggregation with data will be printed, either with a printa() by you or by DTrace itself if you don't print it explicitly. You can probably get out of this with an END clause that uses trunc() to empty the aggregations that you don't want to be displayed (clear() only zeroes the values and keeps the keys). I believe that individual entries in an aggregate are printed in order of increasing value. This gets a little odd for quantizations; there I suspect but don't know for sure that the order is determined by something like the sum of all values submitted to the quantization. It's not particularly useful in any case.
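
A sketch of suppressing an unwanted aggregate at exit, assuming that an emptied aggregate produces no output (the @calls and @scratch names and the syscall probes are hypothetical). It uses trunc() with no count, which removes every key, rather than clear(), which as far as I know only zeroes the values:

#pragma D option quiet

syscall:::entry
{ @calls[probefunc] = count();
  @scratch[execname] = count();
}

END
{ trunc(@scratch);                   /* drop everything in @scratch */
  printa("%-20s %@10d\n", @calls);   /* printed explicitly, only once */
}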

Sidebar: a bit on printa()

The basic version of printa() with a format string is:

END
{ printf("\n");
  printa("function %-20s  %@10d\n", @counts);
  printf("\n");
  printa("function %s(), time in %s:%@d\n", @dist);
  printf("\n");
  printa("function %s(), time in %s for >= 100 ms:%@d\n", @slow);
}

The %@... bit is the format of the aggregate, and can occur anywhere in the format string; in practice the only useful thing to do to it is to give a field width (as this does for @counts). The other % formatting is for each element of the key, which are used in order but don't have to be completely used up.

The printf("\n")s are normally necessary because printa() without a formatting string adds an extra newline before and after printing the entire aggregate. With a format string, it doesn't. The newline in the format strings simply separates individual keys from the aggregate.

(This behavior of printa() is inconsistent, because if you just let DTrace dump the aggregates out it doesn't add the newline.)

solaris/DTraceQuantizationNotes written at 01:40:20