I find Systemtap oddly frustrating

May 30, 2013

I currently have a ZFS on Linux performance mystery with sequential NFS writes. One of the things that I want to do to diagnose it is to get a trace of NFS client activity so that I can see exactly what is slow and when. In theory I could reconstruct this from sufficient analysis of the TCP stream; in practice I couldn't make Wireshark do this with some brief poking and this seemed like a good time to learn Systemtap (after all, DTrace can definitely do this sort of stuff with effort).

The result has been surprisingly frustrating, especially when I compare it with my DTrace experience. Before DTrace fans start celebrating too much, I think that one reason that DTrace was less frustrating for me is that it so obviously threw me to the wolves very rapidly. DTrace had only a massive manual, and within a very short time of poking around it was apparent that it had nothing to help with NFS activity tracing and that I was going to have to read Solaris kernel source.

Systemtap has a lot more attempts at helpful documentation than DTrace does, but so far none of them have led me to a solution to my problem. I still keep reading, because how can I resist a beginner's guide? After all, I am a Systemtap beginner.

This feeds into the additional frustration that is tapsets. Tapsets are the rough equivalent of DTrace providers, except that DTrace providers are limited, hardcoded into the kernel, and documented. Systemtap tapsets can basically be programmed in Systemtap itself, building interesting advanced capabilities on top of basic ones, and you have the source code. The tantalizing source code that may be most of the documentation you have on what an interesting-looking tapset might be able to do for you.

(Things provided by standard tapsets are documented here.)

There are other, lesser frustrations. I can boil them all down to Systemtap having a lot of nice features that it doesn't bother to carry all the way through (both in the core Systemtap and especially in tapsets). DTrace is limited in comparison but at least it's pretty honest about its limitations.

(All of this is a very personal reaction to Systemtap, born of the annoyance I'm currently feeling every time I try to spend more time on my NFS monitoring project. I'm sure that there are plenty of people who are very happy with SystemTap.)

Written on 30 May 2013.
