Wandering Thoughts archives

2012-06-29

More about my issues with DTrace's language

In his comment on my entry about why we haven't taken to DTrace, Brendan Gregg wrote in part:

It's been mentioned a few times, but I suspect it would be possible to create a higher level language ("D++") that speaks to libdtrace. An advantage of D being low level is that the user is conscious of how the system is actually getting traced, in the same way that C is low level. [...]

I don't think that this is the case. In fact I think it works the other way around: I doubt very much that people can go from D's limitations to any real understanding of how the system is traced, but if you know how DTrace is implemented you can see the bones of this implementation underneath some of D's oddities.

To start with, I will agree that making some things clear is useful and even important. For example, I think that access to kernel data and variables should look different than access to user level data (and in a way that makes access to user level data look more expensive). What I object to is things that D makes pointlessly difficult, things where it doesn't support the obvious simple way of doing whatever and forces you to be indirect. The shining example of this is conditionals. D does not have any form of an if statement that you can use in the action that fires for a particular probe; however, probes themselves can be conditional, based on an expression. So you're left to fake an if by writing your entire probe action twice (and yes, I've done this in DTrace code).

The story I remember hearing about why this limitation exists is that the DTrace implementation doesn't want to be dynamically allocating output buffer space as a probe action executes; it wants to allocate the space once, before the probe's action starts. Well, fine, but if this is the reason, you can deal with it in if-using D code by allocating the maximum amount of space the code might need if it followed the most pessimistic, space-consuming path through the conditionals. Alternately, you could transform ifs by automatically creating multiple specific probes with probe conditions. Forcing DTrace users to duplicate their code in order to do this by hand is perverse, or at least an excessive focus on being literal about how DTrace's internals behave.
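To make the second option concrete, here is a minimal Python sketch of such a rewrite. Everything in it is an assumption made for illustration: the `If` node, the `expand` function, and the example probe body are hypothetical, and none of this is how DTrace or libdtrace actually works. It takes a single if/else inside a probe action and emits the two predicated clauses that you currently have to write by hand.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class If:
    """A hypothetical if/else node; real D has no such statement."""
    cond: str                                   # D expression used as the predicate
    then: List[str]                             # D statements for the true branch
    orelse: List[str] = field(default_factory=list)  # D statements for the false branch


def expand(probe: str, stmt: If) -> str:
    """Emit one predicated clause per non-empty branch of a single if/else."""

    def clause(predicate: str, body: List[str]) -> str:
        lines = [probe, f"/{predicate}/", "{"]
        lines += [f"    {s}" for s in body]
        lines.append("}")
        return "\n".join(lines)

    parts = [clause(stmt.cond, stmt.then)]
    if stmt.orelse:
        # The else branch becomes a second clause with the negated predicate.
        parts.append(clause(f"!({stmt.cond})", stmt.orelse))
    return "\n\n".join(parts)


# Illustrative input: count reads by whether the requested size is over 4 KB.
example = If(
    cond="arg2 > 4096",
    then=['@reads["large"] = count();'],
    orelse=['@reads["small"] = count();'],
)
generated = expand("syscall::read:entry", example)
print(generated)
```

This naive version is only correct if the first branch doesn't change anything the condition reads. As far as I understand it, clauses on the same probe run in order, so the second predicate would see whatever the first clause modified; a real implementation would have to evaluate the condition once and stash the result somewhere.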

(You can argue that it saves users from themselves under some circumstances, for example if a rare condition requires a bunch more buffer space than the common ones. But this is an optimization and generally a premature one.)
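The pessimistic reservation is also easy to sketch. The Python below is an illustration under my own assumptions, not DTrace's actual accounting: `Record`, `Branch`, `worst_case_bytes`, and the byte sizes are all made up. Straight-line recording actions add up, and a conditional costs as much as its more expensive branch.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Record:
    """A recording action with a fixed output size in bytes (sizes are invented)."""
    nbytes: int


@dataclass
class Branch:
    """A hypothetical if/else inside a probe action."""
    then: List["Node"]
    orelse: List["Node"] = field(default_factory=list)


Node = Union[Record, Branch]


def worst_case_bytes(body: List[Node]) -> int:
    """Bytes to reserve up front so that any path through `body` fits."""
    total = 0
    for node in body:
        if isinstance(node, Record):
            total += node.nbytes
        else:
            # Only one branch runs, so reserve for the larger of the two.
            total += max(worst_case_bytes(node.then),
                         worst_case_bytes(node.orelse))
    return total


# One 8-byte record, then a common path (8 bytes) or a rare path (256 + 8 bytes).
action = [
    Record(8),
    Branch(then=[Record(8)], orelse=[Record(256), Record(8)]),
]
reserve = worst_case_bytes(action)
print(reserve)  # 272
```

Here the common path would only ever use 16 bytes but every firing reserves 272, which is exactly the cost that the save-users-from-themselves argument is worried about.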

Now, this story is clearly not the complete explanation, given that DTrace has plenty of things that certainly look like they create variable-sized output (including an outright ternary ?: operator). This pretty much illustrates my point, in that running into this D constraint hasn't made me any better informed than before about how the system is actually traced. It's still a black box; it's just a more frustrating black box.

DTraceLanguageCriticism written at 01:52:20

2012-06-22

The effects of DTrace's problems

A while back, Brendan Gregg left a comment on my entry about why we haven't taken to DTrace. In one part, he asked:

I was a little confused at first about the language and documentation issues [with DTrace]. Usually language discussions [...] are intended to pick one over another, but in this case there is no other option to choose from. So if these problems are raising the barrier to entry for some people, and they aren't entering, then what are they doing? Leaving system problems unsolved?

There are two answers to this.

The first answer is that DTrace is two things at once; it is both a way of diagnosing problems on Solaris and a potential way of attracting people to Solaris (and all its variants), to continue to use it, and to use it for more things. Let us focus on the latter thing for the moment. When and where DTrace is hard to use, it becomes less attractive; at the limit, if you feel that you can't really use DTrace for anything, it ceases to be an advantage for Solaris at all. If you want Solaris to succeed as an OS, this should matter.

(I think that the theoretical advantages of having DTrace may have been oversold in general. As a pragmatic matter I think that most people don't expect to have system problems (they expect the system to just work), so I suspect that they drastically discount the availability of good diagnosis tools because they expect to not need any. People who know that they are running at the ragged edge of performance will have a different opinion, but many people are not in this situation.)

The second answer, put simply, is yes; sysadmins are leaving system problems unsolved because DTrace is too hard to use. Not the big crippling problems, of course, because those are the problems you have no choice about solving. But smaller problems, the little glitches that happen sometimes or the relatively low impact performance degradations? Yes, some of them are going unsolved. Also going unsolved are the problems that people don't even know they have because they've never looked, the ones where people have no idea that something is actually wrong and their system could work better with some changes. DTrace being hard(er) to use is especially damaging to the latter because of course if you don't think you have a problem, the cost to benefit ratio of looking into your system appears infinite.

(I've argued that this is not actually the case, but I think it's at least a very hard thing to sell. Especially to overworked sysadmins with other issues to tackle when you are asking them to invest a significant chunk of time.)

DTraceProblemEffects written at 00:00:52
