DTrace: two mistakes you can make in DTrace scripts
October 14, 2012
There are certain mistakes that you can make when writing DTrace scripts that are worth pointing out, partly because I've made them myself and partly because I've seen them in other people's scripts on the net. To serve as an example of this, suppose that you want to track ZFS read and write volume, so you write the following:
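The script itself did not survive in this copy of the entry. As a stand-in, here is a minimal sketch of the kind of script being described; it is not the original. It assumes the `fbt` provider, the kernel functions `zfs_read` and `zfs_write`, and that their second argument is a `uio_t *` whose `uio_resid` field holds the requested byte count. The aggregation names are made up for illustration.

```dtrace
/* Sketch only: sum the requested size of every ZFS read and write,
   keyed by process name, at the moment the call is made. */
fbt::zfs_read:entry
{
	/* assumption: args[1] is the uio_t * describing the request */
	@zreads[execname] = sum(args[1]->uio_resid);
}

fbt::zfs_write:entry
{
	/* same assumption as above, for the write side */
	@zwrites[execname] = sum(args[1]->uio_resid);
}
```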
This is perfectly good DTrace code and it works fine most of the time, but it has two significant issues. The first is what I'll call the initiated versus completed problem.
Because we are tracing on entry to the ZFS read and write routines, we count IO when it is initiated, not when it completes. When to account for activity is in general a hard problem with no clear generic answers; how you want to do it can depend very much on what questions you want answers to. But my default assumption is to account for things on completion, especially for IO, because that's when a user-level program can go on; it's also clearly counting the past instead of anticipating the future. You may have different answers, but you definitely want to think about the issue. And if you pick 'counted when initiated', you should probably write some sort of comment about why this makes sense in your particular case.

The second problem is that this blithely ignores the possibility of errors of all sorts. Both routines can fail outright or transfer less than was asked for, and in either case counting the requested size overstates the IO that actually happened.

(Note that you don't need to run into actual errors to get incorrect results with reading, because short reads can easily happen; consider issuing a large read when you're almost at the end of a file.)

Unfortunately, knowing what sort of errors are possible and how to compensate for them generally takes reading (Open)Solaris kernel source and probably some experimentation. I have to admit that right now I don't know how to do fully correct IO size accounting for these routines, for example, because I'm not sure how to get the actual amount read or written for short reads and writes. It may be that these are uncommon enough that you can ignore them and just handle clear errors.

(One of the pernicious issues with both of these mistakes is that much of the time they won't matter. If your IO almost never results in errors or short IO, completes rapidly, and you're only looking at IO volume over a long time, you're not going to be affected. I'm lucky (in a sense) that I've been dealing with a number of situations where all of this was not true, so I've been smacked in the nose by this.)
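One way to approach both problems at once, offered as an untested sketch rather than a verified answer: remember the request on entry and do the accounting in the `return` probe, skipping calls that returned an error. This assumes that the kernel decrements `uio_resid` as data is transferred, so that the difference between its value at entry and at return is the amount actually moved; that assumption would need checking against the kernel source. It also ignores the case where an error is returned after some data has already been transferred. The thread-local names `self->uio` and `self->want` are invented for illustration, and only the read side is shown.

```dtrace
fbt::zfs_read:entry
{
	/* remember the request so the return probe can see it */
	self->uio = args[1];
	self->want = args[1]->uio_resid;
}

/* In an fbt return probe, args[1] is the function's return value;
   0 is assumed here to mean success. */
fbt::zfs_read:return
/self->uio != NULL && args[1] == 0/
{
	/* assumption: uio_resid now holds the bytes NOT transferred */
	@zreads[execname] = sum(self->want - self->uio->uio_resid);
}

fbt::zfs_read:return
{
	/* always clear thread-local state, on success or failure */
	self->uio = 0;
	self->want = 0;
}
```

Counting in the `return` probe means IO is attributed to the moment it completed, and subtracting the leftover `uio_resid` is what would make short reads come out right if the assumption above holds.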
This is part of CSpace, and is written by Chris Siebenmann.