When to use drgn instead of eBPF tools like bpftrace, and vice versa

May 8, 2023

I talked recently about drgn and using it to poke around in the kernel, and yesterday I followed that up with a bpftrace-based example of finding out which NFS client owns a file lock (where I also discussed using drgn for this). As an outsider, you might reasonably wonder when you'd use one and when you'd use the other on the kernel. I won't claim that I have a complete answer, but here's what I know so far.

(Both bpftrace and drgn can do things with user programs too, but I haven't tried either for this.)

The simple version is that bpftrace is for doing things when events happen in the kernel and drgn is for pulling information out of kernel variables and data structures. Bpftrace has a crossover ability to pull some information out of some data structures (that's part of what makes it so useful), but often it's much more limited than drgn.

Bpftrace will let you 'trace' kernel events, including events like function calls, and do various things when they happen, such as extracting information from arguments to the events (including function arguments, as we saw with the NFS locks example). However, bpftrace has only limited support for pretty-printing things, limited access to kernel global variables (today it appears unable to access many module globals), and can't do much with kernel data structures like linked lists or per-cpu variables. Bpftrace will work out of the box on almost any modern Linux kernel in its stock setup; at most you'll need the kernel headers.

One painful example of a bpftrace limitation is that many interesting kernel data structures contain a 'struct path' that can be used to give you the full path to the object involved, such as a file that's locked, a file being accessed over NFS, or an NFS mount point. Bpftrace generally has very limited ability to traverse these path data structures to turn them into the actual path, while drgn has a simple helper for it.

(One reason for this limitation is that the kernel won't allow eBPF bytecode to have unpredictable, potentially unbounded runtime.)
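
To make the runtime issue a bit more concrete, here is a toy model in plain Python of what turning a directory entry into a full path involves. This is not kernel code and not drgn; the `Dentry` class and both functions are made up for illustration, and the real kernel structures also have to deal with mount points, which this ignores. The point is only that the natural version loops for as long as the data says to, while a version with a fixed iteration cap has a predictable worst case (and may have to give up early).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dentry:
    """Toy stand-in for a kernel directory entry (hypothetical, simplified)."""
    name: str
    parent: Optional["Dentry"] = None  # None marks the root in this toy model


def full_path(dentry: Dentry) -> str:
    """Walk parent pointers up to the root; the loop count depends on the data."""
    parts = []
    while dentry.parent is not None:
        parts.append(dentry.name)
        dentry = dentry.parent
    return "/" + "/".join(reversed(parts))


def bounded_path(dentry: Dentry, max_depth: int) -> str:
    """The same walk, but with a hard cap on the number of iterations."""
    parts = []
    for _ in range(max_depth):
        if dentry.parent is None:
            break
        parts.append(dentry.name)
        dentry = dentry.parent
    # If we stopped before reaching the root, mark the result as truncated.
    prefix = "/" if dentry.parent is None else ".../"
    return prefix + "/".join(reversed(parts))


root = Dentry("/")
lockfile = Dentry("lockfile", Dentry("cks", Dentry("homes", root)))
print(full_path(lockfile))        # /homes/cks/lockfile
print(bounded_path(lockfile, 2))  # .../cks/lockfile
```

A tool like drgn runs as an ordinary user-level program, so it is free to use the first form.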

So, for a non-hypothetical example, if you want to get a top-like view of NFS server activity broken down by user or client, you need bpftrace, because you need to 'trace' NFS requests (see the very impressive nfsdtop), even though some aspects of this are rather awkward.

Drgn is great for pretty-printing kernel data structures and extracting relatively arbitrary information from them, both in interactive exploration and in automated programs. However, the data you're interested in mostly needs to be reachable from some kernel global variable, and figuring out how to get from some global variable to the data you want can be an adventure. In addition, drgn requires per-kernel setup on any machine you want to use it on, because it requires kernel debugging information that most distributions don't install by default.
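
As a hedged sketch of what this looks like in practice, here is a small piece of drgn-flavoured Python that finds a process by PID and reports the full path of each file it has open. The function names `format_open_file` and `open_files` are my own inventions. `find_task`, `for_each_file`, and `d_path` are drgn helpers as I understand them, and I believe `find_task` gets to the process by starting from kernel globals, but check all of this against your version of drgn. It assumes drgn and kernel debugging information are installed and that you're running as root.

```python
import os


def format_open_file(fd: int, path: bytes) -> str:
    """Render one (fd, path) pair; plain Python, no drgn needed."""
    return f"{fd:>4} {os.fsdecode(path)}"


def open_files(prog, pid: int):
    """Yield one formatted line per open file of the process with this PID.

    'prog' is assumed to be a drgn.Program attached to the running kernel.
    """
    # Imported here so the file still loads on machines without drgn.
    from drgn.helpers.linux.fs import d_path, for_each_file
    from drgn.helpers.linux.pid import find_task

    task = find_task(prog, pid)  # a NULL 'struct task_struct *' if no such PID
    if not task:
        return
    for fd, file in for_each_file(task):
        # d_path() turns the 'struct path' inside 'struct file' into a path.
        yield format_open_file(fd, d_path(file.f_path.address_of_()))
```

Inside the drgn CLI, where `prog` is already set up for you, something like `for line in open_files(prog, 1): print(line)` should list what PID 1 has open.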

If both bpftrace and drgn can reach the kernel data structures you're interested in, drgn in interactive mode is generally going to be much more convenient for exploring them. It has much better pretty-printing support, it will readily tell you about all of the types involved, and its interactive mode is much faster than repeatedly modifying and re-starting bpftrace programs to print a few more things.

However, if you want to inspect short-lived objects, for example ones that are only passed around as function arguments and are deallocated when the operation is over, you need bpftrace. A short-lived, dynamically allocated object is beyond drgn's feasible reach. As an example, if you want to snoop into the data structures that NFS servers use to represent requests from NFS clients while the requests are being processed, you're going to need bpftrace.

(If you have a hybrid situation where there is a long-lived data structure that isn't reachable from global variables, I suppose you could get bpftrace to print its address as exposed during a function call, then immediately turn to drgn to start dumping memory.)
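
I haven't done this, but a minimal sketch of the drgn side of such a handoff might look like the following. It assumes the address arrives as hexadecimal text (however you got bpftrace to print it) and that you know the C type of the object. `parse_kernel_address` and `object_at` are hypothetical names of mine; `drgn.Object` is the real drgn class, used here to build a typed pointer from a raw value. Nothing here stops the kernel from freeing the object in the meantime, so anything you read through the pointer could be stale.

```python
def parse_kernel_address(text: str) -> int:
    """Turn a hex pointer as printed by a tracer into an int.

    Accepts text with or without a leading '0x' and ignores
    surrounding whitespace.
    """
    addr = int(text.strip(), 16)
    if not 0 <= addr < 2**64:
        raise ValueError(f"not a plausible 64-bit address: {text!r}")
    return addr


def object_at(prog, type_name: str, address_text: str):
    """Build a typed drgn pointer, e.g. object_at(prog, "struct file *", "0x...").

    'prog' is assumed to be a drgn.Program attached to the running kernel.
    """
    # Imported here so the file still loads on machines without drgn.
    from drgn import Object

    return Object(prog, type_name, value=parse_kernel_address(address_text))
```

From there you would explore the result in drgn's interactive mode the same way as any other pointer.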

Comments on this page:

So a crude and somewhat misleading drastically oversimplified pithy summary might be that drgn is for the heap while EBPF is for the stack?

By cks at 2023-05-22 13:09:01:

If I was going to try to put it in that way, I'd say that drgn is for globals while EBPF is for function arguments (I wish I could say 'locals' to make it more pithy, but in practice EBPF doesn't really have access to them).

Ah. I consciously chose “heap” because by “globals” I was only thinking of statically allocated variables (think: data and BSS segments, essentially) – only in writing this down just now did I notice that this is of course incorrect. Nor did I catch that “stack” implied locals as well, so in fact it was not just half my suggestion which was misleading.

I like that yours, while a simplification (as any summary this short must be), manages to avoid this sort of inaccuracy entirely. I just assumed that not to be possible. When it is, that gain in accuracy more than makes up for the loss in pithiness IMO.


Last modified: Mon May 8 23:14:43 2023