2018-02-15
DTrace being GPL (and thrown into a Linux kernel) is just the start
The exciting news of the recent time interval comes from Mark J. Wielaard's dtrace for linux; Oracle does the right thing. To summarize the big news, I'll just quote from the Oracle kernel commit message:
This changeset integrates DTrace module sources into the main kernel source tree under the GPLv2 license. [...]
This is exciting news and I don't want to rain on anyone's parade, but it's pretty unlikely that we're going to see DTrace in the Linux kernel any time soon (either the kernel.org main tree or in distribution versions). DTrace being GPL compatible is just the minimum prerequisite for it to ever be in the main kernel, and Oracle putting it in their kernel only helps move things forward so much.
The first problem is simply the issue of integrating foreign code originally written for another Unix into the Linux kernel. For excellent reasons, the Linux kernel people have historically been opposed to what I've called 'code drops', where foreign code is simply parachuted into the kernel more or less intact with some sort of compatibility layer or set of shims. Getting them to accept DTrace is very likely to require modifying DTrace to be real native Linux kernel code that does things in the Linux kernel way and so on. This is a bunch of work, which means that it requires people who are interested in doing the work (and who can navigate the politics of doing so).
(I wrote more on this general issue when I talked about practical issues with getting ZFS into the main Linux kernel many years ago.)
Oracle could do this work, and it's certainly a good sign that they've at least got DTrace running in their own kernel. But since it is their own vendor kernel, Oracle may have just done a code drop instead of a real port into the kernel. Even if they've tried to do a port, similar efforts in the past (most prominently with XFS) took a fairly long time and a significant amount of work before the code passed muster with the Linux kernel community and was accepted into the main kernel.
A larger issue is whether DTrace would even be accepted in any form. At this point the Linux kernel has a number of tracing systems, so the addition of yet another one with yet another set of hooks and so on might not be viewed with particularly great enthusiasm by the Linux kernel people. Their entirely sensible answer to 'we want to use DTrace' might be 'use your time and energy to improve existing facilities and then implement the DTrace user level commands on top of them'. If Oracle followed through on this, we would effectively still get DTrace in the end (I don't care how it works inside the kernel if it works), but this also might cause Oracle to not bother trying to upstream DTrace. From Oracle's perspective, putting a relatively clean and maintainable patchset into their vendor kernel is quite possibly good enough.
(It's also possible that this is the right answer at a technical level. The Linux kernel probably doesn't need three or four different tracing systems that mostly duplicate each others work, or even two systems that do. Reimplementing the DTrace language and tools on top of, say, kprobes and eBPF would not be as cool as porting DTrace into the kernel, but it might be better.)
Given all of the things in the way of DTrace being in the main kernel, getting it included is unlikely to be a fast process (if it does happen). Oracle is probably more familiar with how to work with the main Linux kernel community than SGI was with XFS, but I would still be amazed if getting DTrace into the Linux kernel took less than a year. Then it would take more time before that kernel started making it into Linux distributions (and before distributions started enabling DTrace and shipping DTrace tools). So even if it happens, I don't expect to be able to use DTrace on Linux for at least the next few years.
(Ironically the fastest way to be able to 'use DTrace' would be for someone to create a version of the language and tools that sat on top of existing Linux kernel tracing stuff. Shipping new user-level programs is fast, and you can always build them yourself.)
PS: To be explicit, I would love to be able to use the DTrace language or something like it to write Linux tracing stuff. I may have had my issues with D, but as far as I can tell it's still a far more casually usable environment for this stuff than anything Linux currently has (although Linux is ahead in some ways, since it's easier to do sophisticated user-level processing of kernel tracing results).