Practical issues with getting ZFS on Linux

January 17, 2009

When discussions of ZFS on Linux (for real, as more than a user-level filesystem) come up, the usual issue that gets raised is licensing: Sun's CDDL is incompatible with the kernel's GPL requirement. But Sun could always change that if they wanted to, and I think there's another, more serious problem.

To put it simply, my impression is that the Linux kernel people are generally strongly opposed to what I would call 'code drops', where foreign code is parachuted into the Linux kernel. They want code in the Linux kernel to be real Linux kernel code; in other words, to look like it was actually written for the Linux kernel, in the same style and using the same idioms as other Linux kernel code. They do not want compatibility layers, code in a completely different style from the rest of the kernel, and so on.

(The reasons for this are very sensible; 'foreign' code imposes a maintenance cost on everyone who has to deal with it, and if it is in the Linux kernel that potentially means every kernel developer.)

An approach where the hypothetical ZFS on Linux code was basically the Solaris ZFS code base plus a compatibility layer to provide Solaris kernel APIs and suchlike on Linux would be unlikely to be accepted by the kernel developers; from their perspective, the long term costs imposed by such an approach aren't worth the gains. To get ZFS into Linux, it would almost certainly need to be significantly modified in order to fit into the rest of the Linux kernel code.

(This isn't just a matter of reformatting the code and calling different functions for things like memory allocation. How the Linux kernel likes to do things is almost certainly quite different from how the Solaris kernel works, so the code would probably require significant structural modifications to work the Linux way.)

This has two problems. The lesser one is that it's a lot of work, much of it grindingly picky and uninteresting, that needs to be done by someone with enough Linux kernel experience to write code that fits nicely into the Linux kernel. The bigger one is that such a code divergence between 'Solaris ZFS' and 'Linux ZFS' would make it hard to keep the Linux code up to date with changes in the Solaris code (or to adopt fixes from Linux back into the main code base), which implies a lot of work on an ongoing basis (and creates practical concerns for people thinking of using Linux ZFS).

(The one example of something similar being tried is SGI's work to get XFS into the kernel. In the end I believe that it took years of significant work on SGI's part, and that it did indeed require restructuring how the code worked. I don't know if SGI was able to maintain much commonality between the IRIX XFS code and the Linux XFS code, or if they basically forked once and stayed diverged.)
