2023-06-11
The potential risks of using (Open)ZFS On Linux with at least NFS
I've written in the past about how we've had problems with Maildir format mail storage and how making /var/mail local to our IMAP server was a significant improvement. A common element to both of these issues is that our NFS fileservers use (Open)ZFS On Linux. Over time, I've come to feel that this represents a potential risk factor for our environment.
Now, on the one hand we might have seen these issues with NFS regardless of the underlying filesystem on the fileservers, even with a well supported filesystem like ext4. On the other hand, we've seen other NFS performance oddities with our NFS fileservers, and ZFS is an unusual 'filesystem' that may interact with NFS IO in odd ways. Unlike many filesystems, ZFS has large scale structures that are used to aggregate IO (in the form of ZFS pools), and it doesn't really present them to the rest of the kernel in any legible way. And my guess is that NFS serving with ZFS On Linux (ZoL) is less common than other uses of ZoL (partly because NFS is getting less common in general).
With a conventional NFS server filesystem stack, such as ext4 on LVM on software RAID, everything is in the kernel and you can ask kernel people for help, report issues you see, and so on. If something is going wrong that creates sub-par performance, the kernel people will probably want to fix it. But (Open)ZFS On Linux is outside the kernel, so Linux kernel people have little reason to particularly help out, and ZoL people may not have the capabilities to dig into the kernel NFS and disk IO stacks to understand what's going on (it's a bit out of scope). Even if a problem can be identified, there may not be any good fix. One reason for this is that the actual code of ZFS On Linux is mostly Solaris/Illumos code, which creates a mismatch between the kernel and ZFS (one of the areas where this is still quite visible is memory issues with ZFS's ARC).
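As a small illustration of the ARC being its own world, ZFS On Linux reports ARC state through its own kstat file, /proc/spl/kstat/zfs/arcstats, rather than through the usual kernel memory accounting. The following is a minimal sketch of reading it, not anything we actually run. The function names are made up for this example, and it assumes the usual arcstats layout of two header lines followed by 'name type data' rows, with 'size' and 'c_max' among the names.

```python
from pathlib import Path

# Where ZFS On Linux exposes its ARC statistics.
ARCSTATS_PATH = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Turn arcstats text into a dict of name -> integer value.

    Assumes data rows look like 'name type data'. The kstat header
    line and the column header line are skipped because they do not
    have exactly three fields ending in an unsigned number.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[2].isdigit():
            continue
        stats[fields[0]] = int(fields[2])
    return stats


def arc_fill_ratio(stats):
    """Return current ARC size as a fraction of its maximum target.

    'size' and 'c_max' are assumed to be present; returns None if
    either is missing or c_max is zero.
    """
    size = stats.get("size")
    c_max = stats.get("c_max")
    if size is None or not c_max:
        return None
    return size / c_max


if __name__ == "__main__":
    if ARCSTATS_PATH.exists():
        stats = parse_arcstats(ARCSTATS_PATH.read_text())
        ratio = arc_fill_ratio(stats)
        if ratio is not None:
            print(f"ARC size is {ratio:.1%} of c_max")
    else:
        print("no arcstats file here (ZFS module not loaded?)")
```

None of this tells you why the ARC is the size it is or how it's reacting to kernel memory pressure, which is more or less the point. You have to go to ZFS's own statistics to even see it.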
This sort of thing is probably not a big risk for most people. ZFS On Linux is highly likely to always be functional and quite likely to always perform well in ordinary circumstances, since the first is absolutely necessary and the second is quite popular. Our issues are performance issues under what appears to be significant load. Most people don't push their systems that hard (I don't on my desktops, where I use ZFS On Linux without particular performance issues).
Even with this risk, ZFS On Linux is more than worth it for us in our environment. We get various sorts of benefits from using ZFS that would be hard to replicate with any other setup, and the performance we get is good enough. Everything is a tradeoff. But the risk is something I want to be honest about. It's also something I want to keep in mind if we see performance oddities in the future, or are planning something that needs high IO performance.