ZFS is not a universal filesystem that is always good for all workloads

August 30, 2019

Every so often, people show up on various ZFS mailing lists with problems where ZFS is performing not just a bit worse than other filesystems or the raw disks, but a lot worse. Often, although not always, these people are using raidz on hard disks and trying to do random IO, which doesn't work very well because of various ZFS design decisions. When this happens, whatever their configuration and workload, the people who are trying out ZFS are surprised, and this surprise is reasonable. Most filesystems today are generally good and have relatively flat performance characteristics; you can't make them perform really badly unless you have a very unusual and demanding workload.

Unfortunately, ZFS is not like this today. For all that I like it a lot, I have to accept the reality that ZFS is not a universal filesystem that works fine in all reasonable configurations and under all reasonable workloads. ZFS usually works great for many real-world workloads (ours included), but there are perfectly reasonable setups where it will fall down, especially if you're using hard drives instead of SSDs. Raidz is merely an unusually catastrophic case (and an unusually common one, partly because no one expects RAID-5/6 to have that kind of drawback).
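
To see why raidz on hard disks is such a bad case for random reads: ZFS stripes each full-sized logical block across all of the data disks in a raidz vdev, so a random read of one block has to wait on every data disk, and the vdev as a whole delivers roughly the random-read IOPS of a single disk. Mirror vdevs, by contrast, can serve different reads from different disks at the same time. Here is a back-of-envelope sketch of the gap; the per-disk IOPS figure and the disk counts are illustrative assumptions, not measurements:

    # Back-of-envelope model of random-read IOPS for raidz versus mirrors.
    # All numbers here are illustrative assumptions, not measurements.

    HD_RANDOM_READ_IOPS = 150   # roughly what a 7200 RPM hard disk manages

    def raidz_vdev_read_iops():
        # A full-sized block is striped across all the data disks of a
        # raidz vdev, so reading it involves every data disk and the
        # vdev delivers about one disk's worth of random-read IOPS.
        return HD_RANDOM_READ_IOPS

    def mirror_pool_read_iops(mirror_vdevs, disks_per_mirror=2):
        # Every disk in every mirror vdev can serve reads independently,
        # and ZFS spreads reads across all the vdevs in a pool.
        return mirror_vdevs * disks_per_mirror * HD_RANDOM_READ_IOPS

    # Twelve hard disks either way:
    print(raidz_vdev_read_iops())       # one 12-disk raidz2 vdev: ~150
    print(mirror_pool_read_iops(6))     # six 2-way mirrors: ~1800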

(Many of the issues that cause ZFS problems are baked into its fundamental design, but as storage gets faster and faster, their effects are likely to diminish a lot for most systems. There is a difference between 10,000 IOPS and 100,000, but it may not matter as much as the difference between 100 and 1,000. And not all of the issues are about performance; there is also, for example, the fact that there's no great solution to shrinking a ZFS pool. In some environments that will matter a lot.)

People sometimes agonize about this and devote a lot of effort to pushing water uphill. It's a natural reaction, especially among fans of ZFS (which includes me), but I've come to think that it's better to quickly identify situations where ZFS is not a good fit and recommend that people move to another filesystem and storage system. Sometimes we can make ZFS fit better with some tuning, but I'm not convinced that even that is a good idea; tuning is often fragile, partly because it tends to be specific to your current workload. Sometimes the advantages of ZFS are worth going through the hassle and risk of tuning things like ZFS's recordsize, but not always.
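
As a concrete illustration of why recordsize tuning matters and why it's workload-specific: with the default 128 KB recordsize, a small random write into a record that isn't cached in the ARC forces ZFS to read the whole record, modify it, and write the whole new record back out. A rough sketch of the resulting IO amplification, as an assumed model that ignores metadata, parity, compression, and caching:

    # Rough model of the data IO that one small random write costs ZFS.
    # Purely an assumed illustration; it ignores metadata, parity,
    # compression, and whatever happens to already be cached in the ARC.

    def io_per_write_kb(recordsize_kb, write_kb):
        if write_kb >= recordsize_kb:
            # The whole record is overwritten; ZFS is copy-on-write, so
            # no read is needed and we just write the new record.
            return recordsize_kb
        # An uncached partial-record write is a read-modify-write of
        # the entire record.
        return recordsize_kb + recordsize_kb

    def amplification(recordsize_kb, write_kb):
        return io_per_write_kb(recordsize_kb, write_kb) / write_kb

    # 8 KB database pages against the default 128 KB recordsize:
    print(amplification(128, 8))   # 32.0x the IO you asked for
    # ... versus a recordsize tuned down to match the page size:
    print(amplification(8, 8))     # 1.0x

(And since recordsize only affects data written after you change it, getting this wrong and fixing it later is itself a hassle, which is part of the fragility.)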

(Having to tune has all sorts of operational impacts, especially since some things can only be tuned on a per-filesystem or even per-pool basis.)

PS: The obvious question is what ZFS is and isn't good for, and I don't have nice convenient answers to that. I know some pain points, such as raidz on hard disks with random IO and the lack of pool shrinking, and you can spot others by looking for 'you should tune ZFS if you're doing <X>' advice, but that's not a complete set. And of course some of today's issues are simply problems with current implementations and will get better over time. Anything involving memory usage is probably one of them, for obvious reasons.
