What does it mean for a filesystem to perform well on modern hardware?

February 7, 2022

Once upon a time, back in the days of spinning rust, whether or not you were getting good filesystem performance was sort of a straightforward question. Disks were slow and couldn't do very many seeks per second, so you could assess the performance of a filesystem by how close it got you to the raw disk read and write speed for sequential IO, or the raw disk seek limits for random IO. These days 'SSDs' (which is to say SATA and SAS SSDs) and especially NVMe drives have in one sense complicated the question drastically, along two dimensions.

First, modern operating systems can't necessarily even reach the raw performance that modern NVMe drives are capable of, especially through ordinary interfaces. Where they can, it's a relatively recent development, and the internal kernel interfaces are likely not in place for filesystems to drive this sort of performance even in the best case. And the best case may be difficult to get to (for example, it may require large queue depths and large numbers of requests in flight). Serial attached SSDs (SATA and SAS) have lower limits for both bandwidth and IOPS, but even then it may be hard to hit their maximum performance in realistic situations, even with an ideal filesystem.
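To make "requests in flight" concrete, here is a rough sketch that issues random 4 KiB reads against a file with a configurable number of threads, each doing one blocking read at a time. The function name `random_read_iops`, the block size, and the read count are illustrative assumptions, not anything from a real benchmark. It also reads through the page cache, so it is not a measurement of the drive itself; a serious test would use direct IO and a purpose-built tool such as fio.

```python
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # bytes per random read (an assumed, typical small-IO size)


def random_read_iops(path, in_flight, total_reads=2000, seed=0):
    """Do total_reads random BLOCK-sized reads using in_flight threads.

    Returns completed reads per second. Each thread issues one blocking
    read at a time, so in_flight approximates the number of outstanding
    requests.
    """
    size = os.path.getsize(path)
    blocks = max(size // BLOCK, 1)
    rng = random.Random(seed)
    offsets = [rng.randrange(blocks) * BLOCK for _ in range(total_reads)]

    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=in_flight) as pool:
            # os.pread takes an explicit offset, so threads can share one fd.
            results = pool.map(lambda off: os.pread(fd, BLOCK, off), offsets)
            done = sum(1 for _ in results)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return done / elapsed
```

Comparing `random_read_iops(path, 1)` with `random_read_iops(path, 32)` on a large file that is not already cached gives a crude feel for how much a drive depends on concurrency, although Python's own overhead will cap the numbers well below what an NVMe drive can do.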

Second, there is the question of how much performance you actually can use (or need) and the resulting question of how much differences among filesystems matter. Ultimately this is partially a question of Amdahl's law as applied to IO. If the kernel IO time dropped to zero (so that every IO operation was satisfied the moment it was made), there are plenty of programs that would not necessarily get much faster than they already are on most filesystems on most NVMe drives. Sometimes this is because IO is a relatively small portion of the program's operation; sometimes this is because the program is written in such a way that, for example, it does random IO with a single request at a time.
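As a small worked version of the Amdahl's law point, the sketch below computes the overall speedup of a program when only its IO portion gets faster. The function name `overall_speedup` and the example fractions are made up for illustration; real programs would need to be profiled to find their actual IO share.

```python
import math


def overall_speedup(io_fraction, io_speedup):
    """Amdahl's law: overall speedup when only the IO part gets faster.

    io_fraction: share of run time spent waiting on IO (0.0 to 1.0).
    io_speedup:  how many times faster the IO becomes (math.inf = free IO).
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    remaining = (1.0 - io_fraction) + io_fraction / io_speedup
    if remaining == 0.0:
        return math.inf  # the program was all IO and the IO became free
    return 1.0 / remaining


# Hypothetical program that spends 20% of its time in IO: even if IO
# took no time at all, it would only run about 1.25 times faster.
print(overall_speedup(0.20, math.inf))

# Hypothetical IO-bound program (90% IO) with IO made twice as fast.
print(overall_speedup(0.90, 2.0))
```

The first case is the one that matters for the argument here: once IO is a small enough share of a program's run time, the difference between a good filesystem and a perfect one barely shows up.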

(One answer to this is to measure performance with common programs, since what ultimately matters is how the programs you're going to actually use behave. But this raises the question of what good performance for them looks like.)

All else being equal, more performance is always useful, even if it's just potential performance with programs written in just the right way. But all else isn't necessarily equal, since modern filesystems (and operating systems) differ in potentially important ways other than just performance. If you can establish a point where filesystems are "performing well", you can perhaps stop being concerned about just how well. But that leaves the question of how to decide on that point.

(I would like to think that a modern operating system could get more or less the full SATA bandwidth from a single SSD through any decent filesystem for streaming reads. But I haven't actually tried to test that. And I have no idea how that would go with NVMe drives.)
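A minimal sketch of how one might start testing the streaming read case is below. It reads a file sequentially in large chunks and reports the throughput. The name `streaming_read_mb_per_s` and the 1 MiB chunk size are assumptions for illustration, and the result is only meaningful if the file is much larger than RAM or the page cache has been dropped first; otherwise it measures memory, not the drive.

```python
import os
import time


def streaming_read_mb_per_s(path, chunk_size=1024 * 1024):
    """Read path sequentially; return (bytes_read, megabytes_per_second)."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        while True:
            data = os.read(fd, chunk_size)
            if not data:  # an empty result means end of file
                break
            total += len(data)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, total / elapsed / 1e6
```

For comparison, a 6 Gbit/s SATA link tops out at roughly 600 MB/s before protocol overhead, so a result in the neighborhood of 500 MB/s or more from an uncached file would suggest the filesystem is not the bottleneck for this particular workload.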

Comments on this page:

And then there is the fact that SSDs are complex virtualized devices trying to mask the fact that they use NAND flash under the hood, with its complex erase-rewrite cycles and write amplification, and the block device abstraction breaks down under sustained writes. Consumer drives will typically throttle writes fairly quickly; enterprise drives are usually overprovisioned, so their sustained performance is better.

Some modern operating systems like Fedora have even started to turn filesystem compression on by default; see https://fedoramagazine.org/fedora-workstation-34-feature-focus-btrfs-transparent-compression/

I've found the experience to be excellent (I'm actively saving ~100GB on my near-full 500GB SSD from compression) and I imagine the performance may be better in some cases as CPU time is nearly "free" nowadays.

Written on 07 February 2022.


Last modified: Mon Feb 7 23:36:27 2022