I won't be trying out ZFS's new TRIM support for a while

April 5, 2019

ZFS on Linux's development version has just landed support for using TRIM commands on SSDs in order to keep their performance up as you write more data to them and the SSD thinks it's more and more full; you can see the commit here and there's more discussion in the pull request. This is an exciting development in general, and since ZoL 0.8.0 is in the release candidate stage at the moment, this TRIM support might even make its way into a full release in the not too distant future.

Normally, you might expect me to give this a try, as I have with other new things like sequential scrubs. I've tracked the ZoL development tree on my own machines for years basically without problems, and I definitely have fairly old pools on SSDs that could likely benefit from being TRIM'd. However, I haven't so much as touched the new TRIM support and probably won't for some time.

Some projects have a relatively unstable development tree where running it can routinely or periodically destabilize your environment and expose you to bugs. ZFS on Linux is not like this; historically the code that has landed in the development version has been quite stable and problem free. Code in the ZoL tree is almost always less 'in development' and more 'not in a release yet', partly because ZoL has solid development practices along with significant amounts of automated tests. As you can read in the 'how has this been tested?' section of the pull request, the TRIM code has been carefully exercised both through specific new tests and random invocation of TRIM through other tests.

All of this is true, but then there is the small fact that in practice, ZFS encryption is not ready yet despite having been in the ZoL development tree for some time. This isn't because ZFS encryption is bad code (or untested code); it's because ZFS encryption turns out to be complicated and to interact with lots of other things. The TRIM feature is probably less complicated than encryption, but it's not simple and there are plenty of potential corner cases. Life is further complicated by potential issues in how real SSDs do or don't cope with TRIM commands being issued in the way that ZoL will issue them. Also, an errant TRIM operation inherently destroys some of your data, because that's what TRIM does.

All of this makes me feel that TRIM is inherently much more dangerous than the usual ZoL new feature, sufficiently dangerous that I don't feel confident enough to try it. This time around, I'm going to let other people do the experimentation and collect the arrows in their backs. I will probably only start using ZFS TRIM once it's in a released version and a number of people have used it for a while without explosions.

If you feel experimental despite this, I note that according to the current manpage an explicit 'zpool trim' can apparently be limited to a single disk. I would definitely suggest using it that way (on a pool with redundancy); TRIM a single disk, wait for the disk to settle and finish everything, and then scrub your pool to verify that nothing got damaged in your particular setup. This is definitely how I'm going to start with ZFS TRIM, when I eventually do.
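
As a rough sketch of that sequence, here is a small Python helper that only builds and prints the commands in order; it doesn't run anything. The pool name 'tank' and the device 'sdb' are made up for illustration, and the 'zpool status -t' flag for showing per-device TRIM progress is my assumption about what your version supports, so check your own manpage before relying on it.

```python
import shlex


def single_disk_trim_plan(pool, device):
    """Return the zpool commands, in order, for a cautious one-disk TRIM test.

    Nothing is executed here; the caller decides when (and whether) to run
    each step. The pool should have redundancy.
    """
    return [
        # Explicitly TRIM only this one device, not the whole pool.
        ["zpool", "trim", pool, device],
        # Re-run by hand until the device's TRIM is reported as finished
        # (the -t flag is an assumption; check your manpage).
        ["zpool", "status", "-t", pool],
        # Only then scrub, so every block is verified against its checksum.
        ["zpool", "scrub", pool],
        # Re-run by hand until the scrub is done, then look for errors.
        ["zpool", "status", "-v", pool],
    ]


if __name__ == "__main__":
    # 'tank' and 'sdb' are hypothetical names.
    for cmd in single_disk_trim_plan("tank", "sdb"):
        print(shlex.join(cmd))
```

The waiting steps are deliberately left as commands to re-run by hand instead of a polling loop, since the whole point is to go slowly and look at the output yourself.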

(On my work machine, I'm still tracking the ZoL tree so I'm using a version with TRIM available; I'm just not enabling it. On my home machine, for various reasons, I've currently frozen my ZoL version at a point just before TRIM landed, just in case. I have to admit that no longer updating ZoL does make the usual kernel update dance easier, especially since WireGuard has stopped updating so frequently.)

