Our plan for handling TRIM'ing our ZFS fileserver SSDs

April 1, 2019

The versions of ZFS that we're running on our fileservers (both the old and the new) don't support using TRIM commands on drives in ZFS pools. Support for TRIM has been in FreeBSD ZFS for a while, but it only just landed in the ZFS on Linux development version and it's not in Illumos. Given our general upgrade plans, we're also not likely to get TRIM support through later OS and ZFS upgrades over the likely production lifetime of our current ZFS SSDs. So you might wonder what our plans are for dealing with the performance decrease SSDs can suffer when they think they're completely full, which is what happens if you don't TRIM them or otherwise deallocate blocks every so often.

Honestly, the first part of our plan is to ignore the issue unless we see signs of performance problems. This is not ideal but it is the simplest approach. It's reasonably likely that our ZFS fileservers will be more limited by NFS and networking than by SSD performance, and as far as I understand things, nominally full SSDs mostly suffer from write performance issues, not read performance. Our current view (only somewhat informed by actual data) is that our read volume is significantly higher than our write volume. We certainly aren't currently planning any sort of routine preventative work here, and we wouldn't unless we saw problem signs.

If we do see problem signs and do need to clear SSDs, our plan is to do the obvious brute force thing in a ZFS setup with redundancy. Rather than try to TRIM SSDs in place, we'll entirely spare out a given SSD so that it has no live data on it, and then completely clear it, probably using Linux's blkdiscard. We might do this in place on a production fileserver, or we might take the extra precaution of pulling the SSD out entirely, swapping in a freshly cleared one, and clearing the old SSD on a separate machine. Doing this swap has the twin advantages that we're not risking accidentally clearing the wrong SSD on the fileserver and we don't have to worry about the effects of an extra-long, extra-slow SATA command on the rest of the system and the other drives.
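As a sketch, the in-place version of this would look something like the following (the pool name 'tank' and the device names are hypothetical placeholders, not our real setup):

```sh
# Step 1: replace the target SSD with a spare or fresh SSD, so ZFS
# resilvers all of its data onto the replacement.
zpool replace tank sdX sdY

# Step 2: watch the resilver and wait for it to finish before touching
# the old SSD.
zpool status tank

# Step 3: once the old SSD holds no live data, discard every block on
# it; blkdiscard issues device-wide discard (TRIM) requests.
blkdiscard /dev/sdX
```

The separate-machine variant is the same, except that step 3 happens on another system after physically swapping the drives.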

(This plan, such as it is, is not really new with our current generation Linux fileservers. We've had one OmniOS fileserver that used SSDs for a few special pools, and this was always our plan for dealing with any clear problems from the SSDs slowing down because they were full up. We haven't had to use it, but then we haven't really gone looking for performance problems with its SSDs. They seem to still run fast enough after four or more years, and so far that's good enough for us.)

Comments on this page:

By Michael at 2019-04-02 04:17:34:

I have no idea if you're already doing this, but you can get some data on the relative read/write load on SSDs (at least the ones I'm using, which is a mix of Intel SSDs) through SMART data.

For example, here's what smartctl on Linux has to say about one SSD that's part of a ZFS pool in my workstation:

241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       139155
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       649302

The last number is the raw value of the attribute. In the case of this particular SSD, the reads to writes ratio is 649302:139155, or about 4.7:1, so roughly 1/6 of all I/O to this particular SSD is writes, which given my usage pattern is within the realm of reasonable.
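(The arithmetic above can be reproduced directly from the two raw values; this small sketch assumes the raw SMART numbers are accurate counts, as noted below.)

```python
# Compute the read/write ratio from the raw SMART values quoted above.
lbas_written = 139155  # attribute 241, Total_LBAs_Written (raw value)
lbas_read = 649302     # attribute 242, Total_LBAs_Read (raw value)

ratio = lbas_read / lbas_written
write_fraction = lbas_written / (lbas_read + lbas_written)

print(f"reads:writes is about {ratio:.1f}:1")
print(f"writes are about 1/{1 / write_fraction:.0f} of all I/O")
```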

This does of course depend on the firmware reporting these numbers accurately, but it should be a decent starting point.

By cks at 2019-04-02 12:52:30:

We're gathering SSD read and write volume information, but my feeling is that so far we just haven't been using them long enough and for enough of our filesystems to have good data. Before our new generation we only had a small amount of data on SSDs, and we've only been migrating significant amounts of data to our new generation since January (and we're less than half done so far). I'd like to have several months of usage with more or less all of our data before I start making any general conclusions (at a minimum; a full year would be better).
