Another update to the ZFS excessive prefetching situation
This is a quick and, unfortunately, overdue update. The last time
around, I wrote about how we had discovered the
zfs_arc_min ZFS tuning parameter, which let us set a minimum
ARC size so that the ARC would not shrink too much.
Here is an important update: don't do this.
Since we set a minimum ARC size, we have had mysterious but ultimately
consistently reproducible system failures under load. The consistent
way to kill our (then) lead NFS fileserver was that after at least a day
and a half or so of uptime, doing a sequential write of enough data into
a ZFS pool would either cause the system to be unable to
sometimes, lock it up entirely. (We did not get a crash dump; by the time I
had worked out how to do this, we had identified the
setting as the culprit.)
So apparently the ZFS ARC was shrinking and staying shrunk for a good reason after all, and stopping it from doing so can cause serious problems.
Our current decision is to run without ZFS tuning parameters at all, in the hopes that we won't see the excessive prefetching in real life. One reason we're willing to accept this risk is that ZFS prefetching can be turned off (and on) dynamically, so if we do run into the issue we can zap prefetching off until it goes away.
(If we start running into the issue routinely, we can even write a shell script to monitor the ARC size and enable or disable prefetching automatically.)
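
As a rough illustration of that monitoring idea, here is a minimal sketch, written in Python rather than shell. Everything specific in it is an assumption and not something we actually run: it assumes a Solaris-style system where the ARC size can be read with kstat (as zfs:0:arcstats:size) and prefetching is controlled by writing the zfs_prefetch_disable kernel variable through mdb -kw. It also assumes that prefetching is the thing to turn off when the ARC is small. The two thresholds are hypothetical placeholders.

```python
import subprocess
import time

# Assumed Solaris-style interfaces; these names are not from the post itself.
KSTAT_ARC_SIZE = "zfs:0:arcstats:size"      # kstat name for the current ARC size
PREFETCH_TUNABLE = "zfs_prefetch_disable"   # kernel variable: 1 means prefetch off

# Hypothetical thresholds in bytes; pick values that suit the machine.
LOW_WATER = 1 * 1024**3    # below this, turn prefetching off
HIGH_WATER = 2 * 1024**3   # above this, turn prefetching back on


def parse_arc_size(kstat_output: str) -> int:
    """Extract the byte count from `kstat -p` output ("name<TAB>value")."""
    return int(kstat_output.split()[-1])


def want_prefetch_disabled(arc_size: int, currently_disabled: bool) -> bool:
    """Decide the prefetch state, with hysteresis between the two thresholds."""
    if arc_size < LOW_WATER:
        return True
    if arc_size > HIGH_WATER:
        return False
    return currently_disabled  # in between: leave the current state alone


def mdb_command(disable: bool) -> str:
    """Build the mdb write expression that sets the tunable to 1 or 0."""
    return f"{PREFETCH_TUNABLE}/W0t{1 if disable else 0}"


def read_arc_size() -> int:
    """Ask kstat for the current ARC size (needs a system that has kstat)."""
    result = subprocess.run(["kstat", "-p", KSTAT_ARC_SIZE],
                            capture_output=True, text=True, check=True)
    return parse_arc_size(result.stdout)


def set_prefetch_disabled(disable: bool) -> None:
    """Write the tunable into the live kernel via mdb (needs root)."""
    subprocess.run(["mdb", "-kw"], input=mdb_command(disable) + "\n",
                   text=True, check=True)


def monitor(interval_seconds: int = 60) -> None:
    """Poll the ARC size forever and flip prefetching only on a state change."""
    disabled = False  # assumes prefetching starts out enabled
    while True:
        wanted = want_prefetch_disabled(read_arc_size(), disabled)
        if wanted != disabled:
            set_prefetch_disabled(wanted)
            disabled = wanted
        time.sleep(interval_seconds)
```

Calling monitor() as root would start the polling loop. The gap between the two thresholds is there so that an ARC size hovering around a single cutoff does not flip prefetching on and off every interval.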