The paradox of ZFS ARC non-growth and ARC hit rates

May 13, 2023

We have one ZFS fileserver that sometimes spends quite a while (many hours) with a shrunken ARC size, often tens of gigabytes below its (shrunken) ARC target size. Despite that, its ARC hit rate is still really high. Well, actually, that's not surprising; that's kind of a paradox of ARC growth (for both actual size and target size). This is because of the combination of two obvious things: the ARC only grows when it needs to, and a high ARC hit rate means that the ARC isn't seeing much need to grow. More specifically, for reads the ARC only grows when there is a read ARC miss. If your ARC target size is 90 GB, your current ARC size is 40 GB, and your ARC hit rate is 100%, it doesn't matter that you have 50 GB of spare RAM, because the ARC has pretty much nothing to put in it.
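To make the "only grows on a miss" point concrete, here is a toy model of a read cache. It is only a sketch and not how the ARC is actually implemented: the function name simulate_cache, the unit-sized blocks, and the arbitrary eviction are all assumptions made for illustration.

```python
def simulate_cache(initial_size, target_size, accesses):
    """Toy read cache that only grows when a read misses.

    Blocks 0..initial_size-1 start out cached. This is an illustrative
    model, not the real ARC algorithm.
    """
    cached = set(range(initial_size))
    hits = misses = 0
    for block in accesses:
        if block in cached:
            hits += 1          # a hit never changes the cache size
            continue
        misses += 1
        if len(cached) >= target_size:
            cached.pop()       # at the target: evict an arbitrary block
        cached.add(block)      # the only place the cache can grow
    return len(cached), hits, misses


# 1000 reads that all land in the 40 cached blocks: 100% hits, no growth.
all_hits = simulate_cache(40, 90, [i % 40 for i in range(1000)])

# 20 reads of blocks that are not cached yet: 20 misses, 20 blocks of growth.
all_misses = simulate_cache(40, 90, range(40, 60))
```

In the first run the cache stays at 40 blocks despite having room for 90, which is the situation on our fileserver in miniature.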

This means that your ARC growth rate will usually be correlated with your ARC miss rate, or rather your ARC miss volume (which unfortunately I don't think there are kstats for). The other thing ARC growth rate can be correlated with is your write volume (because many writes go into the ARC on their way to disk, although I'm not certain all of them do). However, ARC growth from write volume can be a transient thing; if you write something and then delete it, ZFS will first put it in the ARC and then drop it from the ARC.
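One rough way to look at this is to take two snapshots of the ARC kstats and compare them. The sketch below assumes the OpenZFS on Linux text format of /proc/spl/kstat/zfs/arcstats (two header lines, then "name type data" rows) and uses the hits, misses, size, and c (target size) fields. The function names and the sample numbers are made up for illustration, and this counts misses, not miss volume.

```python
GIB = 2 ** 30


def parse_arcstats(text):
    """Parse arcstats-style text into a {name: integer value} dict."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows are "name type data"; the two header lines don't match.
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def arc_delta(before, after):
    """Compare two snapshots: hit rate, miss count, growth, and headroom."""
    hits = after["hits"] - before["hits"]
    misses = after["misses"] - before["misses"]
    total = hits + misses
    return {
        "hit_rate": hits / total if total else None,
        "misses": misses,                               # a count, not bytes
        "size_growth": after["size"] - before["size"],
        "headroom": after["c"] - after["size"],         # target minus size
    }


def sample(hits, misses, size_gib, target_gib):
    """Build made-up arcstats text; the header numbers are placeholders."""
    return (
        "13 1 0x01 123 33456 1234567 7654321\n"
        "name                            type data\n"
        f"hits                            4    {hits}\n"
        f"misses                          4    {misses}\n"
        f"size                            4    {size_gib * GIB}\n"
        f"c                               4    {target_gib * GIB}\n"
    )


before = parse_arcstats(sample(1000, 10, 40, 90))
after = parse_arcstats(sample(2000, 10, 40, 90))
delta = arc_delta(before, after)
```

On a real Linux system you would read the file twice, some seconds apart, instead of using sample(). In the made-up numbers here there are no new misses, so the ARC doesn't grow even with 50 GiB of headroom.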

(Deleting large amounts of data that was in the ARC is one way to rapidly drop the ARC size. If your ARC size shrinks rapidly without the target size shrinking, this is probably what's happened. This data may have been recently written, or it might have been read and then deleted.)

This is in a sense both obvious and general. All disk caches only increase their size while reading if there are cache misses; if they don't have cache misses, nothing happens. ZFS is only unusual in that we worry and obsess over the size of the ARC and how it fluctuates, rather than assuming that it will all just work. (We have good reasons for this, especially on Linux. But even on Solaris and later Illumos, the ZFS ARC size was by default constrained to much less than the regular disk cache might have grown to without ZFS.)

Written on 13 May 2023.

Last modified: Sat May 13 21:48:38 2023