Wandering Thoughts archives

2023-04-25

Understanding ZFS ARC hit (and miss) kstat statistics

The ZFS ARC exposes a number of kstat statistics about its hit and miss performance. These are obviously quite relevant for understanding whether your ARC size (and possibly its failure to grow) is badly affecting you, or whether your ARC hit rate is fine even with a smaller than expected ARC size. Complicating the picture are things like 'MFU hits' and 'MFU ghost hits', where it may not be clear how they relate to plain 'ARC hits'.

There are a number of different things that live in the ZFS ARC, each of which has its own size. Further, the disk blocks in the ARC (both 'data' and 'metadata') are divided between a Most Recently Used (MRU) portion and a Most Frequently Used (MFU) portion (I believe other things like headers aren't in either the MRU or MFU). As covered in e.g. ELI5: ZFS Caching, the MFU and MRU also have 'ghost' versions of themselves; to simplify, these track what would be in memory if the MFU (or MRU) portion used all of memory.

The MRU, MFU, and the ghost versions of themselves give us our first set of four hit statistics: 'mru_hits', 'mfu_hits', 'mru_ghost_hits', and 'mfu_ghost_hits'. These count blocks that were found in the real MRU or the real MFU, in which case they were actually in RAM, or found in the ghost MRU or ghost MFU, in which case they weren't in RAM but theoretically could have been. As covered in ELI5: ZFS Caching, ZFS tracks the hit rates of the ghost MRU and MFU as signs for when to change the balance between the size of the MRU and MFU. If a block wasn't even in the ghost MFU or MRU, there is no specific kstat for it and we have to deduce that from comparing MRU and MFU ghost hits with general misses.
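As a rough sketch of that deduction, here is some Python that splits overall misses into ghost MRU hits, ghost MFU hits, and everything else. It assumes that every ghost hit is also counted in the general 'misses' kstat, which is an assumption rather than something confirmed here. The function name and the sample numbers are made up for illustration.

```python
def ghost_miss_breakdown(stats):
    """Split overall ARC misses into ghost-list hits and the remainder.

    'stats' is a dict of kstat name -> counter value.
    """
    mru_ghost = stats["mru_ghost_hits"]
    mfu_ghost = stats["mfu_ghost_hits"]
    # Assumption: every ghost hit also shows up in 'misses'. If that does
    # not hold, the subtraction could go negative, so clamp it at zero.
    not_in_ghost = max(0, stats["misses"] - mru_ghost - mfu_ghost)
    return {
        "mru_ghost": mru_ghost,
        "mfu_ghost": mfu_ghost,
        "not_in_ghost": not_in_ghost,
    }


# Made-up counter values, purely for illustration.
example = {"misses": 1000, "mru_ghost_hits": 120, "mfu_ghost_hits": 30}
print(ghost_miss_breakdown(example))
```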

However, what we really care about for ARC hits and misses is whether the block actually was in the ARC (in RAM) or whether it had to be read off disk. This is what the general 'hits' and 'misses' kstats track, and they do this independently of the MRU and MFU hits (and ghost 'hits'). At this level, all hits and misses can be broken down into one of four categories: demand data, demand metadata, prefetch data, and prefetch metadata (more on this breakdown is in my entry on ARC prefetch stats). Each of these four has hit and miss kstats associated with it, named things like 'demand_data_misses'. As far as I understand it, a 'prefetch' hit or miss means that ZFS was trying to prefetch something and either already found it in the ARC or didn't. A 'demand' read is one where ZFS needs the block right away.

(This implies that the same ZFS disk block can be a prefetch miss, which reads it into the ARC from disk, and then later a demand hit, when the prefetching paid off and the actual read found it in the ARC.)
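To make the four-way breakdown concrete, here is a minimal Python sketch that parses arcstats text and collects the per-category hit and miss counters. It assumes the ZFS on Linux layout of /proc/spl/kstat/zfs/arcstats (a couple of header lines, then 'name type data' columns). The sample text and its numbers are invented, and the function names are my own.

```python
# Invented sample in the shape of /proc/spl/kstat/zfs/arcstats on Linux.
SAMPLE = """\
13 1 0x01 123 33456 1000000 2000000
name                            type data
hits                            4    9000
misses                          4    1000
demand_data_hits                4    5000
demand_data_misses              4    600
demand_metadata_hits            4    3000
demand_metadata_misses          4    250
prefetch_data_hits              4    700
prefetch_data_misses            4    100
prefetch_metadata_hits          4    300
prefetch_metadata_misses        4    50
"""

CATEGORIES = (
    "demand_data",
    "demand_metadata",
    "prefetch_data",
    "prefetch_metadata",
)


def parse_arcstats(text):
    """Turn arcstats text into a dict of kstat name -> integer value."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skips the first header line
        try:
            stats[fields[0]] = int(fields[2])
        except ValueError:
            continue  # skips the 'name type data' column header
    return stats


def hit_miss_by_category(stats):
    """Return {category: (hits, misses)} for the four categories."""
    return {
        cat: (stats.get(cat + "_hits", 0), stats.get(cat + "_misses", 0))
        for cat in CATEGORIES
    }


for cat, (hits, misses) in hit_miss_by_category(parse_arcstats(SAMPLE)).items():
    print(f"{cat:18} hits={hits:6} misses={misses:6}")
```

On a real machine you would feed parse_arcstats() the contents of the arcstats file instead of SAMPLE.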

In the latest development version of OpenZFS, which will eventually become 2.2, there is an additional category of 'iohits'. An 'iohit' happens when ZFS wants a disk block that already has active IO issued to read it into the ARC, perhaps because there is active prefetching on it. Like 'hits' and 'misses', this has the four demand vs prefetch and data vs metadata counters associated with it. I'm not quite sure how these iohits are counted in OpenZFS 2.1, and some of them may slip through the cracks depending on the exact properties associated with the read (although the change that introduced iohits suggests that they may previously have been counted as 'hits').

If you want to see how your ARC is doing, you want to look at the overall hits and misses. The MRU and MFU hits, especially the 'ghost' hits (which are really misses), strike me as less interesting. If you have ARC misses happening (which leads to actual read IO) and you want to know roughly why, you want to look at the breakdown of the demand vs prefetch and data vs metadata 'misses' kstats.
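Here is one way this might look in Python, as a sketch rather than anything definitive. Since the kstats are cumulative counters, it works on the difference between two samples. The helper names and the sample numbers are made up, and it deliberately ignores 'iohits', so on versions that have them the hit rate here is only over plain hits and misses.

```python
CATEGORIES = (
    "demand_data",
    "demand_metadata",
    "prefetch_data",
    "prefetch_metadata",
)


def delta(before, after):
    """Per-counter difference between two samples of kstat dicts."""
    return {k: after[k] - before[k] for k in after if k in before}


def hit_rate(stats):
    """Overall ARC hit rate as hits / (hits + misses), or None if idle."""
    total = stats["hits"] + stats["misses"]
    return stats["hits"] / total if total else None


def miss_shares(stats):
    """Fraction of all misses in each of the four categories."""
    misses = stats["misses"]
    if not misses:
        return {}
    return {cat: stats[cat + "_misses"] / misses for cat in CATEGORIES}


# Two invented samples; real ones would come from reading arcstats twice.
before = {"hits": 1000, "misses": 200, "demand_data_misses": 100,
          "demand_metadata_misses": 50, "prefetch_data_misses": 40,
          "prefetch_metadata_misses": 10}
after = {"hits": 1900, "misses": 300, "demand_data_misses": 160,
         "demand_metadata_misses": 70, "prefetch_data_misses": 55,
         "prefetch_metadata_misses": 15}

interval = delta(before, after)
print("hit rate:", hit_rate(interval))
for cat, share in miss_shares(interval).items():
    print(f"{cat:18} {share:.0%} of misses")
```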

It's tempting to look at MRU and MFU ghost 'hits' as a percentage of misses, but I'm not sure this tells you much; it's certainly not very high on our fileservers. Somewhat to my surprise, the sum of MFU and MRU hits is just slightly under the overall number of ARC 'hits' on all of our fileservers (which use ZoL 2.1). However, they're exactly the same on my desktops, which run the development version of ZFS on Linux and so have 'iohits' kstats. So possibly in 2.1, you can infer the number of 'iohits' from the difference between overall hits and MRU + MFU hits.
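If that guess is right, the arithmetic is a one-liner. The following Python sketch computes the difference. It is only a possible approximation of 'iohits' on 2.1, not a confirmed equivalence, and the function name and numbers are invented.

```python
def inferred_iohits(stats):
    """Overall hits minus MRU + MFU hits.

    Assumption: on OpenZFS 2.1 this gap might approximate what later
    versions count separately as 'iohits'. This is speculation.
    """
    return stats["hits"] - (stats["mru_hits"] + stats["mfu_hits"])


# Made-up counter values, purely for illustration.
example = {"hits": 10000, "mru_hits": 3950, "mfu_hits": 6000}
print(inferred_iohits(example))
```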

(I evidently worked much of this out years ago since our ZFS ARC stats displays in our Grafana ZFS dashboards work this way, but I clearly didn't write it down back then. This time around, I'm fixing that for future me.)

solaris/ZFSUnderstandingARCHits written at 23:15:30

