Wandering Thoughts archives

2023-04-16

Some important ARC memory statistics exposed by ZFS on Linux (as of ZoL 2.1)

The ZFS ARC is ZFS's version of a disk cache, and ZFS on Linux reports various information about it in /proc/spl/kstat/zfs/arcstats. Some of this is information on how big the ZFS ARC is and wants to be, but other parts contain important information on how ZFS views the system's overall memory situation. The general meaning of this information is system independent (I believe it exists on FreeBSD and illumos, as well as ZFS on Linux), but how it's determined and derived is system specific and I've only looked into the situation on Linux.
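As a minimal sketch of getting at these numbers, the following Python parses the arcstats text into a dictionary. It assumes the usual kstat layout of a couple of header lines followed by 'name type data' lines; the function names parse_arcstats and read_arcstats are made up for illustration.

```python
def parse_arcstats(text):
    """Parse kstat-style arcstats text into a dict of name -> integer value."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Data lines are assumed to have exactly three fields: name, type, data.
        # This skips the numeric kstat header line and the column title line.
        if len(fields) != 3 or fields[0] == "name":
            continue
        name, _kstat_type, value = fields
        try:
            stats[name] = int(value)
        except ValueError:
            # Ignore anything that is not a plain integer.
            continue
    return stats


def read_arcstats(path="/proc/spl/kstat/zfs/arcstats"):
    """Read and parse the live arcstats file (needs ZFS on Linux loaded)."""
    with open(path) as f:
        return parse_arcstats(f.read())
```

On a machine with ZFS loaded, read_arcstats()["c"] would then be the ARC target size in bytes, and read_arcstats()["memory_available_bytes"] may well come back negative.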

As covered, the critical ARC size parameter for determining if it will grow, shrink, or stay the same size is 'c', also known as 'arc_c', which is what the ARC considers the overall target size. ZFS also exposes three memory sizes, memory_all_bytes, memory_free_bytes, and memory_available_bytes. The 'all' number is how much total RAM ZFS thinks the system has; the 'free' number is how much memory ZFS thinks is free in general, and 'available' is how much memory ZFS feels it has available to it at the moment, which can go negative. If the 'available' number goes negative, the ARC shrinks; if it's (enough) positive, the ARC can grow.

On Linux, the code that determines these is in arc_os.c. On most Linux systems, the 'free' is the number of free pages plus the number of inactive file pages, which are visible in /proc/vmstat as nr_free_pages and nr_inactive_file. On all Linux systems, the 'available' number is the 'free' number minus 'arc_sys_free', which is normally somewhat over 1/32nd of your total RAM and doesn't get adjusted on the fly by ZFS. You can set this through the zfs_arc_sys_free parameter.

(The manual page says that arc_sys_free is normally 1/64th of RAM, but the actual code says 1/32nd plus stuff.)
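The 'free' and 'available' calculation described above can be approximated from user space. This is only a sketch: the function names are hypothetical, the counters in /proc/vmstat are in pages so they have to be multiplied by the page size, and the arc_sys_free value has to be supplied by the caller (it is not derived here).

```python
import os


def read_vmstat(path="/proc/vmstat"):
    """Read /proc/vmstat into a dict of counter name -> integer."""
    counters = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) == 2:
                counters[fields[0]] = int(fields[1])
    return counters


def zfs_free_bytes(vmstat, page_size):
    """Free pages plus inactive file pages, converted from pages to bytes."""
    pages = vmstat["nr_free_pages"] + vmstat["nr_inactive_file"]
    return pages * page_size


def zfs_available_bytes(free_bytes, arc_sys_free):
    """'available' is 'free' minus arc_sys_free; it can be negative."""
    return free_bytes - arc_sys_free
```

On a live system you might call zfs_free_bytes(read_vmstat(), os.sysconf("SC_PAGE_SIZE")) and compare the result against memory_free_bytes from arcstats. The two are sampled at different moments, so they should be close but not necessarily identical.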

Whether or not the ARC can grow at the moment is shown in 'arc_no_grow', which is 1 if the ARC can't grow at the moment. Generally, this will turn on and stay on if 'available' is less than 1/32nd of 'arc_c' (the 1/32nd bit is determined by 'arc_no_grow_shift', which is an internal variable and so not subject to tuning in ZFS on Linux). One implication of this is that it's harder and harder for the ARC target size to grow toward its maximum because you need more and more free memory as 'arc_c' gets larger and larger. On our ZFS fileservers with 192 GB of RAM we set the maximum ARC size to about 155 GB, so at the top end 'available' has to be above roughly 4.8 GB (1/32nd of 155 GB). Since 'arc_sys_free' is somewhat over 6 GB on these machines (1/32nd of 192 GB), we need the 'free' memory number to reach over 10 GB. It looks like we have gotten there sometimes, but it doesn't happen very often.

(Most of our fileservers also spend 80% to 90% of their time with 'arc_no_grow' being 1.)
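The arithmetic behind the 'over 10 GB' figure can be worked through in a few lines. This is a sketch under stated assumptions: a shift of 5 stands in for the 1/32nd from 'arc_no_grow_shift', sizes are treated as GiB, arc_sys_free is rounded down to exactly 1/32nd of RAM (the real value is somewhat higher), and the function names are invented for this example.

```python
ARC_NO_GROW_SHIFT = 5  # assumed: a shift of 5 gives the 1/32nd described above
GIB = 1024 ** 3


def no_grow_threshold(arc_c):
    """'available' must be at least this many bytes for the ARC to grow."""
    return arc_c >> ARC_NO_GROW_SHIFT


def growth_blocked(available, arc_c):
    """True if 'available' is below 1/32nd of the ARC target size."""
    return available < no_grow_threshold(arc_c)


def min_free_for_growth(arc_c, arc_sys_free):
    """Smallest 'free' value at which 'available' reaches the threshold."""
    return no_grow_threshold(arc_c) + arc_sys_free


arc_c = 155 * GIB              # ARC target at our configured maximum
arc_sys_free = 192 * GIB // 32  # simplification: exactly 1/32nd of RAM
needed = min_free_for_growth(arc_c, arc_sys_free)
print(f"threshold: {no_grow_threshold(arc_c) / GIB:.2f} GiB")
print(f"free needed: {needed / GIB:.2f} GiB")
```

With these simplified inputs the threshold works out to a bit under 5 GiB and the required 'free' figure to a bit under 11 GiB, which lines up with the rough numbers above.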

The situation for 'arc_no_grow' is checked once a second, so even without explicit memory pressure ARC growth will turn off when 'available' drops low enough; once 'arc_c' is large, this may be most of the time because of the minimum requirement above. If 'available' becomes negative (ie, if the 'free' memory drops below 'arc_sys_free'), then ZFS will consider there to be a 'memory pressure event' and ARC growth can't turn back on until at least zfs_arc_grow_retry seconds later, which defaults to five seconds. It's likely but not certain that this will trigger the ARC target size shrinking.
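To make the timing concrete, here is a toy model of the once-a-second check. It is not the ZFS code, only an illustration of the behaviour as described in this entry; the class NoGrowModel, its tick() method, and the way time is passed in are all invented for the example.

```python
class NoGrowModel:
    """Toy model of when ARC growth is allowed, following the text above."""

    def __init__(self, grow_retry=5, no_grow_shift=5):
        self.grow_retry = grow_retry        # stands in for zfs_arc_grow_retry
        self.no_grow_shift = no_grow_shift  # stands in for arc_no_grow_shift
        self.no_grow = False
        self.retry_until = 0                # earliest time growth may resume

    def tick(self, now, available, arc_c):
        """Run one check at time 'now' (seconds); return the no-grow flag."""
        if available < 0:
            # Memory pressure event: block growth for grow_retry seconds.
            self.retry_until = now + self.grow_retry
            self.no_grow = True
        elif available < (arc_c >> self.no_grow_shift):
            # Not enough headroom, even without explicit memory pressure.
            self.no_grow = True
        elif now >= self.retry_until:
            # Enough headroom and the retry interval has passed.
            self.no_grow = False
        return self.no_grow
```

Feeding it one negative 'available' sample followed by comfortably positive ones shows the flag staying on until the retry interval has run out.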

If 'arc_need_free' is non-zero, this means that ZFS on Linux is in the process of trying to shrink the ARC by (at least) that many bytes. This statistic is not used inside ZFS on Linux; it purely exposes some state information, and I think it can be zero even if the ARC is currently reclaiming memory.

Sidebar: The ARC's target size versus its actual size

It's entirely possible for the ARC to drop its memory usage without dropping its target size (for example, if you delete a big file that's been cached in the ARC, I think the ARC may drop the cached blocks for the file). Over the last week, our fileservers have had the target size be up to 40 GB more than the current size.

Differences the other way (when the target size is below the actual size) seem to be much smaller. Even going back four weeks, the largest shortfall is only a little bit over a GB. The obvious guess is that ZFS is quite prompt at shrinking the ARC alongside shrinking its target size.
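If you want to track this gap yourself, a small helper is enough. This sketch assumes the statistics have already been parsed into a dictionary of integers, and that the current ARC size is the 'size' field of arcstats alongside the 'c' target; the function name is hypothetical.

```python
def target_minus_size(stats):
    """ARC target size minus actual size, in bytes.

    Positive means the ARC is below its target; negative means it is
    currently larger than the target and would be expected to shrink.
    """
    return stats["c"] - stats["size"]
```

Sampling this periodically and recording the extremes is one simple way to see how far apart the two get over time.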

linux/ZFSOnLinuxARCMemoryStatistics written at 23:01:41

