Wandering Thoughts archives

2023-04-14

The various sizes of the ZFS ARC (as of OpenZFS 2.1)

The ZFS ARC is ZFS's version of a disk cache. Further general information on it can be found in two highly recommended sources, Brendan Gregg's 2012 'Activity of the ZFS ARC' and Allan Jude's FOSDEM 2019 'ELI5: ZFS Caching'. ZFS exposes a lot of information about the state of the ARC through kstats, but there isn't much documentation about what many of them mean. Today we're going to talk about some of the kstats related to the size of the ARC. I'll generally be using the Linux OpenZFS kstat names exposed in /proc/spl/kstat/zfs/arcstats.
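If you want to look at these kstats programmatically, here is a minimal sketch of reading them on Linux. It assumes the usual arcstats layout of a numeric header line, a 'name type data' column header, and then one 'name type value' row per kstat. The function names parse_arcstats and read_arcstats are my own inventions for illustration, not anything OpenZFS provides.

```python
def parse_arcstats(text):
    """Turn the text of an arcstats-style kstat file into {name: int}.

    Assumes each data row is 'name type value'. Anything that doesn't
    look like that (including the two header lines) is skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            # The first header line has more than three fields.
            continue
        name, _kstat_type, value = fields
        try:
            stats[name] = int(value)
        except ValueError:
            # The 'name type data' column header ends up here.
            continue
    return stats


def read_arcstats(path="/proc/spl/kstat/zfs/arcstats"):
    """Read and parse the live kstats (Linux with OpenZFS loaded only)."""
    with open(path) as f:
        return parse_arcstats(f.read())
```

On a machine with ZFS loaded, read_arcstats()["size"] should then be the current ARC size in bytes.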

The current total size of the ARC in bytes is size. The ARC is split into a Most Recently Used (MRU) portion and a Most Frequently Used (MFU) portion; their sizes are mru_size and mfu_size respectively. Note that the ARC may contain more than MRU and MFU data; it also holds other things, so size is not necessarily the same as the sum of mru_size and mfu_size.

The ARC caches both ZFS data (which includes not just file contents but also the data blocks of directories) and metadata (ZFS dnodes and other things). All space used by the ARC falls into one of a number of categories, which are accounted for in the following kstats:

data_size metadata_size
bonus_size dnode_size dbuf_size
hdr_size l2_hdr_size abd_chunk_waste_size

('abd' is short for 'ARC buffer data'. On Linux you can see kstats related to it in /proc/spl/kstat/zfs/abdstats.)

Generally data_size and metadata_size will be the largest two components of the ARC size; I believe they cover data actually read off disk, with the other sizes being ZFS in-RAM data structures that are still included in the ARC. The l2_hdr_size will be zero if you have no L2ARC. There is also an arc_meta_used kstat; this rolls up everything except data_size and abd_chunk_waste_size as one number that is basically 'metadata in some sense'. This combined number is important because it's limited by arc_meta_limit.

(There is also an arc_dnode_limit, which I believe effectively limits dnode_size specifically, although dnode_size can go substantially over it under some circumstances.)
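To make the accounting above concrete, here is a small sketch that adds up the eight categories and compares the results with size and arc_meta_used. It takes a plain dictionary of kstat names to integer values. The function name arc_breakdown and the example numbers are made up for illustration; they are not from a real system.

```python
# The eight categories that together account for the ARC's size.
COMPONENTS = (
    "data_size", "metadata_size",
    "bonus_size", "dnode_size", "dbuf_size",
    "hdr_size", "l2_hdr_size", "abd_chunk_waste_size",
)
# Everything except these two is rolled up into arc_meta_used.
NOT_META = ("data_size", "abd_chunk_waste_size")


def fraction(value, limit):
    """Return value / limit, or None if the limit is zero."""
    return value / limit if limit else None


def arc_breakdown(stats):
    """Compare the per-category sizes against the rolled-up kstats."""
    component_sum = sum(stats.get(name, 0) for name in COMPONENTS)
    meta_sum = sum(stats.get(name, 0) for name in COMPONENTS
                   if name not in NOT_META)
    return {
        "component_sum": component_sum,
        # Expected to be zero or close to it if the accounting holds.
        "size_gap": stats["size"] - component_sum,
        "meta_sum": meta_sum,
        "meta_gap": stats["arc_meta_used"] - meta_sum,
        "meta_fraction_of_limit": fraction(stats["arc_meta_used"],
                                           stats["arc_meta_limit"]),
        "dnode_fraction_of_limit": fraction(stats["dnode_size"],
                                            stats["arc_dnode_limit"]),
    }


# Made-up numbers, purely to show the arithmetic.
example = {
    "size": 905, "data_size": 600, "metadata_size": 200,
    "bonus_size": 10, "dnode_size": 40, "dbuf_size": 30,
    "hdr_size": 20, "l2_hdr_size": 0, "abd_chunk_waste_size": 5,
    "arc_meta_used": 300, "arc_meta_limit": 750, "arc_dnode_limit": 80,
}
breakdown = arc_breakdown(example)
```

On a live system the kstats are not read atomically, so I would expect the two gaps to be small rather than always exactly zero.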

When ZFS reads data from disk, in the normal configuration it stores it straight into the ARC in its on-disk form. This means that it may be compressed; even if you haven't turned on ZFS on-disk compression for your data, ZFS uses it for metadata. The ARC has two additional sizes to reflect this; compressed_size is the size in RAM, and uncompressed_size is how much this would expand to if it were all uncompressed. There is also overhead_size, which, well, let's quote include/sys/arc_impl.h:

Number of bytes stored in all the arc_buf_t's. This is classified as "overhead" since this data is typically short-lived and will be evicted from the arc when it becomes unreferenced unless the zfs_keep_uncompressed_metadata or zfs_keep_uncompressed_level values have been set (see comment in dbuf.c for more information).

Things counted in overhead_size are not counted in the compressed and uncompressed size; they move back and forth in the code as their state changes. I believe that the compressed size plus the overhead size will generally be equal to data_size + metadata_size, ie both cover 'what is in RAM that has been pulled off disk', but in different forms.
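As a sketch of how you might check this belief on your own system, the following computes the effective in-ARC compression ratio and the difference between the two ways of counting 'what has been pulled off disk'. It works on a plain dictionary of kstat values. The function name compression_summary and the example numbers are hypothetical.

```python
def compression_summary(stats):
    """Return (compression ratio, accounting gap) for the ARC.

    The ratio is uncompressed_size / compressed_size, or None if
    nothing compressed is currently held. The gap is
    (data_size + metadata_size) - (compressed_size + overhead_size),
    which should be about zero if the two views really agree.
    """
    compressed = stats["compressed_size"]
    uncompressed = stats["uncompressed_size"]
    ratio = uncompressed / compressed if compressed else None
    from_disk = stats["data_size"] + stats["metadata_size"]
    gap = from_disk - (compressed + stats["overhead_size"])
    return ratio, gap


# Made-up numbers, purely to show the arithmetic.
example = {
    "compressed_size": 500, "uncompressed_size": 1250,
    "overhead_size": 300, "data_size": 600, "metadata_size": 200,
}
ratio, gap = compression_summary(example)
```

A gap that is persistently far from zero on a real machine would suggest that my belief about these kstats is wrong, or at least incomplete.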

Finally we get to the ARC's famous (or infamous) target size, 'arc_c' or just 'c'. If it is larger than size, the ARC will grow as you read (or write) things that aren't in it, and if it's smaller than size the ARC will shrink. The ARC's actual size can shrink for other reasons, but the target size shrinking is a slower and more involved thing to recover from.

In OpenZFS 2.1 and before, there is a second target size statistic, 'arc_p' or 'p' (in arcstats); this is apparently short for 'partition', and is the target size for the Most Recently Used (MRU) portion of the ARC. The target size for the MFU portion is 'c - p' and isn't explicitly put into kstats. How 'c' (and 'p') get changed is a complicated topic that is going in another entry.
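Put together, the relationship between the targets and the actual sizes can be sketched like this. It takes a plain dictionary of kstat values, and treats 'p' as optional since it only exists in OpenZFS 2.1 and before. The function name arc_targets and the example numbers are hypothetical, and the 'direction' label is only my shorthand for the grow and shrink behaviour described above.

```python
def arc_targets(stats):
    """Summarize the ARC's target sizes from the 'c' and 'p' kstats."""
    size = stats["size"]
    c = stats["c"]
    p = stats.get("p")  # Not present if the kstat doesn't exist.

    if c > size:
        direction = "grow"      # The ARC can grow as new things are read.
    elif c < size:
        direction = "shrink"    # The ARC will shrink towards its target.
    else:
        direction = "steady"

    return {
        "size_target": c,
        "mru_target": p,
        # The MFU target is not in the kstats; it is implied as c - p.
        "mfu_target": c - p if p is not None else None,
        "direction": direction,
    }


# Made-up numbers, purely to show the arithmetic.
example = {"size": 900, "c": 1000, "p": 400}
targets = arc_targets(example)
```

With these numbers the MFU target comes out as 600 and the ARC has room to grow by another 100 bytes.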

(In the current development version of OpenZFS, there's a new and different approach to MFU/MRU balancing; this will likely be in OpenZFS 2.2, whenever that is released, and may appear in a system near you before then. The new system is apparently better, but its kstats are more opaque.)

Appendix: The short form version

size
    Current ARC size in bytes. It is composed of data_size + metadata_size + bonus_size + dnode_size + dbuf_size + hdr_size + l2_hdr_size + abd_chunk_waste_size
arc_meta_used
    All of size other than data_size + abd_chunk_waste_size; 'metadata' in a broad sense, as opposed to the narrow sense of metadata_size.
mru_size
    Size of the MRU portion of the ARC
mfu_size
    Size of the MFU portion of the ARC
arc_meta_limit
    Theoretical limit on arc_meta_used
arc_dnode_limit
    Theoretical limit on dnode_size
c aka arc_c
    The target for size
p aka arc_p
    The target for mru_size
c - p
    The target for mfu_size

I believe that generally the following holds:

compressed_size + overhead_size = data_size + metadata_size

In OpenZFS 2.1 and earlier, there is no explicit target for MRU data as separate from MRU metadata. In OpenZFS 2.2, there will be.

solaris/ZFSARCItsVariousSizes written at 23:37:55

