Some ballpark numbers for fun on filling filesystem cache with NFS traffic

January 6, 2024

Our current ZFS fileserver hardware is getting long in the tooth, so we're working on moving to new hardware (with the same software and operational setup, which we're happy with). This new hardware has 512 GB of RAM instead of the 192 GB of RAM in our current fileservers, which means that we're going to have a very big ZFS filesystem cache. Today, I was idly wondering how long it would take to fill the cache to a reasonable level with NFS (read) traffic, since we have metrics that include, among other things, the typical bandwidth our current fileservers see (which usually isn't all that high).

ZFS doesn't use all of the system's memory for its ARC cache, and not all of the ARC cache is file data; some of it is filesystem metadata like the contents of directories, the ZFS equivalent of inodes, and so on. As a ballpark, I'll use 256 GBytes of file data in the cache. A single server with a 1G connection can read over NFS at about 110 Mbytes a second. This is a GByte read in just under ten seconds, or about 6.4 GBytes a minute, and a bit under 40 minutes of continuous full-rate 1G NFS reads to fill a 256 GByte cache (assuming that the ZFS fileserver puts everything read in the cache and there are no re-reads, which are some big assumptions).
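For anyone who wants to redo this arithmetic with their own numbers, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes that a GByte is 1024 Mbytes (which is what the 6.4 GBytes a minute figure implies) and that every byte read is new data that stays in the cache.

```python
# Assumption: binary units, 1 GByte = 1024 Mbytes.
MBYTES_PER_GBYTE = 1024


def gbytes_per_minute(rate_mbytes_per_sec):
    """Convert a sustained read rate in Mbytes/sec to GBytes/minute."""
    return rate_mbytes_per_sec * 60 / MBYTES_PER_GBYTE


def fill_minutes(cache_gbytes, rate_mbytes_per_sec):
    """Minutes needed to read cache_gbytes of never-before-read data
    at a constant rate, with nothing evicted and no re-reads."""
    seconds = cache_gbytes * MBYTES_PER_GBYTE / rate_mbytes_per_sec
    return seconds / 60


if __name__ == "__main__":
    # 110 Mbytes/sec is roughly what one client on a 1G link can read.
    print(f"{gbytes_per_minute(110):.1f} GBytes a minute")
    print(f"{fill_minutes(256, 110):.1f} minutes to fill 256 GBytes")
```

This is a best case estimate; real re-reads and cache eviction will only make the fill take longer.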

Based on what I've seen on our dashboards, a reasonably high NFS read rate from a fileserver is in the area of 300 to 400 Mbytes a second. This is about 23.4 GBytes a minute (at 400 Mbytes/sec), and would fill the ZFS fileserver cache from a cold start in about 11 minutes (again with the assumptions from above). 400 Mbytes/sec is well within the capabilities of SSD-based fileservers.

However, most of the time our fileservers are much less active than that. Last Thursday, the average bandwidth over the workday was in the area of 1 Mbyte/sec (yes, Mbyte not GByte). At this rate filling a 256 GByte cache of file data would take three days. A 20 Mbyte/sec sustained read rate fills the cache in only a few hours. At the low end, relatively 'small' changes in absolute value clearly have an outsized effect on the cache fill time.
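The sensitivity at the low end is easier to see as a small table. The following is a rough Python sketch under the same assumptions as the rest of this entry (256 GBytes of file data, 1 GByte taken as 1024 Mbytes, every read is new data that stays cached); the helper names and the exact set of rates are just illustrative choices.

```python
# Assumption: binary units, 1 GByte = 1024 Mbytes.
MBYTES_PER_GBYTE = 1024
CACHE_GBYTES = 256  # ballpark amount of file data in the ARC


def fill_seconds(cache_gbytes, rate_mbytes_per_sec):
    """Seconds to read cache_gbytes of unique data at a constant rate."""
    return cache_gbytes * MBYTES_PER_GBYTE / rate_mbytes_per_sec


def humanize(seconds):
    """Render a duration in the largest sensible unit."""
    if seconds >= 86400:
        return f"{seconds / 86400:.1f} days"
    if seconds >= 3600:
        return f"{seconds / 3600:.1f} hours"
    return f"{seconds / 60:.1f} minutes"


if __name__ == "__main__":
    # Rates in Mbytes/sec, from a quiet workday average up to a busy peak.
    for rate in (1, 2, 5, 20, 110, 400):
        print(f"{rate:>4} Mbytes/sec: {humanize(fill_seconds(CACHE_GBYTES, rate))}")
```

Because the fill time is inversely proportional to the read rate, going from 1 to 2 Mbytes/sec saves about a day and a half, while going from 399 to 400 Mbytes/sec saves a couple of seconds.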

In practice, this cache fill requires 256 GBytes of different data that people want to read (possibly in a hurry). This is much more likely to be the practical limit on filling our fileserver caches, as we can see by the typical 1 Mbyte/sec data rate.

(All of this is actually faster than I expected before I started writing this and ran the numbers.)

