Some ballpark numbers for fun on filling filesystem cache with NFS traffic

January 6, 2024

Our current ZFS fileserver hardware is getting long in the tooth, so we're working on moving to new hardware (with the same software and operational setup, which we're happy with). This new hardware has 512 GB of RAM instead of the 192 GB of RAM in our current fileservers, which means that we're going to have a very big ZFS filesystem cache. Today, I was idly wondering how long it would take to fill the cache to a reasonable level with NFS (read) traffic, since we have metrics that include, among other things, the typical bandwidth our current fileservers see (which usually isn't all that high).

ZFS doesn't use all of the system's memory for its ARC cache, and not all of the ARC cache is file data; some of it is filesystem metadata like the contents of directories, the ZFS equivalent of inodes, and so on. As a ballpark, I'll use 256 GBytes of file data in the cache. A single server with a 1G connection can read over NFS at about 110 Mbytes a second. This is a GByte read in just under ten seconds, or about 6.4 GBytes a minute, so about 40 minutes of continuous full-rate 1G NFS reads would fill a 256 GByte cache (assuming that the ZFS fileserver puts everything it reads into the cache and there are no re-reads, which are big assumptions).
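
As a quick sanity check on that arithmetic, here's the fill-time calculation as a small Python sketch (the fill_time_seconds helper and its use of decimal GBytes and Mbytes are my own illustration, not anything from our tooling):

    # Rough time to fill a cache of a given size at a sustained read rate.
    # Decimal units: cache size in GBytes, read rate in Mbytes/sec.
    def fill_time_seconds(cache_gbytes, rate_mbytes_per_sec):
        return (cache_gbytes * 1000.0) / rate_mbytes_per_sec

    # Continuous full-rate 1G NFS reads into a 256 GByte cache of file data.
    print(fill_time_seconds(256, 110) / 60)   # about 39 minutes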

Based on what I've seen on our dashboards, a reasonably high NFS read rate from a fileserver is in the area of 300 to 400 Mbytes a second. This is about 23.4 GBytes a minute (at 400 Mbytes/sec), and would fill the ZFS fileserver cache from a cold start in about 11 minutes (again with the assumptions from above). 400 Mbytes/sec is well within the capabilities of SSD-based fileservers.
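
Reusing the fill_time_seconds sketch from above for this higher rate:

    # A busy fileserver doing roughly 400 Mbytes/sec of NFS reads.
    print(fill_time_seconds(256, 400) / 60)   # about 11 minutes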

However, most of the time our fileservers are much less active than that. Last Thursday, the average bandwidth over the workday was in the area of 1 Mbyte/sec (yes, Mbyte not GByte). At this rate, filling a 256 GByte cache of file data would take about three days. A 20 Mbyte/sec sustained read rate fills the cache in only a few hours. At the low end, relatively 'small' changes in absolute value clearly have an outsized effect on the cache fill time.
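
And the same sketch for these low-end rates:

    # Quiet fileservers: a 20 Mbytes/sec sustained rate and the observed
    # 1 Mbyte/sec workday average.
    for rate in (20, 1):
        print(rate, fill_time_seconds(256, rate) / 3600)   # hours to fill
    # 20 Mbytes/sec is about 3.6 hours; 1 Mbyte/sec is about 71 hours
    # (roughly three days)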

In practice, this cache fill requires 256 GBytes of different data that people want to read (possibly in a hurry). Having that much distinct data in active demand is much more likely to be the practical limit on how fast our fileserver caches fill than raw NFS bandwidth, as the typical 1 Mbyte/sec data rate suggests.

(All of this is actually faster than I expected before I started writing this and ran the numbers.)


Comments on this page:

By Adam at 2024-01-10 10:46:27:

Our current ZFS fileserver hardware is getting long in the tooth, so we're working on moving to new hardware (with the same software and operational setup, which we're happy with).

Time for a ZFSFileserverSetupIV article? Materialistic, I know. The 512 GiB RAM + the III setup makes for interesting guessing.

By cks at 2024-01-10 13:17:43:

More or less the only thing changing is the hardware, which is moving to a different set of SuperMicro hardware, the SuperServer SYS-221H-TN24R, using the X13DEM dual-socket motherboard with Xeon Silver 4410Ys and dual 10G-T (and the mentioned 512 GB of RAM). We were only able to get these units with front panel disks, so two of the 24 bays are system disks and they have (or will have) 22 data disks. This is more data disks than the current hardware has, so we're not trying to do a one to one replacement of the existing servers, and migration is going to take some time.
