An update to the ZFS excessive prefetching situation

August 24, 2008

A while back I wrote about how I had discovered that ZFS could wind up doing excessive readahead when faced with many streams of sequential read IO, and then throw away 90% to 95% of the IO that it had done before it was ever used (with terrible consequences for application performance). It's time for an update on that situation.

First, for various reasons we wound up moving to Solaris server machines with 8 GB of memory (SunFire X2200s instead of X2100s), so I re-enabled ZFS file prefetching and re-ran my experiments. Initial testing was encouraging; with 8 GB, the ZFS ARC was big enough that even under my heavy test load ZFS could keep prefetched data around for long enough to not kill application-level performance.

Well. Usually big enough, but sometimes the ZFS ARC would spontaneously decide to limit itself to 2 GB (instead of the usual 5 to 7 GB), despite the test machines being otherwise unused and idle. This destroyed performance, and worse, I could find no way of resetting the adaptive ARC target size (what you see as c in the output of 'kstat -m zfs') to recover from the situation. So we turned off ZFS file prefetching again and there things sat for a while.
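
(If you want to catch this shrinkage as it happens, one approach is to watch the c value over time. What follows is a minimal sketch and not something from our actual setup: it assumes a modern Python 3, assumes that you feed it the parseable output of 'kstat -p zfs:0:arcstats', and the helper names parse_kstat and arc_has_shrunk are made up for illustration.)

```python
GIB = 1024 ** 3


def parse_kstat(text):
    """Parse `kstat -p` style output into {statistic_name: int}.

    Each line is expected to look like "module:instance:name:statistic<TAB>value".
    Lines whose value is not an integer (for example the kstat class name
    or the floating point timestamps) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        # Keep only the last component, e.g. "c" from "zfs:0:arcstats:c".
        name = key.rsplit(":", 1)[-1]
        try:
            stats[name] = int(value.strip())
        except ValueError:
            continue
    return stats


def arc_has_shrunk(stats, floor_bytes):
    """Return True if the ARC target size (c) is below the given floor."""
    return stats["c"] < floor_bytes
```

To use it, you would capture the output of 'kstat -p zfs:0:arcstats' (from cron, say), pass the text to parse_kstat(), and complain if arc_has_shrunk(stats, 5 * GIB) is true. The 5 GB floor is only an example, picked to match the low end of the usual ARC size mentioned above.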

Recently I discovered the under-documented zfs_arc_min ZFS tuning parameter, which sets the minimum ZFS ARC size (it is the mirror of the better documented zfs_arc_max tuning parameter for setting the maximum size). Since a large minimum size should prevent the catastrophic ARC shrinkage, our test systems now have it set to 5 GB and it seems to be working so far (in that the ARC hasn't shrunk on either of them).
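
(For reference, ZFS tuning parameters like this are normally set in /etc/system and take effect at the next reboot. The fragment below is a sketch of what a 5 GB minimum would look like, assuming the usual 'set zfs:...' syntax and that 5 GB means 5 * 1024^3 bytes; it is not copied from our actual configuration.)

```
* Keep the ZFS ARC from shrinking below 5 GB.
* 5 GB = 5 * 1024^3 bytes = 5368709120 = 0x140000000
set zfs:zfs_arc_min = 0x140000000
```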

(On dedicated NFS servers, I am pretty sure that we actively want most of the memory to be reserved for ZFS caches. Nothing that is particularly memory-consuming should ever run on them, and if it does, I would prefer that it swap itself to death rather than impacting NFS server performance.)

Update, October 22nd: see an important update. I can no longer recommend that you do this.
