How much space ZFS reserves in your pools varies across versions

October 9, 2015

Back in my entry on the difference in available pool space between zfs list and zpool list, I noted that one of the reasons the two differ is that ZFS reserves some amount of space internally. At the time, I wrote that the code said it should be reserving 1/32nd of the pool size (while still allowing some things, like ZFS property changes, down to 1/64th of the pool), but our OmniOS fileservers seemed to be reserving only 1/64th of the space (and imposing a hard limit at that point). It turns out that this discrepancy has a simple explanation: ZFS has changed its behavior over time.
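
As a quick illustration (my own sketch, not actual ZFS code), you can model the reservation as the pool size shifted down by a power of two, with a shift of 5 giving the new 1/32nd reserve and a shift of 6 the 1/64th floor that some operations are still allowed to reach:

    # Toy model of ZFS's reserved 'slop' space: reserve = pool_size / 2**shift.
    # A shift of 5 is the 1/32nd reserve; a shift of 6 is the 1/64th level
    # that some operations (like property changes) can still go down to.
    def reserved_space(pool_size_bytes, shift=5):
        return pool_size_bytes >> shift

    TIB = 1 << 40
    pool = 10 * TIB
    print("10 TiB pool: 1/32nd reserve is %.0f GiB, 1/64th floor is %.0f GiB"
          % (reserved_space(pool, 5) / 2**30, reserved_space(pool, 6) / 2**30))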

This change is Illumos issue 4951, 'ZFS administrative commands should use reserved space, not fail with ENOSPC', which landed in roughly July of 2014. When I wrote my original entry in late 2014 I looked at the latest Illumos source code at the time and so saw this change, but of course our ZFS fileservers were using a version of OmniOS that predated the change and so were using the old 1/64th of the pool hard limit.

The change has propagated into various Illumos distributions and other ZFS implementations at different points. In OmniOS it's in up-to-date versions of the r151012 and r151014 releases, but not in r151010 and earlier. In ZFS on Linux, it landed in the 0.6.5 release and was not in 0.6.4. In FreeBSD, this change is definitely in -current (and appears to have arrived there very close to when it did in Illumos), but it postdates 10.0's release and I think it arrived in 10.1.0.

This change has an important consequence: when you update across it, your pools will effectively shrink, because ZFS will go from reserving 1/64th of their space to reserving 1/32nd of it. If your pools have lots of free space, this isn't a problem. If your pools have only some free space, your users may notice them suddenly shrinking by a certain amount (some of our pools will lose half their free space if we don't expand them). And if your pools are sufficiently close to full, they will instantly become over-full and you'll have to delete things to free up space (or expand the pools on the spot).
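
To put rough numbers on the shrinkage (this arithmetic is mine, not from anywhere official): the extra space that disappears is 1/32nd minus 1/64th of the pool, i.e. 1/64th of it, and any pool with less free space than that becomes over-full the moment you update.

    # How much usable space disappears when the reserve goes from 1/64th of
    # the pool to 1/32nd: the difference is 1/64th of the pool size.
    def space_lost(pool_size_bytes):
        return (pool_size_bytes >> 5) - (pool_size_bytes >> 6)

    TIB = 1 << 40
    for size_tib in (1, 10, 50):
        print("%2d TiB pool loses %5.0f GiB of usable space"
              % (size_tib, space_lost(size_tib * TIB) / 2**30))
    # 1 TiB -> 16 GiB, 10 TiB -> 160 GiB, 50 TiB -> 800 GiB.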

I believe that you can revert to the old 1/64th limit if you really want to, but unfortunately it's a global setting, so you can't do it selectively for some pools while leaving others at the default 1/32nd limit. Thus, if you have to do this, you might want to do so only temporarily to buy time while you clean up or expand pools.
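
(As far as I know the knob involved is the spa_slop_shift tunable, where 5 gives the current 1/32nd reserve and 6 would restore the old 1/64th behavior. On ZFS on Linux I'd expect it to show up as a module parameter; the sysfs path in this little check is my assumption, so verify it on your own system.)

    # Check the spa_slop_shift tunable on a ZFS on Linux machine and report
    # what fraction of each pool it reserves.  The path is where ZoL module
    # parameters normally live; treat it as an assumption.
    SLOP_PATH = "/sys/module/zfs/parameters/spa_slop_shift"
    try:
        with open(SLOP_PATH) as f:
            shift = int(f.read().strip())
        print("spa_slop_shift = %d: ZFS reserves 1/%d of each pool" % (shift, 1 << shift))
    except OSError:
        print("no %s here (not ZFS on Linux, or an older version)" % SLOP_PATH)
    # Writing 6 to this file (as root) would restore the old 1/64th reserve,
    # but it is global and affects every pool at once.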

(Of course, by now most people may have already dealt with this. We're a bit behind the times in terms of what OmniOS version we're using.)

Sidebar: My lesson learned here

The lesson I've learned from this is that I should probably stop reflexively reading code from the Illumos master repo and instead read the OmniOS code for the branch we're using. Going straight to the current 'master' version is a habit I got into in the OpenSolaris days, when there simply was no source tree that corresponded to the Solaris 10 update whatever that we were running. But these days that's no longer the case and I can read pretty much the real source code for what's running on our fileservers. And I should, just to avoid this sort of confusion.

(Perhaps going to the master source and then getting confused was a good thing in this case, since it's made me familiar with the new state of affairs too. But it won't always go so nicely.)
