The unfortunate limitation in ZFS filesystem quotas and refquota

June 30, 2020

When ZFS was new, the only option it had for filesystems quotas was the quota property, which I had an issue with and which caused us practical problems in our first generation of ZFS fileservers because it covered the space used by snapshots as well as the regular user accessible filesystem. Later ZFS introduced the refquota property, which did not have that problem but in exchange doesn't apply to any descendant datasets (regardless of whether they're snapshots or regular filesystems). At one level this issue with refquota is fine, because we put quotas on filesystems to limit their maximum size to what our backup system can comfortably handle. At another level, this issue impacts how we operate.

All of this stems from a fundamental lack in ZFS quotas, which is ZFS's general quota system doesn't let you limit space used only by unprivileged operations. Writing into a filesystem is a normal everyday thing that doesn't require any special administrative privileges, while making ZFS snapshots (and clones) requires special administrative privileges (either from being root or from having had them specifically delegated to you). But you can't tell them apart in a hierarchy, because ZFS only you offers the binary choice of ignoring all space used by descendants (regardless of how it occurs) or ignoring none of it, sweeping up specially privileged operations like creating snapshots with ordinary activities like writing files.

This limitation affects our pool space limits, because we use them for two different purposes; restricting people to only the space that they've purchased and insuring that pools always have a safety margin of space. Since pools contain many filesystems, we must limit their total space usage using the quota property. But that means that any snapshots we make for administrative purposes consume space that's been purchased, and if we make too many of them we'll run the pool out of space for completely artificial reasons. It would be better to be able to have two quotas, one for the space that the group has purchased (which would limit only regular filesystem activity) and one for our pool safety margin (which would limit snapshots too).

(This wouldn't completely solve the problem, though, since snapshots still consume space and if we made too many of them we'd run a pool that should have free space out of even its safety margin. But it would sometimes make things easier.)

PS: I thought this had more of an impact on our operations and the features we can reasonable offer to people, but the more I think about it the more it doesn't. Partly this is because we don't make much use of snapshots, though, for various reasons that sort of boil down to 'the natural state of disks is usually full'. But that's for another entry.


Comments on this page:

Do ZFS reservations have any practical use for the two purposes that you outline?

Written on 30 June 2020.
« How Prometheus Blackbox's TLS certificate metrics would have reacted to AddTrust's root expiry
In ZFS, your filesystem layout needs to reflect some of your administrative structure »

Page tools: View Source, View Normal, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Tue Jun 30 22:17:27 2020
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.