How we guarantee there's always some free space in our ZFS pools
One of the things that we discovered fairly early on in our experience
with ZFS (I think within the lifetime of the first generation
Solaris fileservers) is that ZFS gets very
unhappy if you let a pool get completely full. The situation has
improved since then, but back in the day we couldn't even change
ZFS properties, much less remove files as root. Being unable to
change properties is a serious issue for us because NFS exports
are controlled by ZFS properties, so if we had a full pool we
couldn't modify filesystem exports to cut off access from client
machines that were constantly filling up the filesystem.
(At one point we resorted to cutting off a machine at the firewall, which is a pretty drastic step. Going this far isn't necessary for machines that we run, but we also NFS export filesystems to machines that other trusted sysadmins run.)
To stop this from happening, we use pool-wide quotas. No matter
how much space people have purchased in a pool or even if this is a system pool
that we operate, we insist that it always have a minimum safety
margin, enforced through a 'quota=' setting on the root of the
pool. When people haven't purchased enough to use all of the pool's
current allocated capacity, this safety margin is implicitly the
space they haven't bought. Otherwise, we have two minimum margins.
The explicit minimum margin is that our scripts that manage pool
quotas always insist on a 10 MB safety margin. The implicit
minimum margin is that we normally only set pool quotas in full GB,
so a pool can be left with several hundred MB of space between its
real maximum capacity and the nearest full GB.
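As a rough illustration of how these margins combine, here is a minimal Python sketch. It is not our actual script; the function names, the pool name and the use of binary units (1 GB = 2^30 bytes) are all assumptions made for the example. It takes a pool's real maximum capacity and the amount of space purchased, and picks a whole-GB quota that leaves at least the 10 MB explicit margin.

```python
GB = 1024 ** 3          # assumption: binary units
MB = 1024 ** 2
MIN_MARGIN = 10 * MB    # the explicit minimum safety margin


def pool_quota_gb(capacity_bytes: int, purchased_gb: int) -> int:
    """Pick a whole-GB pool quota (hypothetical helper).

    capacity_bytes is the pool's real maximum capacity, i.e. what is
    left after ZFS's own internal reservation.
    """
    # Largest whole number of GB that still leaves MIN_MARGIN free.
    max_quota_gb = (capacity_bytes - MIN_MARGIN) // GB
    if max_quota_gb < 1:
        raise ValueError("pool is too small to leave a safety margin")
    # If less has been purchased than the pool could hold, the unbought
    # space acts as the margin and the purchased amount is the quota.
    return min(purchased_gb, max_quota_gb)


def quota_command(pool: str, quota_gb: int) -> str:
    """Build the command that sets the quota on the root of the pool."""
    return f"zfs set quota={quota_gb}G {pool}"


# A pool with 500 GB + 300 MB of real capacity, fully purchased:
print(quota_command("tank", pool_quota_gb(500 * GB + 300 * MB, 600)))
```

In that last example the quota comes out as 500 GB, so the pool ends up with about 300 MB of margin purely from rounding down to a full GB. If the capacity were only 500 GB + 5 MB, rounding alone would leave less than 10 MB, so the sketch drops the quota to 499 GB.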
All of this pushes the problem back one level, which is determining
what the pool's actual capacity is so we can know where this safety
margin is. This is relatively straightforward for us because all
of our pools use mirrored vdevs, which means that the size reported by
'zpool list' is a true value for the total usable space (people
with raidz vdevs are on their own here). However, we must reduce
this raw capacity a bit, because ZFS reserves 1/32nd of the pool
for its own internal use. So we must reserve
at least 10 MB over and above this 1/32nd of the pool in order to
actually have a safety margin.
(All of this knowledge and math is embodied in a local script, so that we never have to do these calculations by hand or even remember the details.)
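The arithmetic itself is short. The following Python sketch is an illustration under stated assumptions, not the local script: the function names are made up, sizes are in bytes with binary units, and the pool size is assumed to be the exact figure for a pool of mirrored vdevs (on current OpenZFS, something like 'zpool list -Hp -o size POOL' should print it in bytes, but how you get it will depend on your platform).

```python
MB = 1024 ** 2
GB = 1024 ** 3
MIN_MARGIN = 10 * MB    # our explicit safety margin


def zfs_reserved(pool_size: int) -> int:
    """The 1/32nd of the pool that ZFS keeps for itself, rounded up."""
    return -(-pool_size // 32)   # ceiling division, to be conservative


def quota_ceiling(pool_size: int) -> int:
    """Highest quota in bytes that still leaves the 10 MB margin."""
    return pool_size - zfs_reserved(pool_size) - MIN_MARGIN


# Worked example: a pool that 'zpool list' reports as 1024 GB.
size = 1024 * GB
print(zfs_reserved(size) // GB)    # GB held back by ZFS itself
print(quota_ceiling(size) // GB)   # largest whole-GB quota
```

For a 1024 GB pool, ZFS's own reservation is 32 GB, which leaves 992 GB. Taking a further 10 MB off and rounding down to a full GB gives a maximum quota of 991 GB.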
PS: These days, in theory, you can change ZFS properties and even remove files when your pool is what ZFS will report as 100% full. But you need to be sure that you really are freeing up space when you do this, not using more because of things like snapshots. Very bad things happen to your pool if it gets genuinely full right up to ZFS's internal redline (which is past what ZFS will normally let you reach unless you trick it); you will probably have to back it up, destroy it, and recreate it to fully recover.
(This entry was sparked by a question from a commenter on yesterday's entry on how big our fileserver environment is.)