The difference in available pool space between zfs list and zpool list
For a while I've noticed that 'zpool list' would report that our pools
had more available space than 'zfs list' did and I've vaguely wondered
about why. We recently had a very serious issue due to a pool filling
up, so suddenly I became very interested in the whole issue and did
some digging. It turns out that there are two sources of the difference
depending on how your vdevs are set up.
For raidz vdevs, the simple version is that 'zpool list' reports more
or less the raw disk space before the raidz overhead, while 'zfs list'
applies the standard estimate that you expect (ie that N disks worth of
space will vanish for a raidz level of N). Given that raidz overhead is
variable in ZFS, it's easy to see why the two commands are behaving this
way.
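To make the two views concrete, here is a toy illustration (this is not ZFS code; the disk count, disk size, and parity level are made-up numbers):

```python
# A toy sketch of the two views of a raidz vdev's space, under the
# assumption described above; not actual ZFS accounting code.

def zpool_list_view(disks, disk_size):
    # 'zpool list' reports more or less the raw space of all disks,
    # before any raidz parity overhead.
    return disks * disk_size

def zfs_list_view(disks, disk_size, parity):
    # 'zfs list' applies the standard estimate: a raidz level of N
    # costs you N disks' worth of space.
    return (disks - parity) * disk_size

# Example: six 4 TB disks in raidz2 (parity level 2).
TB = 10**12
print(zpool_list_view(6, 4 * TB) // TB)      # 24 TB of raw space
print(zfs_list_view(6, 4 * TB, 2) // TB)     # 16 TB estimated usable
```

The 8 TB gap between the two numbers here is the kind of discrepancy the two commands show for raidz pools.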
In addition, in general ZFS reserves a certain amount of pool space for various reasons, for example so that you can remove files even when the pool is 'full' (since ZFS is a copy on write system, removing files requires some new space to record the changes). This space is sometimes called 'slop space'. According to the code this reservation is 1/32nd of the pool's size. In my actual experimentation on our OmniOS fileservers this appears to be roughly 1/64th of the pool and definitely not 1/32nd of it, and I don't know why we're seeing this difference.
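The slop space arithmetic is simple to sketch. The shift-based form below follows the style of the Illumos code (which expresses the reservation as a power-of-two shift; the default shift of 5 gives 1/32nd), but this is an illustration, not the actual kernel code:

```python
# A sketch of the slop space reservation: 1/32nd of the pool by
# default, computed as a right shift the way the Illumos code
# expresses it (a shift of 5 means 1/2**5 = 1/32).

def slop_space(pool_size, slop_shift=5):
    # Reserve pool_size >> slop_shift bytes of the pool.
    return pool_size >> slop_shift

TB = 10**12
pool = 100 * TB
print(slop_space(pool))       # 1/32nd: what the code says is reserved
print(slop_space(pool) // 2)  # 1/64th: roughly what we actually observe
```

On a 100 TB pool that is the difference between about 3.1 TB and about 1.6 TB of reserved space, which is large enough to notice when a pool is close to full.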
(I found out all of this from a Ben Rockwood blog entry and then found the code in the current Illumos codebase to see what the current state was (or is).)
The actual situation with what operations can (or should) use what space
is complicated. Roughly speaking, user level writes and ZFS operations
like 'zfs create' and 'zfs snapshot' that make things should use the
1/32nd reserved space figure, file removes and 'neutral' ZFS operations
should be allowed to use half of the slop space (running the pool down
to 1/64th of its size), and some operations (like 'zfs destroy') have
no limit whatever and can theoretically run your pool permanently and
unrecoverably out of space.
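A toy model of those three classes of checks might look like the following. The class names are loosely inspired by the ZFS_SPACE_CHECK_* values the kernel uses, but the logic here is just an illustration of the rules described above, not the real implementation:

```python
# A toy model of the three classes of space checks described in the
# text; not actual ZFS code. Pool and free-space figures are made up.

def can_proceed(free_space, pool_size, check):
    slop = pool_size >> 5  # the 1/32nd slop space reservation
    if check == "normal":
        # User writes, 'zfs create', 'zfs snapshot': must leave the
        # full slop reservation (1/32nd of the pool) free.
        return free_space > slop
    elif check == "reserved":
        # Removes and 'neutral' operations: may dip into half of the
        # slop space, ie down to 1/64th of the pool.
        return free_space > slop // 2
    elif check == "none":
        # Things like 'zfs destroy': no limit at all.
        return True

pool = 32 * 2**30  # a hypothetical 32 GiB pool; slop is 1 GiB
print(can_proceed(2**29, pool, "normal"))        # False: under 1/32nd free
print(can_proceed(2**29 + 1, pool, "reserved"))  # True: still above 1/64th
print(can_proceed(0, pool, "none"))              # True: no limit
```

The middle case is why file removal still works on a 'full' pool, and the last case is how a pool can in theory be run permanently out of space.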
The final authority is the Illumos kernel code and its comments. These
days it's on Github so I can just link to the two most relevant bits:
spa_misc.c's discussion of slop space and dsl_synctask.h's discussion of
the space checks that various operations use.
(What I'm seeing with our pools would make sense if everything was actually being classified as an 'allowed to use half of the slop space' operation. I haven't traced the Illumos kernel code at this level so I have no idea how this could be happening; the comments certainly suggest that it isn't supposed to be.)
(This is the kind of thing that I write down so I can find it later, even though it's theoretically out there on the Internet already. Re-finding things on the Internet can be a hard problem.)