ZFS quietly discards all-zero blocks, but only sometimes
On the ZFS on Linux mailing list, a question came up about whether
ZFS discards writes of all-zero blocks (as you'd get from 'dd
if=/dev/zero of=...'), turning them into holes in your files or,
especially, holes in your zvols. This matters most for
zvols, because if ZFS behaves this way it provides you with a way
of returning a zvol to a sparse state from inside a virtual machine
(or other environment using the zvol):
$ dd if=/dev/zero of=fillfile
[... wait for the disk to fill up ...]
$ rm -f fillfile
The answer turns out to be that ZFS does discard all-zero blocks
and turn them into holes, but only if you have some sort of compression
turned on (ie, that you don't have the default 'compression=off').
This isn't implemented as part of ZFS ZLE compression (or other
compression methods); instead, it's an entirely separate check that
looks only for an all-zero block and returns a special marker if
that's what it has. As you'd expect, this check is done before ZFS
tries whatever main compression algorithm you set.
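To make the ordering concrete, here is a minimal shell sketch of the idea. It is not ZFS's actual code: a block is first checked for being entirely zero, and only if that check fails does it go on to the compressor. The function names is_all_zero and store_block are made up for illustration, each 'block' is just a file, and gzip stands in for whatever compression algorithm you've set.

```sh
#!/bin/sh
# Illustrative only: hypothetical helpers, with gzip as a stand-in compressor.

# is_all_zero FILE: succeed if FILE contains nothing but NUL bytes.
is_all_zero() {
    # Delete every NUL byte and count what is left over.
    leftover=$(LC_ALL=C tr -d '\000' < "$1" | wc -c | tr -d ' ')
    [ "$leftover" -eq 0 ]
}

# store_block FILE: print "hole" for an all-zero block; otherwise print
# the number of bytes left after the stand-in compressor has run.
store_block() {
    if is_all_zero "$1"; then
        # The special marker: nothing needs to be written at all.
        echo hole
    else
        gzip -c < "$1" | wc -c | tr -d ' '
    fi
}
```

Feeding store_block a file made with 'dd if=/dev/zero' should print 'hole', while a file with any non-zero byte in it gets a byte count instead.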
Interestingly, there is a special compression level called 'empty'
(ZIO_COMPRESS_EMPTY) that only does this special 'discard
zeros' check. You can't set it from user level with something like
'compression=empty', but it's used internally in the ZFS code for
a few things. For instance, if you turn off metadata compression
with the zfs_mdcomp_disable tunable, metadata is still compressed
with this 'empty' compression. Comments in the current ZFS on Linux
source code suggest that ZFS relies on this to do things like discard
blocks in dnode object sets where all the
dnodes in the block are free (which apparently zeroes out the dnode).
There are two consequences of this. The first is that you should
always set at least ZLE compression on zvols, even if their
volblocksize is the same as your pool's ashift block size and
so they can't otherwise benefit from compression (this would also
apply to filesystems if you set an ashift-sized recordsize). The
second is that it
reinforces how you should basically always turn compression on for
filesystems, even if you think you have mostly incompressible data.
Not only do you save space at the end of files, but you get to drop any all-zero
sections of sparse or pseudo-sparse files.
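If you want to act on the first consequence, something like the following sketch would switch a zvol or filesystem to ZLE only when its compression is currently off. The function name ensure_compression and the ZFS_CMD variable are hypothetical, as is the dataset name tank/vm-disk used below; 'zfs get -H -o value' and 'zfs set' are the ordinary zfs subcommands.

```sh
#!/bin/sh
# ZFS_CMD is a hypothetical knob so the zfs binary can be swapped for a stub.
ZFS_CMD=${ZFS_CMD:-zfs}

# ensure_compression DATASET: if compression is "off", set it to ZLE.
# Any other existing compression setting is left alone.
ensure_compression() {
    # -H drops the header line, -o value prints just the property value.
    current=$("$ZFS_CMD" get -H -o value compression "$1") || return 1
    if [ "$current" = "off" ]; then
        "$ZFS_CMD" set compression=zle "$1"
    fi
}
```

You would run it as 'ensure_compression tank/vm-disk'. Changing the property only affects blocks written afterward, so existing all-zero blocks stay allocated until they are rewritten.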
(Looking back, Richard Laager mentioned this zero block discarding for zvols back in a comment on this entry of mine, but apparently it didn't stick in my mind. Also, now I know the details.)
I took a quick look back through the history of ZFS's code, and as
far as I could see, this zero-block discarding has always been
there, right back to the beginnings of compression (which I believe
came in with ZFS itself).
ZIO_COMPRESS_EMPTY doesn't quite date
back that far; instead, it was introduced along with
zfs_mdcomp_disable, back in 2006.
(All of this is thanks to Gordan Bobic for raising the question in reply to me when I was confidently wrong, which led to me actually looking it up in the code.)