ZFS bookmarks and what they're good for
Regular old fashioned ZFS has filesystems and snapshots. Recent
versions of ZFS add a third object, called bookmarks. Bookmarks
are described like this in the
zfs manpage (for the '
Creates a bookmark of the given snapshot. Bookmarks mark the point in time when the snapshot was created, and can be used as the incremental source for a
ZFS on Linux has an additional explanation here:
A bookmark is like a snapshot, a read-only copy of a file system or volume. Bookmarks can be created extremely quickly, compared to snapshots, and they consume no additional space within the pool. Bookmarks can also have arbitrary names, much like snapshots.
Unlike snapshots, bookmarks can not be accessed through the filesystem in any way. From a storage standpoint a bookmark just provides a way to reference when a snapshot was created as a distinct object. [...]
The first question is why you would want bookmarks at all. Right
now bookmarks have one use, which is saving space on the source of
a stream of incremental backups. Suppose that you want to use
zfs receive to periodically update a backup. At one level,
this is no problem:
zfs snapshot pool/fs@current zfs send -Ri previous pool/fs@current | ...
The problem with this is that you have to keep the
snapshot around on the source filesystem,
pool/fs. If space is
tight and there is enough data changing on
pool/fs, this can be
annoying; it means, for example, that if people delete some files
to free up space for other people, they actually haven't done so
because the space is being held down by that snapshot.
The purpose of bookmarks is to allow you to do these incremental
sends without consuming extra space on the source filesystem. Instead
of having to keep the
previous snapshot around, you instead make
a bookmark based on it, delete the snapshot, and then do the
zfs send using the bookmark:
zfs snapshot pool/fs@current zfs send -i #previous pool/fs@current | ...
This is apparently not quite as fast as using a snapshot, but if you're using bookmarks here it's because the space saving is worth it, possibly in combination with not having to worry about unpredictable fluctuations in how much space a snapshot is holding down as the amount of churn in the filesystem varies.
(We have a few filesystems that get frequent snapshots for fast recovery of user-deleted files, and we live in a certain amount of concern that someday, someone will dump a bunch of data on the filesystem, wait just long enough for a scheduled snapshot to happen, and then either move the data elsewhere or delete it. Sorting that one out to actually get the space back would require deleting at least some snapshots.)
Using bookmarks does require you to keep the
on the destination (aka backup) filesystem, although the manpage
only tells you this by implication. I believe that this implies
that while you're receiving a new incremental, you may need extra
space over and above what the
current snapshot requires for space,
since you won't be able to delete
previous and recover its space
until the incremental receive finishes. The relevant bit from the
If an incremental stream is received, then the destination file system must already exist, and its most recent snapshot must match the incremental stream's source. [...]
This means that the destination filesystem must have a snapshot. This snapshot will and must match a bookmark made from it, since otherwise incremental send streams from bookmarks wouldn't work.
(In theory bookmarks could also be used to generate an imprecise
zfs diff' without having to keep the origin snapshot around. In
practice I doubt anyone is going to implement this, and why it's
necessarily imprecise requires an explanation of why and how bookmarks