ZFS bookmarks and what they're good for

February 22, 2017

Regular old-fashioned ZFS has filesystems and snapshots. Recent versions of ZFS add a third kind of object, called bookmarks. Bookmarks are described like this in the zfs manpage (for the 'zfs bookmark' command):

Creates a bookmark of the given snapshot. Bookmarks mark the point in time when the snapshot was created, and can be used as the incremental source for a zfs send command.

ZFS on Linux has an additional explanation here:

A bookmark is like a snapshot, a read-only copy of a file system or volume. Bookmarks can be created extremely quickly, compared to snapshots, and they consume no additional space within the pool. Bookmarks can also have arbitrary names, much like snapshots.

Unlike snapshots, bookmarks can not be accessed through the filesystem in any way. From a storage standpoint a bookmark just provides a way to reference when a snapshot was created as a distinct object. [...]

The first question is why you would want bookmarks at all. Right now bookmarks have one use, which is saving space on the source of a stream of incremental backups. Suppose that you want to use zfs send and zfs receive to periodically update a backup. At one level, this is no problem:

zfs snapshot pool/fs@current
zfs send -Ri previous pool/fs@current | ...

The problem with this is that you have to keep the previous snapshot around on the source filesystem, pool/fs. If space is tight and there is enough data changing on pool/fs, this can be annoying; it means, for example, that if people delete some files to free up space for other people, they actually haven't done so because the space is being held down by that snapshot.
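
To see how much space snapshots are actually holding down on the source, you can ask ZFS for its space accounting. What follows is a minimal sketch, not part of any real tooling: the function name snapshot_space is made up for illustration, and pool/fs is just the example filesystem from above. It relies on the 'usedbysnapshots' property and on 'zfs list -t snapshot', both of which exist in current OpenZFS.

```shell
# Hypothetical helper: report how much space snapshots hold on a dataset.
snapshot_space() {
    fs="$1"
    # Space that would be freed if every snapshot of $fs were destroyed.
    zfs get -H -o value usedbysnapshots "$fs"
    # Space unique to each individual snapshot of $fs (direct children only).
    zfs list -H -t snapshot -d 1 -o name,used "$fs"
}

# Usage: snapshot_space pool/fs
```

The first number is the one that matters for the "people deleted files but nothing was freed" situation; the per-snapshot figures only count space unique to each snapshot.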

The purpose of bookmarks is to allow you to do these incremental sends without consuming extra space on the source filesystem. Rather than keeping the previous snapshot around, you make a bookmark based on it, delete the snapshot, and then do the incremental zfs send using the bookmark:

zfs snapshot pool/fs@current
zfs send -i pool/fs#previous pool/fs@current | ...
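
The whole cycle (send, make a bookmark, delete the snapshot) can be sketched as a pair of shell functions. This is a sketch under assumptions rather than a tested backup script: the function names are made up, the dataset and snapshot names are placeholders, and the initial full send is not shown. Note that a bare #previous would need quoting in most shells, since an unquoted word starting with '#' begins a comment; writing the full pool/fs#previous form sidesteps that.

```shell
# Hypothetical helpers; dataset and snapshot names are placeholders.

# After a successful send of $fs@$snap, keep only a bookmark on the source.
retire_to_bookmark() {
    fs="$1"; snap="$2"
    # Create the bookmark first; only destroy the snapshot if that worked.
    zfs bookmark "$fs@$snap" "$fs#$snap" &&
        zfs destroy "$fs@$snap"
}

# Snapshot $fs as $cur and send the changes since bookmark $fs#$prev.
# The stream goes to stdout, so pipe it into 'zfs receive' yourself.
send_from_bookmark() {
    fs="$1"; prev="$2"; cur="$3"
    zfs snapshot "$fs@$cur" &&
        zfs send -i "$fs#$prev" "$fs@$cur"
}

# Usage (backup/fs is an assumed destination that still has @previous):
#   send_from_bookmark pool/fs previous current | zfs receive backup/fs
#   retire_to_bookmark pool/fs current
```

Creating the bookmark before destroying the snapshot matters, because once the snapshot is gone there is nothing left to make a bookmark from.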

This is apparently not quite as fast as using a snapshot, but if you're using bookmarks here it's because the space saving is worth it, possibly in combination with not having to worry about unpredictable fluctuations in how much space a snapshot is holding down as the amount of churn in the filesystem varies.

(We have a few filesystems that get frequent snapshots for fast recovery of user-deleted files, and we live in a certain amount of concern that someday, someone will dump a bunch of data on the filesystem, wait just long enough for a scheduled snapshot to happen, and then either move the data elsewhere or delete it. Sorting that one out to actually get the space back would require deleting at least some snapshots.)

Using bookmarks does require you to keep the previous snapshot on the destination (aka backup) filesystem, although the manpage only tells you this by implication. I believe this means that while you're receiving a new incremental, the destination may need extra space over and above what the current snapshot itself requires, since you can't delete the previous snapshot and recover its space until the incremental receive finishes. The relevant bit from the manpage is:

If an incremental stream is received, then the destination file system must already exist, and its most recent snapshot must match the incremental stream's source. [...]

This means that the destination filesystem must have a snapshot. This snapshot will and must match a bookmark made from it, since otherwise incremental send streams from bookmarks wouldn't work.
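
One way to check this match before sending is to compare the 'guid' property, which a bookmark takes from the snapshot it was made from and which a received snapshot keeps. The following is a sketch only: same_point_in_time is a made-up name, backup/fs is an assumed destination, and it assumes both pools are visible from the same host (for a remote backup you would have to run the second 'zfs get' on the other machine).

```shell
# Hypothetical helper: succeed if a source bookmark and a destination
# snapshot refer to the same point in time, judged by their 'guid' property.
same_point_in_time() {
    bookmark="$1"; snapshot="$2"
    src_guid="$(zfs get -H -o value guid "$bookmark")" || return 1
    dst_guid="$(zfs get -H -o value guid "$snapshot")" || return 1
    # Refuse to call two empty values a match.
    [ -n "$src_guid" ] && [ "$src_guid" = "$dst_guid" ]
}

# Usage:
#   same_point_in_time 'pool/fs#previous' backup/fs@previous && echo match
```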

(In theory bookmarks could also be used to generate an imprecise 'zfs diff' without having to keep the origin snapshot around. In practice I doubt anyone is going to implement this, and why it's necessarily imprecise requires an explanation of why and how bookmarks work.)


Comments on this page:

Interesting feature. From my quick scans of the docs when I was looking into them, I was under the impression that it had more to do with marking (and keeping) transaction groups alive so they can be used as a point-in-time marker.

By cks at 2017-02-23 12:42:26:

I don't think bookmarks keep transaction groups alive, just record the txg number and time. That's why they're only imperfect substitutes for snapshots (which do keep the txg alive).

By Michael Nino at 2017-06-30 14:49:18:

We often snapshot Staging when we refresh data from Production. Creating a Bookmark allows us to track our data refreshes without actually consuming storage space in an environment that doesn't need it.

By Mark Costlow <cheeks@swcp.com> at 2019-07-10 18:14:39:

After reading about bookmarks (a few times) I'm still not 100% clear on why they don't consume space the way a snapshot would. (Like how is the delta from the bookmark to a newer snapshot accurate if the bookmark hasn't preserved the state of the volume the way a snapshot would?) I'll keep reading on that.

The reason I'm commenting is to say I've found another use case for bookmarks. I want to do periodic backups using ZFS replication, as in your example. But the source filesystem is itself the target of other ZFS replications. In other words, client machine backs up to zfs1, and zfs1 periodically backs up to an encrypted offsite disk, with each of these backups executed via "zfs send/recv".

The problem is the act of creating a snapshot for the offsite copy causes the next replication from a client machine to fail because the most recent snapshot on the destination does not match the incremental source.

My answer (works in testing, working on deployment) is: after send/recv of the "offsite" snapshot to the offsite disk, make a bookmark from the offsite snapshot, then delete the offsite snapshot. When this disk rotates back in a few weeks later, I still have that bookmark on zfs1 that I can use as the origin to send a new incremental. In the meantime, the existence of that bookmark does NOT prevent further sends TO zfs1.

The space savings will be an extra bonus ... with several offsite disks in a rotation, the bookmarks may need to live for weeks but they won't prevent space from being reclaimed if/when newer snapshots are deleted.

Written on 22 February 2017.

Last modified: Wed Feb 22 23:58:39 2017