ZFS bookmarks and what they're good for

February 22, 2017

Regular old fashioned ZFS has filesystems and snapshots. Recent versions of ZFS add a third object, called bookmarks. Bookmarks are described like this in the zfs manpage (for the 'zfs bookmark' command):

Creates a bookmark of the given snapshot. Bookmarks mark the point in time when the snapshot was created, and can be used as the incremental source for a zfs send command.

ZFS on Linux has an additional explanation here:

A bookmark is like a snapshot, a read-only copy of a file system or volume. Bookmarks can be created extremely quickly, compared to snapshots, and they consume no additional space within the pool. Bookmarks can also have arbitrary names, much like snapshots.

Unlike snapshots, bookmarks can not be accessed through the filesystem in any way. From a storage standpoint a bookmark just provides a way to reference when a snapshot was created as a distinct object. [...]

The first question is why you would want bookmarks at all. Right now bookmarks have one use, which is saving space on the source of a stream of incremental backups. Suppose that you want to use zfs send and zfs receive to periodically update a backup. At one level, this is no problem:

zfs snapshot pool/fs@current
zfs send -Ri previous pool/fs@current | ...

The problem with this is that you have to keep the previous snapshot around on the source filesystem, pool/fs. If space is tight and there is enough data changing on pool/fs, this can be annoying; it means, for example, that if people delete some files to free up space for other people, they actually haven't done so because the space is being held down by that snapshot.

The purpose of bookmarks is to allow you to do these incremental sends without consuming extra space on the source filesystem. Instead of having to keep the previous snapshot around, you instead make a bookmark based on it, delete the snapshot, and then do the incremental zfs send using the bookmark:

zfs snapshot pool/fs@current
zfs send -i #previous pool/fs@current | ...

This is apparently not quite as fast as using a snapshot, but if you're using bookmarks here it's because the space saving is worth it, possibly in combination with not having to worry about unpredictable fluctuations in how much space a snapshot is holding down as the amount of churn in the filesystem varies.

(We have a few filesystems that get frequent snapshots for fast recovery of user-deleted files, and we live in a certain amount of concern that someday, someone will dump a bunch of data on the filesystem, wait just long enough for a scheduled snapshot to happen, and then either move the data elsewhere or delete it. Sorting that one out to actually get the space back would require deleting at least some snapshots.)

Using bookmarks does require you to keep the previous snapshot on the destination (aka backup) filesystem, although the manpage only tells you this by implication. I believe that this implies that while you're receiving a new incremental, you may need extra space over and above what the current snapshot requires for space, since you won't be able to delete previous and recover its space until the incremental receive finishes. The relevant bit from the manpage is:

If an incremental stream is received, then the destination file system must already exist, and its most recent snapshot must match the incremental stream's source. [...]

This means that the destination filesystem must have a snapshot. This snapshot will and must match a bookmark made from it, since otherwise incremental send streams from bookmarks wouldn't work.

(In theory bookmarks could also be used to generate an imprecise 'zfs diff' without having to keep the origin snapshot around. In practice I doubt anyone is going to implement this, and why it's necessarily imprecise requires an explanation of why and how bookmarks work.)

Written on 22 February 2017.
« Sometimes it can be hard to tell one cause of failure from another
How ZFS bookmarks can work their magic with reasonable efficiency »

Page tools: View Source, Add Comment.
Search:
Login: Password:
Atom Syndication: Recent Comments.

Last modified: Wed Feb 22 23:58:39 2017
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.