How you migrate ZFS filesystems matters

September 14, 2018

If you want to move a ZFS filesystem from one host to another, you have two general approaches: you can use 'zfs send' and 'zfs receive', or you can use a user level copying tool such as rsync (or 'tar -cf | tar -xf', or any number of similar options). Until recently, I had considered these two approaches to be more or less equivalent apart from their convenience and speed (which generally tilted in favour of 'zfs send'). It turns out that this is not necessarily the case, and there are situations where you will want one instead of the other.
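To make the two approaches concrete, here is a minimal sketch that only builds the command lines involved and does not run anything. The dataset, snapshot, and host names are hypothetical placeholders, and real migrations usually need more options than shown (incremental sends, properties, and so on).

```python
import shlex


def send_receive_pipeline(src_dataset, snapshot, dest_host, dest_dataset):
    """Build a 'zfs send | ssh host zfs receive' pipeline as a string.

    Assumes the snapshot src_dataset@snapshot already exists.
    """
    send = ["zfs", "send", f"{src_dataset}@{snapshot}"]
    recv = ["ssh", dest_host, "zfs", "receive", dest_dataset]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


def rsync_command(src_dir, dest_host, dest_dir):
    """Build a user level copy with rsync in archive mode.

    The trailing slash on the source copies the directory's contents
    rather than the directory itself.
    """
    src = src_dir.rstrip("/") + "/"
    return shlex.join(["rsync", "-a", src, f"{dest_host}:{dest_dir}"])


if __name__ == "__main__":
    # Hypothetical names, purely for illustration.
    print(send_receive_pipeline("tank/home", "migrate", "newhost", "tank/home"))
    print(rsync_command("/tank/home", "newhost", "/tank/home"))
```

The first form ships the filesystem's existing on-disk objects as a stream, while the second recreates every file and directory on the destination through ordinary system calls.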

We have had two generations of ZFS fileservers so far, the Solaris ones and the OmniOS ones. When we moved from the first generation to the second generation, we migrated filesystems across using 'zfs send', including the filesystem with my home directory in it (we did this for various reasons). Recently I discovered that some old things in my filesystem didn't have file type information in their directory entries. ZFS has been adding file type information to directories for a long time, but not quite as long as my home directory has been on ZFS.
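One way to look for directory entries that lack file type information is to call readdir() directly and check the d_type field. The sketch below is Linux-specific and rests on assumptions: it hard-codes the 64-bit Linux layout of struct dirent, and the function name is made up for illustration. Other systems lay out struct dirent differently, and some have no d_type field at all.

```python
import ctypes
import os

DT_UNKNOWN = 0  # from <dirent.h>: the filesystem did not report a type


class Dirent(ctypes.Structure):
    # Assumed layout of struct dirent on 64-bit Linux.
    _fields_ = [
        ("d_ino", ctypes.c_uint64),
        ("d_off", ctypes.c_int64),
        ("d_reclen", ctypes.c_ushort),
        ("d_type", ctypes.c_ubyte),
        ("d_name", ctypes.c_char * 256),
    ]


libc = ctypes.CDLL(None, use_errno=True)
libc.opendir.argtypes = [ctypes.c_char_p]
libc.opendir.restype = ctypes.c_void_p
libc.readdir.argtypes = [ctypes.c_void_p]
libc.readdir.restype = ctypes.POINTER(Dirent)
libc.closedir.argtypes = [ctypes.c_void_p]
libc.closedir.restype = ctypes.c_int


def entries_without_type(path):
    """Return the names in 'path' whose d_type is DT_UNKNOWN."""
    dirp = libc.opendir(os.fsencode(path))
    if not dirp:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    missing = []
    try:
        while True:
            ent = libc.readdir(dirp)
            if not ent:  # NULL pointer: end of directory
                break
            name = ent.contents.d_name
            if name in (b".", b".."):
                continue
            if ent.contents.d_type == DT_UNKNOWN:
                missing.append(os.fsdecode(name))
    finally:
        libc.closedir(dirp)
    return missing
```

Programs that see DT_UNKNOWN have to fall back to an lstat() on each such entry to learn its type, which is why missing type information is noticeable in practice.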

This illustrates an important difference between the 'zfs send' approach and the rsync approach, which is that 'zfs send' leaves at least some ZFS on-disk data structures unchanged, where re-writing everything from scratch at user level recreates them in their current form. There are both positives and negatives to this, and a certain amount of rewriting does happen even in the 'zfs send' case (for example, all of the block pointers get changed, and ZFS will re-compress your data as applicable).

I knew that in theory you had to copy things at the user level if you wanted to make sure that your ZFS filesystem and everything in it was fully up to date with the latest ZFS features. But I didn't expect to hit a situation where it mattered in practice until, well, I did. Now I suspect that old files on our old filesystems may be partially missing a number of things, and I'm wondering how much of the various changes in 'zfs upgrade -v' apply even to old data.

(I'd run into this sort of general thing before when I looked into ext3 to ext4 conversion on Linux.)

With all that said, I doubt this will change our plans for migrating our ZFS filesystems in the future (to our third generation fileservers). ZFS sending and receiving is just too convenient, too fast and too reliable to give up. Rsync isn't bad, but it's not the same, and so we only use it when we have to (when we're moving only some of the people in a filesystem instead of all of them, for example).

PS: I was going to try to say something about what 'zfs send' did and didn't update, but having looked briefly at the code I've concluded that I need to do more research before running my keyboard off. In the meantime, you can read the OpenZFS wiki page on ZFS send and receive, which has plenty of juicy technical details.

PPS: Since eliminating all-zero blocks is a form of compression, you can turn zero-filled files into sparse files through a ZFS send/receive if the destination has compression enabled. As far as I know, genuine sparse files on the source will stay sparse through a ZFS send/receive even if they're sent to a destination with compression off.
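A rough way to check whether a file ended up sparse after a migration is to compare its length with the space actually allocated for it. This is a heuristic sketch with made-up function names, and it assumes st_blocks is counted in 512-byte units (as it is on Linux). On a filesystem with compression enabled, an ordinary compressed file will also show less allocated space than its length, so this cannot tell holes apart from compression.

```python
import os


def looks_sparse(size_bytes, blocks_512):
    """True when fewer bytes are allocated than the file's length."""
    return blocks_512 * 512 < size_bytes


def file_looks_sparse(path):
    """Apply the heuristic to a real file using os.stat()."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

Running file_looks_sparse() on the same file on the source and the destination gives a quick before-and-after comparison.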

Written on 14 September 2018.

Last modified: Fri Sep 14 00:17:58 2018