ZFS on Linux and when you get stale NFSv3 mounts

March 9, 2023

Suppose that you have ZFS-based NFS servers that you're moving from Ubuntu 18.04 to 22.04. These servers have a lot of NFS-exported filesystems that are mounted and used by a lot of clients, so it would be very convenient if you could upgrade the ZFS fileservers without having to unmount and remount the filesystems on all of your clients. Conversely, if a particular way of moving from 18.04 to 22.04 is going to require you to unmount all of a server's filesystems, you'd like to know that in advance so you can prepare for it, rather than find out after the fact when clients start getting 'stale NFS handle' errors. Since we've just been through some experiences with this, I'm going to write down what we've observed.

There are at least three ways to move a ZFS fileserver from Ubuntu 18.04 to Ubuntu 22.04. I'll skip upgrading it in place because we don't have any experience with that; we upgrade machines by reinstalling them from scratch. That leaves two approaches for a ZFS server, which I will call a forklift upgrade and a migration. In a forklift upgrade, you build new system disks, then swap them in by exporting the ZFS pools, changing system disks, booting your new 22.04 system, and importing the pools back.

(As a version of the forklift upgrade you can reuse your current system disks, although this means you can't readily revert.)
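The pool-handling part of a forklift upgrade can be sketched as a dry run that only prints the commands involved. This is a minimal sketch, not our actual procedure: the pool names are hypothetical, and it leaves out everything else you would do around the swap (stopping NFS service, changing the system disks, rebooting).

```python
import shlex


def forklift_commands(pools):
    """Return (before_swap, after_swap) command lists for the given pools.

    before_swap is run on the old 18.04 install, after_swap on the new
    22.04 install once it has booted from the new system disks.
    """
    before_swap = [["zpool", "export", pool] for pool in pools]
    after_swap = [["zpool", "import", pool] for pool in pools]
    return before_swap, after_swap


def render(commands):
    """Turn argument lists into shell-quoted strings for display."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # "tank1" and "tank2" are made-up pool names for illustration.
    before, after = forklift_commands(["tank1", "tank2"])
    print("On the old system:")
    for line in render(before):
        print("  " + line)
    print("On the new system:")
    for line in render(after):
        print("  " + line)
```

Nothing here touches a real pool; it only builds and prints command lines, so it is safe to run anywhere.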

Our experience with these in-place 'export pools, swap system disks, import pools' forklift upgrades is that client NFSv3 mounts survive them. Your NFS clients will stall while your ZFS NFS server is away, but once it's back (under the right host name and IP address), they resume their activities and pick right back up where they were. We've also had no problems with ZFS pools when we reboot our servers with changed hostnames; changing the server's hostname doesn't stop ZFS on Linux from bringing the pools up on boot.

However, forklift upgrades can only be done on ZFS fileservers where you have separate system disks and ZFS pool disks. We have one fileserver where this isn't possible; it has only four disks and shares all of them between system filesystems and its ZFS pool. For this machine we did a migration, where we built a new version of the system using new disks on new hardware, then moved the ZFS data over with ZFS snapshots (as I thought we might have to). Once the data was migrated, we shut down the old server and made the new hardware take over the name, IP address, and so on.
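The data-moving step of such a migration might look roughly like the following dry-run sketch. The dataset, snapshot, and host names are hypothetical, and the particular flags ('zfs snapshot -r', 'zfs send -R', 'zfs receive -F') are my assumption about one common way to replicate a filesystem tree, not a record of exactly what we ran. A real migration would probably also do incremental sends to catch up before the final cutover, which this omits.

```python
import shlex


def migration_commands(dataset, snapshot, new_host):
    """Build the snapshot command and the send/receive pipeline as strings.

    dataset  -- ZFS filesystem to move, e.g. "tank/home" (hypothetical)
    snapshot -- snapshot name to create and send (hypothetical)
    new_host -- host name of the new server, reached over ssh (hypothetical)
    """
    snap = f"{dataset}@{snapshot}"
    # Recursive snapshot of the filesystem and its descendants.
    take = ["zfs", "snapshot", "-r", snap]
    # Replication stream of that snapshot, received on the new server.
    send = ["zfs", "send", "-R", snap]
    recv = ["ssh", new_host, "zfs", "receive", "-F", dataset]
    pipeline = f"{shlex.join(send)} | {shlex.join(recv)}"
    return shlex.join(take), pipeline


if __name__ == "__main__":
    take, pipeline = migration_commands("tank/home", "migrate1", "newfs")
    print(take)
    print(pipeline)
```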

Unfortunately for us, when we did this migration, NFS clients got stale NFS mounts. The new version of this fileserver had the same filesystem with the exact same contents (ZFS snapshots and snapshot replication ensure that), the same exports, and so on, but the NFS filehandles came out different. It's possible that we could have worked around this if we had set an explicit 'fsid=' value in our NFS export for the filesystem (as per exports(5)), but it's also possible that there were other differences in the NFS filehandle.
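For illustration, an export entry with an explicit 'fsid=' option could be generated like this. The path, client name, and fsid number are all made up, and this only shows the exports(5) syntax; I haven't verified that pinning the fsid is enough on its own to keep filehandles valid across a migration like ours.

```python
def export_line(path, clients, fsid, options=("rw",)):
    """Format one exports(5)-style line with an explicit fsid= option.

    path    -- exported directory (hypothetical example value)
    clients -- list of client specifications allowed to mount it
    fsid    -- the number to pin; it must be the same on the old and
               the new server for this workaround to have a chance
    options -- other export options, placed before fsid=
    """
    opts = ",".join([*options, f"fsid={fsid}"])
    return path + " " + " ".join(f"{client}({opts})" for client in clients)


if __name__ == "__main__":
    # All values here are placeholders, not our real configuration.
    print(export_line("/tank/home", ["client1.example.org"], 4660))
```

The point of generating the line rather than writing it by hand is that the same fsid value can be kept in one place and applied to both the old and the new server's exports.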

(ZFS has a notion of a 'fsid' and a 'guid' for ZFS filesystems (okay, datasets), and zdb can in theory dump this information, but right now I can't work out how to go from a filesystem name in a pool to reading out its ZFS fsid, so I can't see if it's preserved over ZFS snapshot replication or if the receiver generates a new one.)

Comments on this page:

By Arnaud Gomes at 2023-03-10 03:44:57:

We solved this by explicitly setting fsid on the new export to the value the Linux kernel shows for the old one (somewhere in /proc/fs/nfsd I think; I don't have any NFS server in front of me to check right now).

The other non-obvious issue we ran into is that you can't move NFS exports one by one; on a given client you have to move all mounts from a given server at once, even if you use several different IP addresses server-side. We tried moving one IP address at a time, and the client ended up confused. This was probably NFSv4 though; v3 may be different.

Written on 09 March 2023.

Last modified: Thu Mar 9 22:38:51 2023