Wandering Thoughts archives


In ZFS, your filesystem layout needs to reflect some of your administrative structure

One of the issues we sometimes run into with ZFS is that it essentially requires your administrative structure for allocating and reserving space to be reflected in how you lay out ZFS filesystems and filesystem hierarchies. This is because in ZFS, all space management is handled through the hierarchy of filesystems (and perhaps by having multiple pools). If you want to make two separate amounts of space available to two separate sets of filesystems (or reserve space collectively for each set), either the sets must be in different pools or they must be under different dataset hierarchies within the pool.

(These hierarchies don't have to be visible to users, because you can mount ZFS filesystems under whatever names you want, but they exist in the dataset hierarchy in the pool itself and you'll periodically need to know them, because some commands require the full dataset name and don't work when given the mount point.)

That sounds abstract, so let me make it concrete. Simplifying only slightly, our filesystems here are visible to people as /h/NNN (for home directories) and /w/NNN (workdirs, for everything else). They come from some NFS server and live in some ZFS pool there (inside little container filesystems), but the NFS server and to some extent the pool are implementation details. Each research group has its own ZFS pool (or for big ones, more than one pool because one pool can only be so big), as do some individual professors. However, it's not infrequent for a professor in a group pool to want to buy extra space that is only for their students, and such a professor usually has several different filesystems in the pool (often a mixture of /h/NNN homedir filesystems and /w/NNN workdir ones).

This is theoretically possible in ZFS, but in order to implement it ZFS would force us to put all of a professor's filesystems under a sub-hierarchy in the pool. Instead of the current tank/h/100 and tank/w/200, they would have to be something like tank/prof/h/100 and tank/prof/w/200. The ZFS dataset structure is required to reflect the administrative structure of how people buy space. One of the corollaries of this is that you can basically only have a single administrative structure for how you allocate space, because a dataset can only be in one place in the ZFS hierarchy.
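To illustrate why the grouping has to live in the hierarchy, here is a toy Python model of the idea, not of ZFS itself. It assumes only that a quota on a dataset covers that dataset plus everything under it and that every dataset has exactly one parent. The Dataset class, the 300 GB and 1000 GB figures, and tank/h/101 are all made up for the illustration.

```python
GB = 10**9


class Dataset:
    """Toy stand-in for a ZFS dataset; not a real ZFS interface."""

    def __init__(self, name, parent=None, quota=None):
        self.name = name
        self.parent = parent  # exactly one parent, as in a real pool
        self.quota = quota    # limit for this dataset plus all descendants
        self.used = 0         # space used by this dataset plus all descendants

    def path(self):
        if self.parent is None:
            return self.name
        return self.parent.path() + "/" + self.name

    def chain(self):
        """Yield this dataset and then each ancestor up to the pool root."""
        node = self
        while node is not None:
            yield node
            node = node.parent

    def write(self, nbytes):
        # The write has to fit under every quota between here and the root.
        for node in self.chain():
            if node.quota is not None and node.used + nbytes > node.quota:
                raise OSError("quota exceeded at " + node.path())
        for node in self.chain():
            node.used += nbytes


# Hypothetical layout: the professor's filesystems grouped under tank/prof.
tank = Dataset("tank", quota=1000 * GB)
prof = Dataset("prof", tank, quota=300 * GB)
h100 = Dataset("100", Dataset("h", prof))
w200 = Dataset("200", Dataset("w", prof))
h101 = Dataset("101", Dataset("h", tank))  # someone else's homedir filesystem

h100.write(200 * GB)
w200.write(100 * GB)      # together these exactly fill the professor's 300 GB
try:
    w200.write(1 * GB)    # one more GB does not fit under tank/prof
    blocked = False
except OSError:
    blocked = True
h101.write(500 * GB)      # outside tank/prof, so only the pool limit applies
```

Since a dataset has only one chain of ancestors, there is nowhere in this model to put a second, overlapping grouping for the same filesystem, which is the corollary above.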

(So if two professors want to buy space separately for their filesystems but there's a filesystem shared between them, and they each want it to share in their space increase, you have a problem.)

If there were sub-groups of people who wanted to buy space collectively, we'd need an even more complicated dataset structure. Such sub-groups are not necessarily decided in advance, so we can't set up such a hierarchy when the filesystems are created; we'd likely wind up having to periodically modify the dataset hierarchy. Fortunately the manpages suggest that 'zfs rename' can be done without disrupting service to the filesystem, provided that the mountpoint doesn't change (which it wouldn't, since we force those to the /h/NNN and /w/NNN forms).
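As a rough sketch of what such a regrouping might involve, the following Python builds (but deliberately does not run) the zfs command lines for moving a professor's filesystems under a new sub-hierarchy and then limiting the group. The regroup_commands helper, the 'prof' group name and the 300G quota are hypothetical. It assumes that each filesystem already has an explicitly set mountpoint, so the rename leaves the user-visible path alone, and that names are given relative to the pool in the h/100 form.

```python
import shlex


def regroup_commands(pool, group, datasets, quota):
    """Return zfs command lines (argv lists) that move datasets under pool/group.

    datasets are names relative to the pool, such as "h/100". Nothing is
    executed here; the caller decides what to do with the plan.
    """
    cmds = []
    containers = []
    for rel in datasets:
        # "h/100" needs the container filesystem pool/group/h to exist first.
        container = f"{pool}/{group}/{rel.rsplit('/', 1)[0]}"
        if container not in containers:
            containers.append(container)
            cmds.append(["zfs", "create", "-p", container])
    for rel in datasets:
        cmds.append(["zfs", "rename", f"{pool}/{rel}", f"{pool}/{group}/{rel}"])
    # Set the group limit last, once everything is in its new place.
    cmds.append(["zfs", "set", f"quota={quota}", f"{pool}/{group}"])
    return cmds


plan = regroup_commands("tank", "prof", ["h/100", "w/200"], "300G")
for argv in plan:
    print(shlex.join(argv))
```

Anything like this should be tried on a scratch pool first. In particular the new container filesystems will inherit a mountpoint from the pool unless you arrange otherwise.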

While our situation is relatively specific to how we sell space, people operating ZFS can run into the same sort of situation any time they want to allocate or control collective space usage among a group of filesystems. There are plenty of places where you might have projects that get so much space but want multiple filesystems, or groups (and subgroups) that should be given specific allocations or reservations.

PS: One reason not to expose these administrative groupings to users is that they can change. If you expose the administrative grouping in the user-visible filesystem name and where a filesystem belongs shifts, everyone gets to change the name they use for it.

solaris/ZFSAdminVsFilesystemLayout written at 22:58:55
