Why df on an NFS-mounted ZFS filesystem can give odd results

May 15, 2009

Suppose that you have a ZFS pool with various filesystems, and while the filesystems have data in them, you're just using the pool as a container; the top level pool filesystem has no data itself. In this situation, a df of the pool on the Solaris host will show something like:

Filesystem  size  used  avail
tank01      300G   20K    40G

Translated, we can see that the pool itself is 300G, the pool's top level filesystem has nothing in it, and there's 40G of unused space left in the pool; the rest is taken up by sub-filesystems, snapshots, and so on.

However, if you NFS mount the pool itself on a client and do a df on the client, what you will see is rather different:

Filesystem  size  used  avail
/tank01      40G   20K    40G

Suddenly your pool size has, well, disappeared.

Filesystems with quotas will show equally odd df results on NFS clients. If the pool has enough free space that the filesystem's size is limited by its quota, you will see the correct (quota-based) values for everything. However, if the pool starts running out of overall space, the (total) size of quota-limited filesystems starts shrinking, sometimes dramatically. All of this can be very alarming and upsetting to users, especially if it leads them to think that they haven't got space that they've paid for.
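To make the quota behaviour concrete, here is a minimal sketch in Python. It assumes the server reports a filesystem's size as its used space plus its available space (the behaviour the rest of this entry describes), and that the available space of a quota-limited filesystem is the smaller of its remaining quota and the pool's free space. The function name and the numbers are made up for illustration.

```python
GIB = 1024 ** 3

def quota_fs_report(quota, used, pool_avail):
    """Return (size, used, avail) as an NFS client's df might show them.

    Assumption: avail is capped by both the remaining quota and the
    pool's free space, and size is reported as used + avail.
    """
    avail = max(0, min(quota - used, pool_avail))
    return used + avail, used, avail

# Plenty of pool space: the quota is the limit, so size matches the quota.
roomy = quota_fs_report(quota=100 * GIB, used=30 * GIB, pool_avail=500 * GIB)

# Pool nearly full: the same filesystem now appears to be only 40G in size.
tight = quota_fs_report(quota=100 * GIB, used=30 * GIB, pool_avail=10 * GIB)
```

In the second call nothing about the filesystem itself has changed; only the pool's free space has, which is exactly what makes the shrinking size so confusing for users.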

It turns out that all of this is because of a fundamental limit in the NFS v3 protocol combined with a decision made by the ZFS code (or perhaps the overall Solaris NFS server code). Filesystem information is queried by the NFS v3 FSSTAT operation, but the structure it returns only contains information about the total filesystem size and the remaining available space; there is no explicit field for 'space used'.

(NFS v3 FSSTAT does draw a distinction between 'free space' and 'free space that can be allocated', so it can handle various sorts of overhead and reserved space.)
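As a rough illustration, the space-related fields of the FSSTAT reply can be modelled like this in Python. The field names tbytes, fbytes and abytes come from the NFS v3 specification; the class and function names are my own, and the way df derives 'used' (total minus free) is an assumption about typical client behaviour rather than something the protocol mandates.

```python
from dataclasses import dataclass

@dataclass
class FsstatSpace:
    tbytes: int  # total size of the filesystem, in bytes
    fbytes: int  # free space, in bytes
    abytes: int  # free space available to the caller, in bytes

def df_columns(reply):
    """Return (size, used, avail) the way a client's df might compute them.

    There is no 'used' field in the reply, so it has to be derived.
    """
    used = reply.tbytes - reply.fbytes
    return reply.tbytes, used, reply.abytes

GIB = 1024 ** 3
KIB = 1024

# A reply consistent with the client-side df output shown earlier.
reply = FsstatSpace(tbytes=40 * GIB + 20 * KIB, fbytes=40 * GIB, abytes=40 * GIB)
size, used, avail = df_columns(reply)
```

Since 'used' only exists as a subtraction on the client, the server can make at most two of the three df columns match what it shows locally.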

This creates a dilemma for ZFS: do you return accurate total size and space available, leading to potentially completely inaccurate used figures, or do you make the total size be the sum of the space used and the space available, so clients show a correct used figure? As we can see, Solaris has chosen the second option.

(Okay, there's a third option: you could return the correct total size and an available space figure that was total size minus the used space. I think this would be even crazier than the other options.)
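The three options can be worked through with the numbers from the example pool. This is only a sketch: it assumes the client derives 'used' as total size minus free space, treats 'free' and 'available' as the same figure for simplicity, and uses a hypothetical helper name.

```python
GIB = 1024 ** 3
KIB = 1024

def client_view(total, avail):
    """Return (size, used, avail) as a client's df would show them,
    with 'used' derived as total - avail (an assumption)."""
    return total, total - avail, avail

pool_size = 300 * GIB   # real size of the pool
fs_used = 20 * KIB      # data in the top level filesystem
pool_avail = 40 * GIB   # unused space left in the pool

# Option 1: accurate size and avail; 'used' comes out as 260G.
option1 = client_view(pool_size, pool_avail)

# Option 2 (what Solaris does): size = used + avail; 'used' is right,
# but the pool appears to be only about 40G in size.
option2 = client_view(fs_used + pool_avail, pool_avail)

# Option 3: accurate size and used; avail claims almost 300G is free.
option3 = client_view(pool_size, pool_size - fs_used)
```

Option 3 is the only one that would tell clients they can write far more data than the pool can actually hold, which is why it seems the worst of the lot.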
