You can get 'stale filehandle' errors for local files on extN filesystems

May 26, 2011

Here's something interesting that we found out today (when another sysadmin here had it happen to him): it's possible to get 'stale filehandle' errors (i.e., an ESTALE errno) when you access local files, in fairly obscure situations and on the right filesystem. Specifically, on an ext2, ext3, or ext4 filesystem, an inode that is corrupt in just the right way will do it; the corruption can happen either on disk or on the fly in the path from the disk to you.
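If you want to check whether a particular local path is failing this way, here is a minimal sketch in Python. The helper name is_stale() is made up for illustration, and it assumes that a plain stat() of the path is enough to make the kernel read the inode and report the error.

```python
import errno
import os


def is_stale(path):
    """Return True if stat()ing path fails with ESTALE ('stale filehandle')."""
    try:
        os.stat(path)
    except OSError as e:
        # Any other errno (ENOENT, EACCES, ...) is a different problem,
        # so only ESTALE counts as a match here.
        return e.errno == errno.ESTALE
    return False
```

An ordinary healthy file, or a path that simply doesn't exist, returns False; only the ESTALE case returns True.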

You might wonder how a corrupt inode can result in a 'stale filehandle' error, and there lies a tale.

Suppose that some client has an NFS filehandle for a file (and thus an inode) that has since been deleted on the fileserver, and it tries to access that file. Obviously the NFS server needs to reject the access with an ESTALE result, which means that some part of the filesystem-specific code involved in turning an NFS filehandle into an inode needs to detect this and return some sort of error.

It turns out that the extN series of filesystems does this detection not in code specific to NFS but instead in its generic 'get an inode from disk' code (in ext3, ext3_iget() in fs/ext3/inode.c). In theory this error path can only be triggered through the NFS server, since there's no way to access a file by its inode number from user level code, and so ESTALE is a perfectly appropriate error to return in this situation.

However, if the inode for a non-deleted file becomes sufficiently corrupt (either on the disk or in flight as it's read from the disk), this generic code will think that the file is deleted and return an ESTALE error. Because it's generic code that's called for both local and remote accesses, this can result in 'stale filehandle' errors for a local file.

(I think that you can also get the same result if you have a directory get corrupted so that it still has entries for deleted files or has the wrong inode numbers for real files.)

Sidebar: the specifics

The situation changes slightly from ext2 to ext3 to ext4, but in all of them an inode with both a zero link count and a full inode mode of zero (which means that the inode has no information about what type of file it's for) will do it.
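To make the mechanism concrete, here is a toy model in Python of how the check sits in the shared code path. It is not the kernel's code: every name in it (looks_deleted(), iget(), nfs_lookup(), local_lookup(), and the dictionary standing in for the on-disk inode table) is made up for illustration, and it only models the one condition described above, while the real filesystems have additional cases that differ between ext2, ext3, and ext4.

```python
import errno
import os


def looks_deleted(link_count, mode):
    # The condition from the text: zero link count and a full mode of zero.
    return link_count == 0 and mode == 0


def iget(inode_table, ino):
    """Hypothetical stand-in for the generic 'get an inode from disk' code."""
    link_count, mode = inode_table[ino]
    if looks_deleted(link_count, mode):
        # The generic code can't tell who is asking, so everyone gets ESTALE.
        raise OSError(errno.ESTALE, os.strerror(errno.ESTALE))
    return (link_count, mode)


def nfs_lookup(inode_table, ino):
    # NFS server path: the filehandle leads more or less directly to an inode.
    return iget(inode_table, ino)


def local_lookup(inode_table, directory, name):
    # Local path: a directory entry supplies the inode number, and then
    # the very same generic code is called.
    return iget(inode_table, directory[name])
```

With an inode table where inode 13 has been zeroed out by corruption, such as {12: (1, 0o100644), 13: (0, 0)}, a local lookup of a name that points to inode 13 raises the same ESTALE error that an NFS client with an old filehandle would get.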

Written on 26 May 2011.
