The challenge of storing file attributes on disk

July 14, 2018

In pretty much every Unix filesystem and many non-Unix ones, files (and more generally all filesystem objects) have a collection of various basic attributes, things like modification time, permissions, ownership, and so on, as well as additional attributes that the filesystem uses for internal purposes (eg). This means that every filesystem needs to figure out how to store and represent these attributes on disk (and to a lesser extent in memory). This presents two problems, an immediate one and a long term one.

The immediate problem is that different types of filesystem objects have different attributes that make sense for them. A plain file definitely needs a (byte) length that is stored on disk, but that doesn't make any sense to store on disk for things like FIFOs, Unix domain sockets, and even block and character devices, and it's not clear if a (byte) length still make sense for directories either given that they're often complex data structures today. There's also attributes that some non-file objects need that files don't; a classical example in Unix is st_rdev, the device ID of special files.

(Block and character devices may have byte lengths in stat() results but that's a different thing entirely than storing a byte length for them on disk. You probably don't want to pay any attention to the on-disk 'length' for them, partly so that you don't have to worry about updating it to reflect what you'll return in stat(). Non-linear directories definitely have a space usage, but that's usually reported in blocks; a size in bytes doesn't necessarily make much sense unless it's just 'block count times block size'.)

The usual answer for this is to punt. The filesystem will define an on-disk structure (an 'inode') that contains all of the fields that are considered essential, especially for plain files, and that's pretty much it. Objects that don't use some of the basic attributes still pay the space cost for them, and extra attributes you might want either get smuggled somewhere or usually just aren't present. Would you like attributes for how many live entries and how many empty entry slots are in a directory? You don't get it, because it would be too much overhead to have the attributes there for everything.

The long term problem is dealing with the evolution of your attributes. You may think that they're perfect now (or at least that you can't do better given your constraints), but if your filesystem lives for long enough, that will change. Generally, either you'll want to add new attributes or you'll want to change existing ones (for example, widening a timestamp from 32 bits to 64 bits). More rarely you may discover that existing attributes make no sense any more or aren't as useful as you thought.

If you thought ahead, the usual answer for this is to include unused extra space in your on-disk attribute structure and then slowly start using it for new attributes or extending existing ones. This works, at least for a while, but it has various drawbacks, including that because you only have limited space you'll have long arguments about what attributes are important enough to claim some of it. On the other hand, perhaps you should have long arguments over permanent changes to the data stored in the on-disk format and face strong pressures to do it only when you really have to.

As an obvious note, the reason that people turn to a fixed on-disk 'inode' structure is that it's generally the most space-efficient option and you're going to have a lot of these things sitting around. In most filesystems, most of them will be for regular files (which will outnumber directories and other things), and so there is a natural pressure to prioritize what regular files need at the possible expense of other things. It's not the only option, though; people have tried a lot of things.

I've talked about the on-disk format for filesystems here, but you face similar issues in any archive format (tar, ZIP, etc). Almost all of them have file 'attributes' or metadata beyond the name and the in-archive size, and they have to store and represent this somehow. Often archive formats face both issues; different types of things in the archive want different attributes, and what attributes and other metadata needs to be stored changes over time. There have been some creative solutions over the years.

Written on 14 July 2018.
« ZFS on Linux's sharenfs problem (of what you can and can't put in it)
Understanding ZFS System Attributes »

Page tools: View Source.
Search:
Login: Password:

Last modified: Sat Jul 14 00:28:48 2018
This dinky wiki is brought to you by the Insane Hackers Guild, Python sub-branch.