What makes a 'next generation' or 'advanced' modern filesystem, for me

January 4, 2015

Filesystems have been evolving in fits and starts for roughly as long as there have been filesystems, and I doubt that is going to stop any time soon. These days there are a number of directions that filesystems seem to be moving in, but I've come around to the view that one of them is of particular importance and is the defining characteristic of what I wind up calling 'modern', 'advanced', or 'next generation' filesystems.

By now, current filesystems have mostly solved the twin problems of performance and resilience in the face of crashes (although performance may need some re-solving in the face of SSDs, which change various calculations). Future filesystems will likely make incremental improvements, but I can't currently imagine anything drastically different.

Instead, the next generation frontier is resilience to disk problems and improved recovery from them. At the heart of this are two things. First, a steadily increasing awareness that when you write something to disk (either HD or SSD), you are not absolutely guaranteed to either get it back intact or get an error. Oh, the disk drive and everything involved will try hard, but there are a lot of things that can go wrong, especially over long periods of time. Second, the rate at which these problems happen has not really been going down over time. Instead it has actually been going up, because the most common failure models are expressed as a chance of error per so much data read or written, and the amount of data we store and use has kept going up and up.
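To put rough numbers on that 'error per so much data' model: consumer hard drives are often specced at one unrecoverable read error (URE) per 10^14 bits read. That figure and the 4 TB drive size below are my own illustrative assumptions, not something from a particular vendor, and the independence assumption is a simplification:

```python
# Assumed spec: one unrecoverable read error per 1e14 bits read.
URE_RATE_BITS = 1e14

# How much data you expect to read before hitting one URE.
bytes_per_ure = URE_RATE_BITS / 8      # bits -> bytes
tb_per_ure = bytes_per_ure / 1e12     # bytes -> TB
print(f"~{tb_per_ure:.1f} TB read per expected URE")  # ~12.5 TB

# Chance of at least one URE when reading a full (hypothetical)
# 4 TB drive end to end, assuming independent per-bit errors.
bits_read = 4e12 * 8
p_clean = (1 - 1 / URE_RATE_BITS) ** bits_read
print(f"chance of at least one URE: {1 - p_clean:.0%}")  # about 27%
```

The point of the arithmetic is that the per-bit error rate has stayed roughly flat while drive sizes grew, so the chance of seeing at least one quiet error per drive keeps climbing.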

The pragmatic result is that an increasing number of people are starting to worry about quiet data loss, feel that the possibility of it goes up over time, and want to have some way to deal with it and fix things. It doesn't help that we're collectively storing more and more important things on disks (hopefully with backups, yes yes) instead of other media.

The dominant way of meeting this need right now is checksumming everything on disk, in filesystems that are also aware of what's really happening at the volume management layer. The former creates resilience (at the least you can notice that something has gone wrong) and the latter aids recovery (since disk redundancy is one source of intact copies of the corrupted data, and a good idea anyway since whole disks can die).
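The detect-then-recover combination can be sketched in a few lines. Everything here is a made-up illustration, not any real filesystem's code: the in-memory list of copies stands in for the redundant copies (mirror halves, RAID reconstructions) that a filesystem doing its own volume management can reach, and SHA-256 is just one checksum choice:

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Checksumming filesystems store a checksum for every block,
    # kept separately from the block itself.
    return hashlib.sha256(data).digest()

def read_block(copies, stored_sum):
    # Return the first redundant copy whose checksum matches the
    # stored one; only if every copy fails do we report corruption.
    for data in copies:
        if checksum(data) == stored_sum:
            return data
    raise IOError("all copies failed checksum: unrecoverable corruption")

good = b"important data"
bad = b"important dat\x00"   # a silently corrupted copy
s = checksum(good)
# The checksum catches the bad copy and the mirror supplies a good one.
assert read_block([bad, good], s) == good
```

Without the checksum, the corrupted copy would be returned silently; without the redundancy, detection would be all you get. That is why the two features tend to arrive together.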

(In this entry I'm talking only about local filesystems. There is a whole different evolutionary process going on in multi-node filesystems and multi-node object stores (which may or may not have a more or less POSIX filesystem layer on top of them). And I'm not even going to think about the various sorts of distributed databases that hold increasingly large amounts of data for large operations.)

PS: Part of my bias here is that resilience is what I've come to personally care about. One reason for this is that other filesystem attributes are pragmatically good enough, with no obvious inefficiencies left or marvelous improvements in sight (except for performance through SSDs). Another reason is that storage is now big enough and cheap enough that it's perfectly reasonable to store extra data (sometimes a lot of extra data, eg disk mirrors) to help ensure that you can get your files back later.

Last modified: Sun Jan 4 02:35:15 2015