Why Unix needs a standard way to deal with the file durability problem

April 17, 2016

One of the reactions to my entry on Unix's file durability problem is the obvious pragmatic one. To wit, that this isn't really a big problem because you can just look up what you need to do in practice and do it (possibly with some debate over whether you still need to fsync() the containing directory to make new files truly durable or whether that's just superstition by now). I don't disagree with this pragmatic answer and it's certainly what you need to do today, but I think sticking to it misses why Unix as a whole should have some sort of agreed-on standard for this.
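
For concreteness, here is a minimal C sketch of the dance as it's commonly described today: write the data, fsync() the file itself, and then fsync() the directory that holds the new name. The 'appdir/newfile' path is just an invented example and error handling is abbreviated.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *data = "important bytes\n";

        /* Step 1: create the new file and write the data. */
        int fd = open("appdir/newfile", O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0 || write(fd, data, strlen(data)) < 0) {
            perror("write newfile");
            return 1;
        }

        /* Step 2: fsync() the file so its contents reach stable storage. */
        if (fsync(fd) < 0) {
            perror("fsync newfile");
            return 1;
        }
        close(fd);

        /* Step 3 (the debated one): fsync() the containing directory so
           the new directory entry itself is durable, not just the file's
           data. */
        int dirfd = open("appdir", O_RDONLY | O_DIRECTORY);
        if (dirfd < 0 || fsync(dirfd) < 0) {
            perror("fsync appdir");
            return 1;
        }
        close(dirfd);

        return 0;
    }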

An agreed-on standard would help both programmers and kernel developers. On the side of user-level programmers, it tells us not just what we need to do to achieve file durability today but also what we need to do to future-proof our code. A standard amounts to a promise that no sane future Unix setup will add an additional requirement for file durability. If our code is working right today on Solaris UFS or Linux ext2, it will keep working right tomorrow on Linux ext4 or Solaris ZFS. Without a standard we can't be sure about this, and in fact some programs have been burned by it in the past, when new filesystems added extra requirements like fsync()'ing directories under some circumstances.

(This doesn't mean that all future Unix setups will abide by this, of course. It just means that we can say 'your system is clearly broken, this is your problem and not a fault in our code, fix your system setup'. After all, even today people can completely disable file durability through configuration choices.)

On the side of kernel people and filesystem developers, it tells both parties how far a sensible filesystem can go; it becomes a 'this far and no further' marker for filesystem write optimization. Filesystem developers can reject proposed features that break the standard simply because they break the standard, and if they don't, the overall kernel developers can. Filesystem development can entirely avoid both a race to the bottom and strained attempts to read the POSIX specifications so as to allow ever faster but more dangerous behavior (and also the ensuing arguments over just how one group of FS developers read POSIX).

The whole situation is exacerbated because POSIX and other standards have relatively little to say on this. The people who create hyper-aggressive C optimizers are at least relying on a detailed and legalistically written C standard (even if almost no programs are fully conformant to it in practice), and so they can point users to chapter and verse on why their code is not standards conforming and so can be broken by the compiler. The filesystem people are not so much on shakier ground as on fuzzy ground, which results in much more confusion, disagreement, and arguing. It also makes it very hard for user-level programmers to predict what future filesystems might require here, since they have so little to go on.


Comments on this page:

By Anon at 2016-04-17 11:10:22:

Whether you need to sync on the directory depends on the filesystem: http://danluu.com/file-consistency/#filesystem-semantics.

To an extent it's too late - we've already got lots of filesystems which require the full fsync dance and you never know when you'll encounter one. Even if a new set of calls turns up, all it takes is for you to access something over a network file share and you might have to fall back to doing the dance to keep your data safe.

It's worth noting you also can't make things weaker than what the dance requires and still be POSIX compliant. However, a lot of the new distributed filesystem-like things popping up aren't trying for POSIX compliance, so there's a wrinkle there...

Assuming POSIX and a disk which plays ball, it is possible to test whether you got things right by analysing the syscalls being made relative to what you need to happen, but it's not easy (see the ALICE discussion in https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf).

The LWN comments discussing the Dan Luu article cover some of your wishlist items: https://lwn.net/Articles/667788/.

I would say my original point is that folks should use an up-to-date SQLite if they want someone else to take care of the problem today (as opposed to saying "people can just look it up")...
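
(For illustration, here is a minimal C sketch of that approach, with SQLite left to worry about fsync() and friends; the database name and table are invented for the example.)

    #include <stdio.h>
    #include <sqlite3.h>

    int main(void)
    {
        sqlite3 *db;
        char *err = NULL;

        if (sqlite3_open("app-state.db", &db) != SQLITE_OK) {
            fprintf(stderr, "open: %s\n", sqlite3_errmsg(db));
            return 1;
        }

        /* Each committed transaction is made durable by SQLite itself
           (with the default synchronous=FULL setting), so the caller
           never has to think about fsync() or directory syncing. */
        const char *sql =
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT);"
            "BEGIN;"
            "INSERT OR REPLACE INTO kv VALUES ('last-run', datetime('now'));"
            "COMMIT;";

        if (sqlite3_exec(db, sql, NULL, NULL, &err) != SQLITE_OK) {
            fprintf(stderr, "exec: %s\n", err);
            sqlite3_free(err);
        }

        sqlite3_close(db);
        return 0;
    }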
