Why Unix needs a standard way to deal with the file durability problem

April 17, 2016

One of the reactions to my entry on Unix's file durability problem is the obvious pragmatic one: this isn't really a big problem, because you can just look up what you need to do in practice and do it (possibly with some debate over whether you still need to fsync() the containing directory to make new files truly durable or whether that's just superstition by now). I don't disagree with this pragmatic answer and it's certainly what you need to do today, but I think that stopping there misses why Unix as a whole should have some sort of agreed-on standard for this.
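For concreteness, here is a minimal sketch of the conservative version of that recipe, including the debated directory fsync(). The function name durable_write and the '.tmp-' prefix are made up for illustration, and this is what folklore suggests today on Linux-like systems rather than anything a standard promises.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data` (bytes), trying hard to make it durable.

    Hypothetical helper: a conservative sketch, not a guarantee.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename stays within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # push Python's buffer to the kernel
            os.fsync(f.fileno())   # ask the kernel to flush the file data
        os.replace(tmp_path, path)  # atomically move the new file into place
    except BaseException:
        # Remove the temporary file if we failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # The debated step: fsync() the containing directory so that the
    # new directory entry itself is flushed, not just the file contents.
    dir_fd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether the last step is required, harmless, or pure superstition depends on the filesystem and its configuration, which is exactly the sort of thing you currently have to look up rather than rely on.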

An agreed-on standard would help both programmers and kernel developers. On the side of user level programmers, it tells us not just what we need to do in order to achieve file durability today but also what we need to do in order to future-proof our code. A standard amounts to a promise that no sane future Unix setup will add an additional requirement for file durability. If our code is working right today on Solaris UFS or Linux ext2, it will keep working right tomorrow on Linux ext4 or Solaris ZFS. Without a standard we can't be sure about this, and in fact some programs have been burned by exactly this in the past, when new filesystems added extra requirements like fsync()'ing directories under some circumstances.

(This doesn't mean that all future Unix setups will abide by this, of course. It just means that we can say 'your system is clearly broken, this is your problem and not a fault in our code, fix your system setup'. After all, even today people can completely disable file durability through configuration choices.)

On the side of kernel people and filesystem developers, it tells both parties how far a sensible filesystem can go; it becomes a 'this far and no further' marker for filesystem write optimization. Filesystem developers can reject proposed features that break the standard simply by saying 'it breaks the standard', and if they don't, the overall kernel developers can. Filesystem development can entirely avoid both a race to the bottom and strained attempts to read the POSIX specifications so as to allow ever faster but more dangerous behavior (and also the ensuing arguments over just how one group of FS developers read POSIX).

The whole situation is exacerbated because POSIX and other standards have relatively little to say on this. The people who create hyper-aggressive C optimizers are at least relying on a detailed and legalistically written C standard (even if almost no programs are fully conformant to it in practice), and so they can point users to chapter and verse on why their code is not standards conforming and so can be broken by the compiler. The filesystem people are not so much on shakier ground as on fuzzy ground, which results in much more confusion, disagreement, and arguing. It also makes it very hard for user level programmers to predict what future filesystems might require here, since they have so little to go on.

Written on 17 April 2016.