Why filesystems need to be where data is checksummed

January 9, 2015

Allegedly (and I say this because I have not looked for primary sources), the developers of some existing Linux filesystems are adding metadata checksums and then excusing their lack of data checksums by saying that applications that care about data integrity will do the checksumming themselves. Having metadata checksums is better than having nothing, and adding data checksums to an existing filesystem is likely difficult, but this does not excuse that view of who should do what with checksums.

There are at least two reasons why filesystems should do data checksums. The first is that data checksums exist not merely to tell applications (and ultimately the user) when data becomes corrupt, but also to do extremely important things like telling which side of a RAID mirror is the correct side. Applications definitely do not have access to low-level details of things like RAID data, but the filesystem is at least in the right general area to be asking the RAID system 'do you happen to have any other copies of this logical block?' or the like.
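To make the mirror case concrete, here is a minimal sketch in Python of what a checksumming filesystem can do on a read. Everything in it is an assumption made for illustration: the names (`checksum`, `read_mirrored_block`), the use of SHA-256, and the modelling of the mirror as a plain list of byte strings. No real filesystem or RAID layer exposes this interface.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Hypothetical block checksum; SHA-256 is an arbitrary choice here."""
    return hashlib.sha256(data).digest()


def read_mirrored_block(copies: list, expected: bytes):
    """Return (good_data, repaired_indexes) for one logical block.

    `copies` stands in for the same block as read from each side of a
    mirror, and `expected` is the checksum the filesystem stored when
    the block was written. Both are illustrative assumptions.
    """
    good = None
    for data in copies:
        # The stored checksum is what lets us say which side is correct.
        if checksum(data) == expected:
            good = data
            break
    if good is None:
        raise IOError("every mirror copy fails its checksum")

    repaired = []
    for i, data in enumerate(copies):
        if data != good:
            # Rewrite the bad side from the good one.
            copies[i] = good
            repaired.append(i)
    return good, repaired
```

Without the stored checksum, the two sides of the mirror simply disagree and there is no way to tell which one to believe. An application that checksums its own data could notice that what it read was bad, but it has no way to ask for the other copy.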

The second reason is that a great many programs would never be rewritten to verify checksums. Not only would this require a massive amount of coding, it would also require a central standard so that applications could interoperate in generating, finding, and checking these checksums. On Unix, for example, this would need support not just from applications like Firefox, OpenOffice, and Apache but also from common programs like grep, awk, perl, and gcc. The net result would be that a great deal of file IO on Unix would not be protected by checksums.

(Let's skip lightly over any desire to verify that executables and shared libraries are intact before you start executing code from them, because you just can't do that without the kernel being very closely involved.)

When you are looking at a core service that should touch absolutely everything that does some common set of operations, the right place to put this service is in a central place so that it's implemented once and then used by everyone. The central place here is the kernel (where all IO passes through one spot), which in practice means in the filesystem.

(Perhaps this is already obvious to everyone; I'd certainly like to think that it is. But if there are filesystem developers out there who are seriously saying that data checksums are the job of applications instead of the filesystem, well, I don't know what to say. Note that I consider 'sorry, we can't feasibly add data checksums to our existing filesystem' to be a perfectly good reason for not doing so.)

