A brief history of fsck

May 2, 2010

In response to my entry on turning off automatic ext3 fscks, Matty wrote an entry where he asked:

[...] On Unix based systems (and even in Windows), fsck (or chkdisk) only runs when the kernel notices that a file system is in some sort of inconsistent state. So then I ask, why did the Linux community decide to run fsck on file systems in consistent state?

This is a good prompt for a brief history of fsck, because the situation is more complicated than this.

In the early days of Unix, there was no fsck; instead there were separate programs that each checked one aspect of filesystem consistency. They were not run automatically; you ran them by hand when you thought you might have problems. Fsck itself first appeared in an addendum to V7 Unix, merging most of the functionality of the existing checking programs together. Even then, it was a purely interactive program with no provisions for automatic, boot-time operation.

Running fsck automatically on boot first appeared in 4.0 BSD, which also added the -p ('preen') option to run fsck silently and on several disks at once. However, neither 4.0 BSD nor any previous system had any notion of 'clean' or 'consistent' filesystems, so fsck checked all filesystems at boot, on every boot.

As Unix systems got bigger this was increasingly undesirable, since checking all filesystems on boot kept taking longer and longer. The first solution (introduced almost immediately by 4.1c BSD, if not earlier) was to add a 'fastboot' mode where the system startup scripts skipped running fsck if a special flag file was present. Fsck itself would still check all filesystems no matter what if it actually got run.

Later, various Unixes adopted a more general approach: they changed the kernel to explicitly mark filesystems as cleanly unmounted and then modified fsck -p to skip such filesystems by default. Now all normal reboots would be fast, but if the system crashed you would still fsck every filesystem that had been mounted at the time. The final version of this was journaling filesystems and explicit 'damaged filesystem' markers in filesystem superblocks that are set when the kernel detects filesystem problems; then even after an unclean shutdown the system merely replays journals and fsck only checks things that have been explicitly marked as damaged.
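
(To make the end state concrete, here is a minimal sketch of the boot-time decision in Python. The Superblock fields and the boot_action name are made up for illustration; real superblock layouts and fsck logic differ from filesystem to filesystem.)

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    # Hypothetical, simplified fields; not any real on-disk layout.
    cleanly_unmounted: bool  # set by the kernel on a clean unmount
    marked_damaged: bool     # set by the kernel when it detects a problem
    has_journal: bool        # whether the filesystem keeps a journal


def boot_action(sb: Superblock) -> str:
    """Return what a boot-time check would do under the scheme described above."""
    if sb.marked_damaged:
        # An explicit damage marker always forces a full check.
        return "fsck"
    if sb.cleanly_unmounted:
        # Clean filesystems are skipped, so normal reboots are fast.
        return "skip"
    if sb.has_journal:
        # Unclean shutdown, but the journal can be replayed instead.
        return "replay-journal"
    # Unclean shutdown with no journal: fall back to a full check.
    return "fsck"
```

(Note that nothing in this scheme ever checks a filesystem that merely looks healthy.)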

But if you take the old-time Unix perspective, this final state is a little bit unnerving; there is now no routine periodic check of your entire filesystem structure, and there are some errors that only fsck can find (such as data blocks that are claimed by more than one file). So the Linux filesystem people added support for periodic fscks of even healthy ext2 and ext3 filesystems, just in case. That way, even otherwise undetectable corruption would get discovered sooner or later.
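
(As a rough sketch of what 'periodic' means here: ext2 and ext3 track how many times a filesystem has been mounted and when it was last checked, and force a check once either limit is exceeded. The function and parameter names below are illustrative assumptions, as is treating a zero or negative limit as 'disabled'; this is not the actual e2fsck code.)

```python
def periodic_check_due(mount_count: int, max_mount_count: int,
                       last_check: int, check_interval: int, now: int) -> bool:
    """Decide whether a healthy filesystem is due for a periodic fsck.

    Times are in seconds since the epoch. A limit of zero or less is
    treated as disabled (an assumption made for this sketch).
    """
    if max_mount_count > 0 and mount_count >= max_mount_count:
        # Mounted too many times since the last full check.
        return True
    if check_interval > 0 and now - last_check >= check_interval:
        # Too much time has passed since the last full check.
        return True
    return False
```

(With both limits disabled this always returns False, which is what turning off automatic fscks amounts to.)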

(Yes, corruption isn't supposed to happen. It can anyway, and for several sorts of reasons.)
