The limits of what ZFS scrubs check
In the ZFS community, there is a widespread view that ZFS scrubs
are the equivalent of fsck for ordinary filesystems and so check
for and find at least as many error conditions as fsck does.
Unfortunately this view of ZFS scrubs is subtly misleading and can
lead you to expect them to do things that they simply don't.
The simple version of what a ZFS scrub does is that it verifies the checksum for every copy of every (active) block in the ZFS pool. It also explicitly verifies parity blocks for RAIDZ vdevs (which a normal error-free read does not). In the process of doing this verification, the scrub must walk the entire object tree of the pool from the top downwards, which has the side effect of more or less verifying this hierarchy; certainly if there's something like a directory entry that points to an invalid thing, you will get a checksum error somewhere in the process.
However, this is all that a ZFS scrub verifies. In particular, it
does not check the consistency and validity of metadata that isn't
necessary to walk the ZFS object tree. This includes things like
much of the inode data that is returned by stat() calls, and also
internal structural information that is not necessary to walk the
tree. Such information is simply tacitly assumed to be correct if
its checksum verifies.
What this means at a broad level is that while a ZFS scrub guards
against on-disk corruption of data that was correct when it was
written, it does not protect against internal corruption of data,
where the data is already wrong when it is written. If RAM errors
or ZFS bugs cause corrupt data to be written, a ZFS scrub will not
detect it even though it may be obvious in, for example, an ls -l.
This is not just a theoretical issue; it has been encountered on
multiple platforms.
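To make the distinction concrete, here is a toy model in Python. It is only a sketch and does not reflect ZFS's actual on-disk format or checksum algorithms; the write_block() and scrub_block() functions and the record layout are made up for illustration. The point it shows is that a checksum computed over data that is already wrong will verify perfectly well.

```python
import hashlib
import json

def write_block(record):
    """Toy 'write': serialize the record and checksum whatever we were given."""
    payload = json.dumps(record, sort_keys=True).encode()
    return {"payload": payload, "checksum": hashlib.sha256(payload).hexdigest()}

def scrub_block(block):
    """Toy 'scrub': only checks that the payload still matches its checksum."""
    return hashlib.sha256(block["payload"]).hexdigest() == block["checksum"]

# A correct record, written correctly.
good = write_block({"name": "report.txt", "mode": 0o100644})

# The mode was mangled in memory *before* the write, so the checksum
# faithfully covers the bad value.
bad_in_ram = write_block({"name": "report.txt", "mode": 0})

# The payload was damaged *after* the write (on-disk corruption), so it
# no longer matches the stored checksum.
rotted = dict(good, payload=good["payload"].replace(b"report", b"reporT"))

print(scrub_block(good), scrub_block(bad_in_ram), scrub_block(rotted))
```

In this model the scrub accepts both the good block and the block that was corrupted before it was written, and rejects only the block that changed after its checksum was computed.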
(I also believe that ZFS scrubs don't try to do full consistency checks on ZFS's tracking of free disk blocks. I'm not sure if they even try to check that all in-use blocks are actually marked that way.)
This means that a ZFS scrub does somewhat different checks than a
traditional fsck. Traditional fsck can't verify block integrity
except indirectly, unlike scrubs, but fsck does a lot of explicit
consistency checks of things like inode modes to make sure they're
sane, and it does verify that the filesystem's idea of free space
is correct.
It would be possible to make ZFS scrubs do additional checks, and this may happen at some point. But it is not the state of affairs today, so today you can have a ZFS pool with corruption that nevertheless passes ZFS scrubs with no errors. In extreme cases, you may wind up with a pool that panics the system. You can do a certain amount of verification yourself, for example by writing a program that walks the entire filesystem to verify that there are no inodes with crazy modes. And if you make your backups with a conventional system that works through the filesystem (instead of with ZFS snapshot replication), your backups will do a certain amount of verification themselves just by walking the filesystem and trying to read all of the files (sooner or later).
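A minimal sketch of such a filesystem-walking checker, in Python, might look like the following. It assumes that a "crazy mode" means file-type bits that don't correspond to any known file type; the function names and the set of accepted types are my own choices, and a real checker would probably want more tests (implausible timestamps, for example). It uses lstat() so that symbolic links are examined rather than followed.

```python
import os
import stat

# File types that stat() can normally report on a POSIX system. Some
# platforms define extra types (Solaris-derived ones have doors, for
# example); extend this set if your platform needs them.
KNOWN_TYPES = {
    stat.S_IFREG, stat.S_IFDIR, stat.S_IFLNK, stat.S_IFCHR,
    stat.S_IFBLK, stat.S_IFIFO, stat.S_IFSOCK,
}

def is_sane_mode(mode):
    """Return True if the file-type bits of mode name a known file type."""
    return stat.S_IFMT(mode) in KNOWN_TYPES

def find_suspect_inodes(root):
    """Walk root without following symlinks; return (path, reason) pairs."""
    suspects = []

    def check(path):
        try:
            st = os.lstat(path)
        except OSError as exc:
            suspects.append((path, "lstat failed: %s" % exc))
            return
        if not is_sane_mode(st.st_mode):
            suspects.append((path, "unknown file type in mode %o" % st.st_mode))

    def on_walk_error(exc):
        # os.walk() calls this when it cannot list a directory.
        suspects.append((exc.filename, "cannot list directory: %s" % exc))

    check(root)
    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            check(os.path.join(dirpath, name))
    return suspects
```

You would call find_suspect_inodes() on the mount point of each filesystem you care about and report anything it returns. An empty list only means that nothing tripped these particular checks, not that the filesystem is free of corruption.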