Chris's Wiki :: blog/tech/FilesystemChecksumEffects Comments
https://utcc.utoronto.ca/~cks/space/blog/tech/FilesystemChecksumEffects?atomcomments
Recent comments in Chris's Wiki :: blog/tech/FilesystemChecksumEffects.

From 72.200.82.252 (2013-03-31T16:45:07Z):

Two possible workarounds that might be developable:
1. A ddrescue-like tool that lets you read the file while telling you where the blocks that failed their checksums are (see the sketch after this list).
2. A version-compare tool where, if a previous version of the file exists at the same path in ZFS (for example in a snapshot), it lets you get at both and diff them, although copy-on-write block sharing would probably break this unless the unreadable block was part of a recent modification.
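
The first idea is concrete enough to sketch. What follows is a minimal, hypothetical illustration in Python of such a ddrescue-like scan, assuming all you want is a list of the byte ranges that fail to read: it reads the file in fixed-size chunks and records the ranges where the read errors out (ZFS returns EIO for a block it cannot verify and cannot repair). The chunk size and the overall shape of the tool are assumptions for the example, not anything ZFS itself provides.

```python
#!/usr/bin/env python3
"""Sketch of a ddrescue-like scan: read a file in fixed-size chunks and
report the byte ranges where the read fails (for example with EIO, which
is what ZFS returns for a block with no good copy).  BLOCK is an assumed
scan granularity; a real tool would want the dataset's actual recordsize."""

import os
import sys

BLOCK = 128 * 1024  # assumption; the default ZFS recordsize happens to be 128K

def scan(path):
    size = os.stat(path).st_size
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            want = min(BLOCK, size - offset)
            try:
                os.pread(fd, want, offset)
            except OSError:
                # This chunk could not be read; remember its byte range.
                bad.append((offset, offset + want))
            offset += want
    finally:
        os.close(fd)
    return bad

if __name__ == "__main__":
    for start, end in scan(sys.argv[1]):
        print(f"unreadable: bytes {start}-{end}")
```

The same loop could fall back to an older copy of the file (for instance one visible under .zfs/snapshot/) for any range it cannot read, which shades into the second idea.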

From 84.112.126.145 (2013-03-28T19:20:18Z):

There is zdb.

By Chris Siebenmann (2013-03-28T05:05:42Z):

I'm looking at the situation when all redundancy and other measures are
lost (all copies of the data are corrupt, perhaps because they were
corrupted during the initial write, perhaps for other reasons; this has
been known to happen). Given the limited IO interfaces we have, failing
on checksum errors in the normal case is reasonable but, as I mentioned,
I think there should be a way around this. That way you could have both
integrity and availability (and make whatever tradeoffs you need). Some
people would restore from backups. Some people would carefully piece
together what they could.
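
As a hedged illustration of that last option, here is roughly what carefully piecing together what you can might look like, assuming you have both the damaged file and a restored backup copy of it; every path and the chunk size below are made up for the example. The idea is simply to keep every chunk that still reads and fill the unreadable ranges from the backup.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: build a best-effort copy of a damaged file by
keeping every chunk that still reads and filling unreadable chunks from
a restored backup.  The paths and chunk size are example values."""

import os

BLOCK = 128 * 1024  # assumed chunk size

def piece_together(damaged, backup, out):
    size = os.stat(damaged).st_size
    fd = os.open(damaged, os.O_RDONLY)
    try:
        with open(backup, "rb") as bak, open(out, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(BLOCK, size - offset)
                try:
                    chunk = os.pread(fd, want, offset)
                except OSError:
                    # Unreadable in the damaged copy: take this range
                    # from the backup instead.
                    bak.seek(offset)
                    chunk = bak.read(want)
                dst.write(chunk)
                offset += want
    finally:
        os.close(fd)

piece_together("/tank/data/report.db", "/restore/report.db", "/tmp/report.pieced")
```

Whether the result is usable depends entirely on the file format, which is exactly the sort of tradeoff being argued for here.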

From 71.80.128.33 (2013-03-28T03:58:28Z):

I don't even know what you're talking about. Checksumming is almost always combined with some sort of redundancy to enable recovering an undamaged copy. With ZFS, it just transparently recovers the file data from the other disks in the raidz set. With optical media, there are error-correcting codes, and beyond that a reader can just keep re-reading until it gets some different data (or the user cleans the disc or swaps to another drive).
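
That "keep re-reading" tactic is easy to make concrete. Below is a small, hypothetical sketch of it: read the same range several times and keep whichever bytes come back most often. The device path, offset, length, and retry count are all made-up example values, and real recovery tools are considerably more careful than this.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of the 're-read until it differs' tactic: read
the same byte range repeatedly and keep the most common result.  The
device path, offset, length, and attempt count are example values."""

from collections import Counter
import os

def stubborn_read(path, offset, length, attempts=8):
    results = Counter()
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(attempts):
            try:
                results[os.pread(fd, length, offset)] += 1
            except OSError:
                continue  # this attempt failed outright; try again
    finally:
        os.close(fd)
    return results.most_common(1)[0][0] if results else None

data = stubborn_read("/dev/sr0", 1024 * 2048, 2048)  # one 2 KiB sector
```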

And yes, no data IS better than bad data. A file system doesn't really have any guaranteed way to pass messages to the upper layers of an operating system so the user is sure to see them... If your file isn't missing entirely, you're sure to assume it's full and complete (and then assume it's safe to delete your old backups of said file), no matter how many lines of noise and warnings are being spewed out to the log files.

From 65.111.70.130 (2013-03-28T03:35:57Z):

What about ZFS's multiple-copies feature, and auto-scrubbing if the checksum fails?

Wouldn't that be a good trade-off?

--
Goozbach