<p>Chris's Wiki :: blog/solaris/ZFSPermanentErrorsMeaning Comments (https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSPermanentErrorsMeaning). By nireq on /blog/solaris/ZFSPermanentErrorsMeaning, 2022-01-29T20:30:32Z.</p>
<div class="wikitext"><p>Copying original post:</p>
<blockquote><p>On Jun 5, 2018, at 2:40 AM, Chris Boot via zfs-discuss <zfs-discuss at list.zfsonlinux.org> wrote:</p>
<p>Hi all,</p>
<p>I've recently started experimenting with ZFS on Linux and because
storage is my penchant / fetish I am of course doing complicated things
with it.</p>
<p>The issue I've run into now manifests itself like this:</p>
</blockquote>
<p>>>>> --- 8< --- <<<</p>
<blockquote><pre>
# zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Jun 4 21:10:08 2018
        1.01T scanned out of 53.8T at 24.2M/s, 635h51m to go
        0B repaired, 1.88% done
config:

        NAME                              STATE     READ WRITE CKSUM
        tank                              ONLINE       0     0     0
          raidz2-0                        ONLINE       0     0     0
            wwn-0x5000cca2526ac904        ONLINE       0     0     0
            wwn-0x5000cca2526af1c4        ONLINE       0     0     0
            wwn-0x5000cca2526b193c        ONLINE       0     0     0
            wwn-0x5000cca2526b1c70        ONLINE       0     0     0
            wwn-0x5000cca2568e01c8        ONLINE       0     0     0
            wwn-0x5000cca2515682a4        ONLINE       0     0     0
            wwn-0x5000cca251824748        ONLINE       0     0     0
            wwn-0x5000cca25189bd6c        ONLINE       0     0     0
            wwn-0x5000cca251ac3978        ONLINE       0     0     0
            wwn-0x5000cca251b09264        ONLINE       0     0     0
            wwn-0x5000cca25253e154        ONLINE       0     0     0
            wwn-0x5000cca25253e19c        ONLINE       0     0     0
        logs
          mirror-1                        ONLINE       0     0     0
            wwn-0x5000cca04d4d7cbc-part1  ONLINE       0     0     0
            wwn-0x5000cca04dac1e68-part1  ONLINE       0     0     0
        cache
          wwn-0x5000cca04d4d7cbc-part2    ONLINE       0     0     0
          wwn-0x5000cca04dac1e68-part2    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0x95>:<0x0>
</pre>
</blockquote>
<p>>>>> --- 8< --- <<<</p>
<blockquote>
<p>The slow speed of the scrub is mostly because I'm running a <em>lot</em> of IO
on this filesystem so that's not an immediate concern. The raidz2 disks
are 12x HUH721008AL5200 and the log/cache disks are 2x HUSMM1620ASS201 -
yes it's a big filesystem on fairly decent hardware.</p>
<p>Is there anything I can do to fix or clear this error from the
filesystem? How can I tell more about what the entity described by
"<0x95>:<0x0>" is?</p>
</blockquote>
<p>The first number is the dataset id (index) and the second is the object id.
For filesystems, the object id can be the same as the file's "inode" as shown
by "ls -i". But a few object ids exist for all datasets. Object id 0 is the DMU dnode.</p>
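<p>Since "zpool status" prints both numbers in hexadecimal, while zdb listings normally show ids in decimal, it can help to convert the tuple first. The following is only a minimal sketch: the function name <code>parse_error_tuple</code> is made up for illustration and is not part of any ZFS tool.</p>

```python
import re

# Matches the "<0xDATASET>:<0xOBJECT>" form that "zpool status -v" prints
# when it does not show a file path for an error.
_ERR_RE = re.compile(r"<0x([0-9a-fA-F]+)>:<0x([0-9a-fA-F]+)>")


def parse_error_tuple(text):
    """Return (dataset_id, object_id) as integers, or None if no match.

    Hypothetical helper for illustration only.
    """
    m = _ERR_RE.search(text)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


dataset_id, object_id = parse_error_tuple("<0x95>:<0x0>")
print(dataset_id, object_id)  # 149 0
```

<p>For the error in the question, that gives dataset id 149 and object id 0, i.e. the DMU dnode of whatever dataset has id 149.</p>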
<p>To see dataset ids and object ids, you can use zdb thusly:</p>
<pre>
$ zdb -dd zwimming
...
Dataset zwimming/vol1 [ZVOL], ID 260, cr_txg 59076, 12K, 2 objects

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         0    6   128K    16K    11K     512    16K    6.25  DMU dnode
         1    1   128K     8K      0     512     8K    0.00  zvol object
         2    1   128K    512      0     512    512  100.00  zvol prop
...
Dataset zwimming [ZPL], ID 54, cr_txg 1, 3.22M, 46 objects

    ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         0    6   128K    16K    21K     512    96K   23.96  DMU dnode
        -1    1   128K    512      0     512    512  100.00  ZFS user/group/project used
        -2    1   128K    512      0     512    512  100.00  ZFS user/group/project used
        -3    1   128K    512      0     512    512  100.00  ZFS user/group/project used
         1    1   128K    512     1K     512    512  100.00  ZFS master node
         2    1   128K    512      0     512    512  100.00  ZFS directory
         3    1   128K     1K     1K     512     1K  100.00  ZFS plain file
         4    1   128K  1.50K  1.50K     512  1.50K  100.00  ZFS plain file
</pre>
<p>In this example, there is a volume (dataset id = 260) and a ZPL filesystem (dataset id = 54).
The rows show the type of each object id.</p>
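<p>To find which dataset a given id belongs to, one option is to scan the "Dataset ..." header lines of saved zdb output. This is a rough sketch that assumes the header format shown in the example and dataset names without spaces; <code>dataset_ids</code> is a hypothetical helper, not a zdb feature.</p>

```python
import re

# Header lines look like:
#   Dataset zwimming/vol1 [ZVOL], ID 260, cr_txg 59076, 12K, 2 objects
# Assumption: the dataset name contains no whitespace.
_DATASET_RE = re.compile(r"^Dataset (\S+) \[(\w+)\], ID (\d+),")


def dataset_ids(zdb_output):
    """Map dataset id -> (name, type) from 'Dataset ...' header lines."""
    found = {}
    for line in zdb_output.splitlines():
        m = _DATASET_RE.match(line.strip())
        if m:
            found[int(m.group(3))] = (m.group(1), m.group(2))
    return found


# Header lines taken from the zdb example in this post.
sample = """\
Dataset zwimming/vol1 [ZVOL], ID 260, cr_txg 59076, 12K, 2 objects
Dataset zwimming [ZPL], ID 54, cr_txg 1, 3.22M, 46 objects
"""

ids = dataset_ids(sample)
print(ids[260])       # ('zwimming/vol1', 'ZVOL')
print(ids.get(0x95))  # None: this example pool has no dataset with id 0x95
```

<p>A lookup that returns nothing is not necessarily a bug in the search, for the reason given in the next paragraph.</p>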
<p>Finally, the error buffer for "zpool status" contains information for two scan passes:
the current and previous scans. So it is possible to delete an object (e.g. a file) and still
see it listed in the error buffer. It takes two scans to completely update the error buffer.
This is important if you go looking for a dataset+object tuple with zdb and don't find
anything...</p>
<pre>
-- richard
</pre>
</div>