== The better way to clear SMART disk complaints, with safety provided by ZFS

A couple of months ago I wrote about [[clearing SMART complaints about one of my disks ClearingSMARTComplaints]] by very carefully overwriting sectors on it, and how ZFS made this kind of safe. In a comment, [[Christian Neukirchen http://chneukirchen.org/]] recommended using _hdparm --write-sector_ to overwrite sectors with read errors instead of the complicated dance with _dd_ that I used in [[my entry ClearingSMARTComplaints]]. As it happens, that disk coughed up a hairball of _smartd_ complaints today, so I got a chance to go through my procedures again and the advice is spot on. Using _hdparm_ makes things much simpler.

So my revised steps are:

.pn prewrap on

# Scrub my ZFS pool in the hopes that this will make the problem go away. It didn't, which means that any read errors in [[the partition for the ZFS pool ZFSOnLinuxDiskSetup]] are in space that ZFS shouldn't be using.

# Use _dd_ to read all of the ZFS partition. I did this with '_dd if=/dev/sdc7 of=/dev/null bs=512k conv=noerror iflag=direct_'. This hit several bad spots, each of which produced kernel errors that included a line like this:

  > blk_update_request: I/O error, dev sdc, sector 1748083315

# Use _hdparm --read-sector_ to verify that this is indeed the bad sector:

  > hdparm --read-sector 1748083315 /dev/sdc

  If this is the correct sector, _hdparm_ will report a read error and the kernel will log a failed SATA command. Note that this is not a normal disk read, as _hdparm_ is issuing a low-level read, so you don't get a normal message; instead you get something like this:

  > ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
  > ata3.00: irq_stat 0x40000001
  > ata3.00: failed command: READ SECTOR(S) EXT
  > ata3.00: cmd 24/00:01:73:a2:31/00:00:68:00:00/e0 tag 3 pio 512 in
  > res 51/40:00:73:a2:31/00:00:68:00:00/00 Emask 0x9 (media error)
  > [...]

  The important thing to notice here is that you don't get the sector reported (at least not in decoded form), so you have to rely on getting the sector number correct in the _hdparm_ command instead of being able to cross check it against earlier kernel logs.

  (Sector 1748083315 is 0x6831a273 in hex. All the bytes are there in the _cmd_ part of the message, but clearly shuffled around.)

# Use _hdparm --write-sector_ to overwrite the sector, forcing it to be spared out:

  > hdparm --write-sector 1748083315 /dev/sdc

  (If you use _--write-sector_ without the hidden magic option it requires, _hdparm_ will tell you what that option is.)

# Scrub my ZFS pool again and then re-run the _dd_ to make sure that I got all of the problems.

I was pretty sure I'd gotten everything even before the re-scrub and the re-_dd_ scan, because _smartd_ reported that there were no more currently unreadable (pending) sectors or offline uncorrectable sectors, both of which it had been complaining about before.

This was a lot easier and more straightforward to go through than my previous procedure, partly because I can directly reuse the sector numbers from the kernel error messages without problems and partly because _hdparm_ does exactly what I want.

There's probably a better way to scan the hard drive for read errors than _dd_. I'm a little bit nervous that my 512 KB block size could hide a second bad sector that's sufficiently close to the first, but especially with direct IO I think it's a tradeoff between speed and thoroughness. Possibly I should explore how well the _badblocks_ program works here, since it's the obvious candidate.

(These days I force _dd_ to use direct IO when talking to disks because that way _dd_ does much less damage to the machine's overall performance.)

(This is the kind of entry that I write because I just looked up [[my first entry ClearingSMARTComplaints]] for how to do it again, so clearly I'm pretty likely to wind up doing this a third time.
I could just replace the drive, but at this point I don't have enough drive bay slots in my work machine's case to do this easily. Also, I'm a peculiar combination of stubborn and lazy when it comes to hardware.)
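
As a footnote to steps 2 and 3, here is a rough Python sketch of pulling the sector numbers out of the kernel messages mechanically instead of by eye. The helper names (_failed_sectors_ and _lba_from_ata_cmd_) are made up for illustration. The field order it assumes for the _cmd_ line is my reading of how libata prints the ATA taskfile for 48-bit LBA commands; it does reproduce sector 1748083315 from the message quoted above, but treat it as an assumption rather than a documented format.

```python
import re

# Block-layer error lines, like the one dd provoked in step 2.
BLK_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")

# The taskfile dump in a libata "cmd" line. Assumed layout:
#   cmd CMD/FEAT:NSECT:LBAL:LBAM:LBAH/HFEAT:HNSECT:HLBAL:HLBAM:HLBAH/DEV
FIVE_BYTES = r"((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
ATA_CMD = re.compile(
    r"cmd ([0-9a-f]{2})/" + FIVE_BYTES + "/" + FIVE_BYTES + r"/([0-9a-f]{2})"
)


def failed_sectors(log_text):
    """Return unique (device, sector) pairs in the order they first appear."""
    seen = []
    for dev, sector in BLK_ERROR.findall(log_text):
        pair = (dev, int(sector))
        if pair not in seen:
            seen.append(pair)
    return seen


def lba_from_ata_cmd(line):
    """Reassemble the 48-bit LBA from a libata 'cmd' line (LBA48 commands only)."""
    m = ATA_CMD.search(line)
    if m is None:
        raise ValueError("no ATA cmd taskfile found in line")
    low = [int(b, 16) for b in m.group(2).split(":")]   # feat, nsect, lbal, lbam, lbah
    high = [int(b, 16) for b in m.group(3).split(":")]  # the "previous" (hob) copies
    # Most significant byte first: hob_lbah, hob_lbam, hob_lbal, lbah, lbam, lbal.
    lba = 0
    for byte in (high[4], high[3], high[2], low[4], low[3], low[2]):
        lba = (lba << 8) | byte
    return lba


if __name__ == "__main__":
    print(failed_sectors("blk_update_request: I/O error, dev sdc, sector 1748083315"))
    print(lba_from_ata_cmd(
        "ata3.00: cmd 24/00:01:73:a2:31/00:00:68:00:00/e0 tag 3 pio 512 in"
    ))
```

Run on the two messages from this entry, the first helper gives back the _sdc_ sector from the _dd_ error and the second turns the shuffled _cmd_ bytes back into the same number, 1748083315. This only restores the cross check for commands that carry a 48-bit LBA, like the READ SECTOR(S) EXT here.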