2022-03-10
It would be nice if Linux had a count of disk errors in sysfs
The Linux kernel knows when disk IO errors happen and it will tell you about them (sometimes at copious length; see 1, 2, 3). But it only does so in the kernel message log, in a format that can vary from disk type to disk type and that changes over time. This makes it pretty hard to notice when disk errors happen somewhere out there in your fleet and to keep track of them over time.
Given that the Linux kernel knows about these errors, it would be nice if it kept a per-device count of various sorts of disk errors and exposed them in sysfs (you'd want to separate out hard and soft errors, and to track read errors, write errors, and errors for other operations). In theory you can probably use an eBPF program to gather this information even without the kernel's cooperation, but doing it correctly is not trivial and needs to hook into more than just the error reporting routine(s).
(For example, disks can be hot-swapped at the hardware level. If 'sdc' is removed and a new disk is added that inherits the 'sdc' name, you don't want it to also inherit the old sdc's error count. The kernel can get this completely right, but an eBPF program almost certainly has some subtle event races even if it's monitoring all of the relevant points.)
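To make the bookkeeping side of the hot-swap problem concrete, here is a minimal sketch in Python of an error counter that is keyed by device name but resets when the disk behind the name changes. Everything here is hypothetical: the `ErrorTracker` class, the `identity` string (imagine a serial number or WWN obtained some other way), and the error kinds are all made up for illustration. It also does nothing about the event races themselves; it only shows what 'don't inherit the old count' means.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DeviceErrors:
    """Error counts for one physical disk (hypothetical structure)."""
    identity: str  # assumed stable per-disk ID, e.g. a serial number
    counts: Counter = field(default_factory=Counter)


class ErrorTracker:
    """Tracks per-name error counts, resetting them when the disk changes."""

    def __init__(self):
        self._by_name = {}

    def record(self, name, identity, kind):
        entry = self._by_name.get(name)
        # A new identity under an old name means the disk was swapped,
        # so the new disk must start from zero.
        if entry is None or entry.identity != identity:
            entry = DeviceErrors(identity)
            self._by_name[name] = entry
        entry.counts[kind] += 1

    def counts(self, name):
        entry = self._by_name.get(name)
        return dict(entry.counts) if entry else {}
```

The hard part that this sketch skips is knowing the right identity at the moment each error arrives, which is exactly where an outside observer can race with the kernel.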
Having generic disk support for these sysfs error counts would hopefully lead to it being extended to work on the various forms of software disks, such as software RAID and device mapper 'disks'. These don't necessarily experience a hard error just because some underlying device did (for example, a software RAID 1 mirror can retry the read from another device).
It's tempting to ask for a similar error count for filesystems, but I'm not sure if that's useful or meaningful. And in most situations you can correlate a device (possibly a virtual one) to a specific filesystem, so you know that if device 'md1' experienced a disk read error, your root filesystem did too.
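As a rough sketch of that correlation, the following Python parses text in the format of /proc/self/mounts and groups mount points by their source device. The function names are my own, and this makes simplifying assumptions: it only looks at sources that start with '/dev/', it doesn't resolve symlinks or device mapper aliases, and it doesn't decode escaped characters in mount points.

```python
def mounts_by_device(mounts_text):
    """Map each /dev/... source to a list of (mountpoint, fstype) pairs.

    mounts_text is expected to look like /proc/self/mounts: whitespace
    separated fields of source, mount point, filesystem type, options, etc.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a well-formed mount line
        source, mountpoint, fstype = fields[:3]
        if not source.startswith("/dev/"):
            continue  # skip proc, sysfs, tmpfs and other non-device mounts
        result.setdefault(source, []).append((mountpoint, fstype))
    return result


def load_mounts(path="/proc/self/mounts"):
    """Read and parse the current mount table (Linux only)."""
    with open(path) as f:
        return mounts_by_device(f.read())
```

With this, something like `load_mounts().get("/dev/md1")` would tell you which filesystems to worry about when md1 reports an error, assuming the filesystem was mounted under that name.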
(Some Linux filesystems do have some reporting of errors in sysfs; for example, ext4's sysfs entries include an error count and information about the first and most recent errors experienced. But it's not clear what sort of 'errors' these are about, and the information is apparently persisted in the filesystem superblock instead of being transient. You might want to monitor the error count, though.)
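If you do want to monitor the ext4 error count, a minimal sketch in Python might look like the following. It assumes that each mounted ext4 filesystem has a directory under /sys/fs/ext4 containing an 'errors_count' file, which is my understanding of the current layout; the function name is made up, and directories without a readable count (such as the 'features' directory) are simply skipped.

```python
from pathlib import Path


def read_ext4_error_counts(sysfs_root="/sys/fs/ext4"):
    """Return {device name: error count} for mounted ext4 filesystems.

    Assumes each filesystem directory under sysfs_root has an
    'errors_count' file holding a single integer.
    """
    counts = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return counts  # no ext4 sysfs tree here (or not Linux)
    for entry in sorted(root.iterdir()):
        try:
            text = (entry / "errors_count").read_text()
            counts[entry.name] = int(text.strip())
        except (OSError, ValueError):
            continue  # no such file, unreadable, or not a number
    return counts
```

Since the count is apparently persistent, a monitoring system would probably want to alert on the number going up rather than on it being non-zero.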