More on mismatched sectors on Linux software RAID mirrors
Some brief followups from my first entry on this.
First, the mismatch_cnt numbers are reset from scratch every time you re-run a check (and probably every time there is a mirror resync). On many current systems, this means that they will be reset every week. This makes sense and is even implied by the documentation (in the usual Unix fashion of reading between the lines), but it would have been nice to have it explicitly documented.
(I'm aware that I'm grumpy about this, but sysadmins really care about having clear and explicit documentation about what error messages mean. Sadly we rarely get either clear error messages or clear documentation about them.)
Second, I have seen the numbers go up and down from week to week, sometimes significantly, and I've even seen the problem go away for one of my software RAID devices (the smaller one, in both size and error counts) and then come back worse. I can't say that this makes me even more unhappy, because I was pretty unhappy from the start, but it does mean that whatever is causing these mismatches is an ongoing problem, not a one-time event.
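Since the counts are reset on every check, following them from week to week means recording them yourself. Here is a minimal sketch in Python of one way to do that. It assumes the usual sysfs location of /sys/block/mdN/md/mismatch_cnt; the function names and the log line format are made up for illustration, and you would want to run it (from cron or the like) only after the weekly check has finished.

```python
import glob
import os
import time


def read_mismatch_counts(sys_block="/sys/block"):
    """Return {md device name: mismatch_cnt} for every md array found.

    sys_block is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    counts = {}
    pattern = os.path.join(sys_block, "md*", "md", "mismatch_cnt")
    for path in sorted(glob.glob(pattern)):
        # path looks like /sys/block/md0/md/mismatch_cnt
        device = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            counts[device] = int(f.read().strip())
    return counts


def format_log_lines(counts, when=None):
    """Turn the counts into timestamped lines (a made-up format)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(when))
    return ["%s %s mismatch_cnt=%d" % (stamp, dev, n)
            for dev, n in sorted(counts.items())]


if __name__ == "__main__":
    for line in format_log_lines(read_mismatch_counts()):
        print(line)
```

Appending the output to a file gives you a history that survives the next check resetting the numbers, which is enough to see whether a given array is getting better or worse.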
Unfortunately I have no practical alternative to software RAID in Linux at the current time. However, the urge to add some sort of real error logging to the kernel code for this is getting stronger and stronger.
(Please do not suggest hardware RAID; it isn't practical for various reasons, and I still believe that software RAID is better.)