== How Linux software RAID is making me grumpy right now

This weekend, one of my machines sent me email to report:

> ((WARNING: mismatch_cnt is not 0 on /dev/md0)) \\
> ((WARNING: mismatch_cnt is not 0 on /dev/md3))

What this means (as opposed to what it says) is that a [[software RAID data scrub NewSoftwareRAIDFeatures]] has detected some number of inconsistencies between the mirrors for two of my software RAID devices. (I believe that the kernel also notices this under some other circumstances, but I can't follow the code well enough to be sure or to tell what they are.)

(The ((mismatch_cnt)) it is talking about is the one found in _/sys/block/md~~N~~/md_. You can read the full discussion about it in [[Documentation/md.txt http://www.kernel.org/doc/Documentation/md.txt]].)

Let me inventory the obvious failures here.

* Fedora's ((raid-check)) script doesn't bother to tell you what ((mismatch_cnt)) is, apart from 'not zero'. Since this is both volatile (it's only in kernel memory so it gets reset on reboot) and a measure of how much inconsistency was found, sysadmins would kind of like to have it recorded for posterity. Speaking for myself, I would *really* like to know if my arrays are progressively getting more and more inconsistent every week, or if it seems to have happened once and then stopped.

* The software RAID code does not log any messages when it detects inconsistencies. If you do not know to look at ((mismatch_cnt)) and naively just watch syslog or the kernel messages, you are out of luck.

* Worse, the software RAID code doesn't tell you where the errors are. What do they affect? You have no way of finding out short of duplicating the scrub's work yourself to get the actual sector numbers.

(I have [[read http://www.mail-archive.com/gnhlug-discuss@mail.gnhlug.org/msg27126.html]] of people who shut down the software RAID device, directly mount each side's filesystem read-only, and _diff -r_ them.
People with LVM on software RAID are plain out of luck.)

The lack of information about where the errors are is extremely bad, because there is no actual repair process for this problem. The software RAID 'repair' operation is not a repair, it is a resync; if there is an inconsistency, it picks one side of the mirror (somehow) and force-updates the other to match it. There is no certainty that it will pick the right one. Therefore, if this happens to you, you are best off doing nothing until you can specifically identify what was damaged (if anything) and then either try to recover data from the other mirror or restore things from backups.

I foresee a very long downtime with a live CD in my future. Or some kernel hacks. Or both.

The final failure is what may have caused this inconsistency. According to Neil Brown (in a message quoted [[here http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=405919]]), under some circumstances the software RAID code can write inconsistent data to the two sides of the mirror because it allows the page to be changed between when it is written to one side and when it is written to the other. According to his message, this should be harmless because the newly-dirty page will be rewritten at some point. [[Other reports http://www.mail-archive.com/gnhlug-discuss@mail.gnhlug.org/msg27126.html]] suggest strongly that this is not the case and that the inconsistencies can persist in real files.

I am frankly dumbfounded that any software RAID implementation allows inconsistent data to be written to different sides of its mirrors. It strikes me as an utterly basic correctness invariant that a RAID-1 pair is always in sync (apart from in-flight writes, etc etc) in the absence of disk errors and abnormal shutdowns.
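
(As a stopgap for the first complaint in the list above, here is a minimal Python sketch that records each array's ((mismatch_cnt)) with a timestamp, so that week-to-week changes are at least visible. It assumes the count is readable from _/sys/block/md~~N~~/md/mismatch_cnt_; the function names and the log line format are made up for illustration and are not anything ((raid-check)) itself does.)

```python
import glob
import os
import time


def read_mismatch_counts(sys_block="/sys/block"):
    """Return {array name: mismatch_cnt} for every md array under sys_block.

    sys_block is a parameter only so the function can be pointed at a
    fake directory tree for testing; on a real system use the default.
    """
    counts = {}
    pattern = os.path.join(sys_block, "md*", "md", "mismatch_cnt")
    for path in sorted(glob.glob(pattern)):
        # path looks like <sys_block>/md0/md/mismatch_cnt; recover "md0".
        name = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as f:
            counts[name] = int(f.read().strip())
    return counts


def format_record(counts, when=None):
    """Return one log line per array: timestamp, array name, count.

    when is seconds since the epoch; None means the current time.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(when))
    return [
        f"{stamp} {name} mismatch_cnt={count}"
        for name, count in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Print the current counts; redirect or append to a file to keep history.
    for line in format_record(read_mismatch_counts()):
        print(line)
```

Run after each scrub (for example from cron, with the output appended to a file), this would give a history that survives reboots. It does nothing about the larger problem: the kernel still does not say which sectors differ.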