Chris's Wiki :: blog/sysadmin/SkippingRAID5 Commentshttps://utcc.utoronto.ca/~cks/space/blog/sysadmin/SkippingRAID5?atomcommentsDWiki2007-09-15T14:58:34ZRecent comments in Chris's Wiki :: blog/sysadmin/SkippingRAID5.From 124.170.16.103 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:a5ff7a1672224adca68297d0eabe43c862b30d73From 124.170.16.103<div class="wikitext"><p>If you lose a disk in RAID5, even if you have a hot swap disk ready and activate it immediately, there's still a significant amount of time spent writing data to that new disk. The danger is losing a second disk while your repair disk is still being brought online. The repair time for RAID5 is much greater than for RAID1 because there's a lot more data to read.</p>
<p>If the disks are getting old, the probability of a second drive failure increases once the first drive has failed. And if your hot swap is just as old as the RAID5 drives, you may be replacing a failed old device with another which is just as old.</p>
<p>I'd want more redundancy, e.g. the ability to recover from a two-drive loss.</p>
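<p>The rebuild-window risk described above can be put into rough numbers. The following is a minimal sketch, assuming independent disks with a constant failure rate (an exponential model); the function name, disk counts, rebuild times and the 5% annual failure rate are all hypothetical values chosen for illustration. Because it assumes independence, it will understate the risk for aging disks of the same batch, which is exactly the case the comment warns about.</p>

```python
import math

def second_failure_probability(surviving_disks, rebuild_hours, annual_failure_rate):
    """Chance that at least one surviving disk fails before the rebuild ends.

    Assumption: disks fail independently at a constant rate derived from
    the given annual failure rate.
    """
    # Convert the annual failure probability into a per-hour failure rate.
    hourly_rate = -math.log(1.0 - annual_failure_rate) / (365 * 24)
    # Any one of the surviving disks failing during the window is fatal.
    return 1.0 - math.exp(-hourly_rate * surviving_disks * rebuild_hours)

# Hypothetical comparison: an 8-disk RAID-5 with a long rebuild versus
# a 2-disk RAID-1 with a short one.
raid5_risk = second_failure_probability(surviving_disks=7, rebuild_hours=24,
                                        annual_failure_rate=0.05)
raid1_risk = second_failure_probability(surviving_disks=1, rebuild_hours=6,
                                        annual_failure_rate=0.05)
print(f"RAID-5 rebuild risk: {raid5_risk:.6f}")
print(f"RAID-1 rebuild risk: {raid1_risk:.6f}")
```

<p>Under these assumptions the risk grows with both the number of surviving disks and the length of the rebuild, which is why a wide RAID-5 is exposed for longer than a mirror.</p>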
</div>2007-09-15T14:58:34ZBy Chris Siebenmann on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:d61479aa97a26c245631937477bd303d4ced767eChris Siebenmann<div class="wikitext"><p>Both a hot spare and RAID 6 lose an extra disk to overhead. Also, hot
spare re-sync on a large RAID-5 SATA array is not exactly fast these
days, so you can be exposed for a not insignificant amount of time.</p>
<p>Read times faster than raw single disk IO are not exactly a priority
for this use; in fact, most of this data will never be read after it is
written, because most of the time we hope to never need our backups.
That we can read from this setup faster and more conveniently than going
to tape is good enough.</p>
<p>No system design is static. If our storage and incremental backup sizes
grow enough to invalidate this design (which would have to be a lot of
growth), we would have to re-evaluate this approach and perhaps switch
to something else. In the meantime, we can gain the benefits, including
more days of conveniently accessible online incremental backups in the
common case.</p>
</div>2007-09-15T04:28:05ZFrom 24.207.191.29 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:ebc6c3231c9d900de419e04a00b006c8dd304ac7From 24.207.191.29<div class="wikitext"><p>What about RAID-6 (double parity), such as used by Network Appliance?</p>
</div>2007-09-15T02:23:34ZFrom 12.109.229.8 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:9237cf3485ce72f3aa1e6e89950b9aef1125a3faFrom 12.109.229.8<div class="wikitext"><p>This is a rather short-sighted approach and assumes you will always be able to fit a complete backup onto one disk. Most decent software (and hardware) RAID solutions offer the option to have a hot spare drive. This will reduce your exposure to total loss to the time it takes to re-sync onto the spare. RAID 5 offers a much faster read time than your single-drive implementations.</p>
</div>2007-09-13T21:23:41ZBy Chris Siebenmann on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:30d54a3b5d3fccaf1f91d7d5674b00fef439f8b8Chris Siebenmann<div class="wikitext"><p>RAID always has a trade-off of cost (or overhead) versus the amount of
protection you get. Picking RAID-5 over RAID-1 <em>should</em> mean in part
that you have decided that you cannot afford (or do not need) that much
protection and you are willing to be exposed to two-disk total data loss.</p>
<p>(It is not just a straight RAID-1 versus RAID-5 choice, either; you can
choose how many RAID-5 groups you split your disks into, trading off
overhead against how much data you will lose if you have two disks fail
in the same group and so on.)</p>
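<p>The group-splitting trade-off can be sketched numerically. This is an illustration only, assuming equal-sized groups that each hold independent data (nothing striped across groups) and a second failure that is equally likely to hit any remaining disk; the function name and the 12-disk example are hypothetical.</p>

```python
def raid5_layout(total_disks, groups):
    """Summarize the overhead and two-disk-failure exposure of a RAID-5 split.

    Assumptions: groups are equal-sized, each holds independent data, and
    the second failed disk is uniformly random among the remaining disks.
    """
    if groups < 1 or total_disks % groups != 0:
        raise ValueError("disks must divide evenly into groups")
    per_group = total_disks // groups
    if per_group < 3:
        raise ValueError("a RAID-5 group needs at least 3 disks")
    return {
        "disks_per_group": per_group,
        # One disk's worth of parity per group.
        "overhead_fraction": groups / total_disks,
        # Chance the second failure lands in the already-degraded group.
        "same_group_probability": (per_group - 1) / (total_disks - 1),
        # Share of all data held by the one group that would be lost.
        "data_lost_fraction": 1 / groups,
    }

# Hypothetical 12-disk shelf split into 1, 2, or 4 RAID-5 groups.
for group_count in (1, 2, 4):
    print(group_count, raid5_layout(12, group_count))
```

<p>More groups cost more parity disks but shrink both the chance that a second failure is fatal and the amount of data lost when it is.</p>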
</div>2007-09-13T20:33:16ZFrom 12.158.220.130 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:5b5e1ffe2046ad81e90474d869c9639268aa3164From 12.158.220.130<div class="wikitext"><p>RAID5 is a good thing. </p>
<p>A distinction that I've heard elsewhere applies here:</p>
<pre>
RAID is a *CONTINUITY* strategy, not a backup strategy.
</pre>
<p>RAID shrinks recovery time down to nothing in many situations. Backups do not.</p>
<p>Backups provide "deep sh*t" recovery.</p>
<p>The two purposes can overlap.</p>
</div>2007-09-13T17:18:41ZFrom 199.72.20.100 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:0dc103559f19cdd265e0aa16ea099dd724e57e32From 199.72.20.100<div class="wikitext"><p>Are you forgetting integrated hot-spare RAID 5E and 5EE? If you can lose two drives before your data gives up the ghost, then you can configure proper notifications and replace your dead FRU without worry. Also, have you considered redundant arrays configured as software RAID 1? If you use a remote iSCSI device as your mirror drive, you can keep a complete copy of your data colo'ed away from your burned-out crust of a data center. RAID 5 isn't the problem; badly designed storage solutions are the problem.</p>
</div>2007-09-13T16:37:17ZFrom 124.170.73.31 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:0bc90a5d416febf950620688e398bc455ce5e6aaFrom 124.170.73.31<div class="wikitext"><p>That's why you set up automatic email notification of RAID drive failure!</p>
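<p>As a rough sketch of what such a failure check could look like, assuming Linux software RAID (md) and its <code>/proc/mdstat</code> status format, which the comments here do not specify: the function name and the sample text below are hypothetical, and the actual email delivery is left to the caller. In practice mdadm's own monitor mode is the usual tool for this.</p>

```python
import re

def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md0 : active raid1 sdb1[1] sda1[0]".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends in a map such as [UU]; "_" marks a missing disk.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded

# Hypothetical sample in /proc/mdstat format: md1 has lost one mirror half.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      2048192 blocks [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

<p>Run periodically from cron with the real file contents, a non-empty result would be the trigger for sending the notification.</p>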
<p>RAID5 is bad news; stay away from it. If you want to store your data with moderate reliability, use RAID1 and an incremental backup scheme (offsite, on-net, anything other than on a hard disk in the same machine as the RAID).</p>
</div>2007-09-13T12:53:41ZFrom 121.72.82.232 on /blog/sysadmin/SkippingRAID5tag:CSpace:blog/sysadmin/SkippingRAID5:610224bceeba87dc02e2f14c0a469cb439913d0eFrom 121.72.82.232<div class="wikitext"><p>I had this problem on a client's server some years ago running Red Hat Linux. The RAID system was set up as drive mirroring. The first drive failed at some point, but the system kept working until the second drive failed. At that point the server failed completely (as it would). I was able to resurrect the system, but it left me skeptical of RAID systems, especially where there is no on-site system administrator to check the system regularly.</p>
</div>2007-09-13T09:12:12Z