RAID Level 5 Data Redundancy: Theory & Reality


November, 2013


In theory, RAID 5 protects your data. In reality, RAID 5 is often a painful failure. Why? Mean time-to-data-loss (MTTDL) is a fraud: actual rates of double-disk failures are 2 to 1500 times higher than MTTDL predicts.
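For context, the article does not spell out the MTTDL formula it criticizes. A minimal sketch, assuming the commonly cited approximation for a single-parity group, MTTDL = MTTF^2 / (N * (N - 1) * MTTR), might look like the following. The function name and the example figures (8 disks, a 1,000,000-hour MTTF, a 24-hour rebuild) are illustrative assumptions, not values from the article.

```python
HOURS_PER_YEAR = 8760.0


def mttdl_raid5(n_disks: int, mttf_hours: float, mttr_hours: float) -> float:
    """Textbook MTTDL approximation for one single-parity (RAID 5) group.

    Assumes constant failure rates, independent disks, and an array that is
    "as good as new" after every rebuild. These are the very assumptions the
    article argues are wrong.
    """
    if n_disks < 2:
        raise ValueError("a parity group needs at least two disks")
    if mttf_hours <= 0 or mttr_hours <= 0:
        raise ValueError("MTTF and MTTR must be positive")
    return mttf_hours ** 2 / (n_disks * (n_disks - 1) * mttr_hours)


if __name__ == "__main__":
    # Hypothetical example values, chosen only to show the arithmetic.
    hours = mttdl_raid5(n_disks=8, mttf_hours=1_000_000.0, mttr_hours=24.0)
    print(f"MTTDL: {hours:,.0f} hours ({hours / HOURS_PER_YEAR:,.0f} years)")
```

Under these assumptions the formula yields a figure in the tens of thousands of years, which is the kind of optimistic estimate the rest of the article takes apart.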

What's behind MTTDL's failure?
In "A Highly Accurate Method for Assessing Reliability of RAID", researchers from NetApp and the University of Maryland compared RAID theory against actual field data. They found that MTTDL calculations are inaccurate for 3 reasons:

Errors in statistical theory of repairable systems.
Incomplete consideration of failure modes.
Inaccurate time-to-failure distributions.

By repairing MTTDL's theoretical basis, adding real-world failure data and using Monte Carlo simulations, they found that today's MTTDL estimates are wildly optimistic. This means your data is a lot less safe with RAID 5 than you think.

Repairable systems
The typical MTTDL assumption is that once repaired - i.e. a disk is replaced with a new one and the automatic rebuild has completed - the RAID system is as good as new. But this isn't true: at best, the system is only slightly better than it was right before the failure.
One component is new - but the rest are as old and worn as they were before the failure - so the system is not "like new". The system is no less likely to fail after the repair than it was before.
The problem is that in RAID arrays repairs take time: the disk fails; a hot spare or a new disk is added; and the rebuild starts - a process that can take hours or days - while the other components continue to age.
MTTDL calculations use the wrong failure distributions and incorrectly correlate component and system failures.
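To make the "not like new" point concrete, here is a small sketch that compares a constant failure rate with a time-dependent one. It assumes a Weibull time-to-failure model, which is one common way to express a time-dependent failure rate. The shape and scale values are illustrative assumptions, not figures from the paper.

```python
import math


def survival(t_hours: float, shape: float, scale_hours: float) -> float:
    """Weibull survival function: probability a disk is still working at age t."""
    return math.exp(-((t_hours / scale_hours) ** shape))


def prob_fail_within(age_hours: float, window_hours: float,
                     shape: float, scale_hours: float) -> float:
    """Probability that a disk which has survived to age_hours fails during
    the next window_hours (for example, during a rebuild)."""
    s_now = survival(age_hours, shape, scale_hours)
    s_later = survival(age_hours + window_hours, shape, scale_hours)
    return 1.0 - s_later / s_now


if __name__ == "__main__":
    window = 24.0        # hypothetical rebuild time in hours
    scale = 500_000.0    # hypothetical Weibull scale in hours
    for shape in (1.0, 1.2):  # 1.0 = constant rate, > 1.0 = wear-out
        new = prob_fail_within(0.0, window, shape, scale)
        old = prob_fail_within(30_000.0, window, shape, scale)
        print(f"shape={shape}: new disk {new:.2e}, 30,000 h old disk {old:.2e}")
```

With shape 1.0 (the constant-rate assumption behind MTTDL) age makes no difference. With any shape above 1.0, the surviving older disks are more likely to fail during the rebuild window than a new disk would be.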

Failure modes
MTTDL typically considers only catastrophic disk failures, but disks have latent errors as well. A catastrophic failure on one disk combined with a latent error on another is a dual-disk failure, something RAID 5 can't handle.

Using field-validated distributions for these 4 transition events and Monte Carlo simulations, the researchers concluded:
"The model results show that including time-dependent failure rates and restoration rates along with latent defects yields estimates of dual-disk failures that are as much as 4,000 times greater than the MTTDL-based estimates."
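The general idea of such a simulation can be sketched in a few lines. This is a heavily simplified illustration and not the researchers' model. It assumes Weibull operational failures, a fixed 12-hour rebuild, exponentially arriving latent defects that are cleared by a scrub 168 hours later, and it counts a dual-disk failure whenever a disk fails while another disk is either rebuilding or carrying an unscrubbed latent defect. Every number and function name here is a hypothetical choice for illustration.

```python
import random


def down_intervals(rng, mission_h, draw_up, down_h):
    """Alternate random 'up' periods with fixed-length 'down' periods.

    Returns the (start, end) down intervals that begin before mission_h.
    """
    out, t = [], 0.0
    while True:
        t += draw_up(rng)
        if t >= mission_h:
            return out
        out.append((t, t + down_h))
        t += down_h


def trial_has_ddf(rng, n_disks, mission_h, latent_mean_h=None):
    """Simulate one RAID 5 group; True if a dual-disk failure occurs."""
    # Operational failures: Weibull time to failure (shape > 1 = wear-out),
    # followed by an assumed 12 h rebuild onto a replacement disk.
    op = [down_intervals(rng, mission_h,
                         lambda r: r.weibullvariate(500_000.0, 1.1), 12.0)
          for _ in range(n_disks)]
    # Latent defects: exponential arrivals, cleared by a scrub 168 h later.
    # Simplification: defects are generated independently of replacements.
    if latent_mean_h is None:
        ld = [[] for _ in range(n_disks)]
    else:
        ld = [down_intervals(rng, mission_h,
                             lambda r: r.expovariate(1.0 / latent_mean_h), 168.0)
              for _ in range(n_disks)]
    for i in range(n_disks):
        for t_fail, _ in op[i]:
            for j in range(n_disks):
                # Disk i fails while disk j is rebuilding or has a latent defect.
                if j != i and any(s <= t_fail < e for s, e in op[j] + ld[j]):
                    return True
    return False


def ddf_fraction(trials, seed, **kwargs):
    """Fraction of simulated groups that hit at least one dual-disk failure."""
    rng = random.Random(seed)
    return sum(trial_has_ddf(rng, **kwargs) for _ in range(trials)) / trials


if __name__ == "__main__":
    base = dict(n_disks=8, mission_h=87_600.0)  # 8 disks, 10 years
    print("operational failures only:", ddf_fraction(2000, seed=1, **base))
    print("with latent defects:      ",
          ddf_fraction(2000, seed=1, latent_mean_h=10_000.0, **base))
```

Even in this toy version, adding latent defects raises the share of groups that lose data well above the figure for catastrophic failures alone, which is the qualitative effect the quote describes. The magnitudes depend entirely on the assumed parameters.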

This is why RAID 5 has caused so much trouble to so many people over the last 20 years.

Source: www.zdnet.com
