July, 2010

How reliable is your storage array?

One issue of particular concern with ultra-large hard drives is something called an unrecoverable bit error (UBE). Although it is rarely discussed with potential customers and is definitely a topic "down in the weeds," UBE is a problem anyone working with ultra-large hard drives and RAID sets will eventually have to deal with. Why? Because as drives get bigger and bigger, a RAID rebuild has to read more blocks of data than the drives' UBE rating can reliably accommodate. If the rebuild from parity hits a UBE, it won't complete, and the entire RAID set is compromised.

What is UBE? Every hard drive carries an unrecoverable bit error rating (UBER), which indicates how many bits the drive can likely transfer before encountering a UBE. Normally a drive can cope with bit errors and life moves on as usual. But when a drive encounters an unrecoverable bit error, bad things happen. SATA desktop drives have a UBER of 1 in 10^14 bits, or about 12.5TB. A 2TB drive holds roughly 1/6 of that amount, so a five-drive RAID set holds about 5/6 of it. Should a single drive fail in this RAID configuration, that ratio suggests a chance of roughly 83% that the RAID set will hit a UBE during the rebuild. (This is a simple linear estimate. Treating each bit as an independent trial gives a lower figure, a little over 50%, which is still alarming.) Although these drives are not the standard in higher-end drive arrays, some are making their way into the datacenter in an effort to get prices lower.

Most array vendors will push "cheap and deep" near-line SATA or SAS drives with a UBER of 1 in 10^15 bits, or about 125TB. With this better rating, the five-drive RAID set mentioned above would have an 8.3% probability of hitting a UBE during rebuild, which would prevent successful RAID data recovery. Even though the relative risk is lower, imagine telling your management (or regulator) that the corporate data has a 91.7% chance of surviving a single drive failure on that new, expensive RAID array you just convinced them to buy. It probably wouldn't have the effect you were looking for (unless, of course, you were wanting another job, in which case it might have just the effect you were looking for!).
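To make the arithmetic above easier to check, here is a minimal sketch in Python. It assumes decimal terabytes, independent bit errors, and a drive that performs exactly at its rated UBER, all of which are simplifications. The function names are made up for illustration. It prints both the simple ratio used in the figures above and a Poisson-style estimate of the chance of at least one UBE.

```python
import math

BITS_PER_TB = 8 * 10**12  # decimal terabytes, as drive vendors count them


def expected_ubes(tb_read: float, uber_bits: float) -> float:
    """Linear ratio: bits read divided by the rated bits-per-error figure."""
    return tb_read * BITS_PER_TB / uber_bits


def p_at_least_one_ube(tb_read: float, uber_bits: float) -> float:
    """Chance of one or more UBEs, assuming independent bit errors (Poisson approximation)."""
    return 1.0 - math.exp(-expected_ubes(tb_read, uber_bits))


# Five 2TB drives, i.e. 10TB, for the two drive classes discussed above.
for label, uber in (("desktop SATA, 1 in 10^14", 10**14),
                    ("near-line, 1 in 10^15", 10**15)):
    ratio = expected_ubes(10, uber)
    prob = p_at_least_one_ube(10, uber)
    print(f"{label}: ratio {ratio:.2f}, probability {prob:.1%}")
```

Under these assumptions the desktop-class set comes out at a ratio of 0.80 and a probability of about 55%, and the near-line set at a ratio of 0.08 and a probability of just under 8%. The ratios differ slightly from the 83% and 8.3% quoted above because those round 2TB to 1/6 of 12.5TB. Either way, the gap between the two drive classes is what matters.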

What is the answer? For one thing, quit thinking that all hard drives are equal and the only thing that matters is the cost. They aren't and it isn't. Make sure the drives your vendor is quoting match the criticality of the data they will be storing. If you insist on using desktop-class drives, use RAID 10 instead of RAID 5 (although the added capacity overhead may negate the cost savings of the lower-end drives). Second, make sure your storage vendor supports the ANSI T10 Data Integrity Field (DIF). DIF is a standard that provides end-to-end data integrity.

T10-DIF provides three checks for every block: a logical block guard, to compare the data actually written to the disk with what was supposed to be written; a logical block application tag, to ensure that the data is written to the correct logical unit; and a logical block reference tag, to see that the data is written to the proper virtual block.
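As a rough picture of how those three fields travel with the data, the sketch below packs them into the 8 bytes of protection information that DIF attaches to each block: a 2-byte guard, a 2-byte application tag, and a 4-byte reference tag, in big-endian order. The class name `DifTuple` and its methods are hypothetical, chosen only for this illustration. This is not a vendor API.

```python
import struct
from dataclasses import dataclass

# Big-endian layout: 2-byte guard, 2-byte application tag, 4-byte reference tag.
_DIF_FORMAT = ">HHI"


@dataclass(frozen=True)
class DifTuple:
    """Hypothetical container for one block's 8 bytes of protection information."""
    guard: int     # checksum over the block's data
    app_tag: int   # value owned by the application or array
    ref_tag: int   # ties the block to the address it was meant for

    def pack(self) -> bytes:
        return struct.pack(_DIF_FORMAT, self.guard, self.app_tag, self.ref_tag)

    @classmethod
    def unpack(cls, raw: bytes) -> "DifTuple":
        return cls(*struct.unpack(_DIF_FORMAT, raw))


example = DifTuple(guard=0x1234, app_tag=0x0001, ref_tag=42)
print(example.pack().hex())
print(DifTuple.unpack(example.pack()))
```

With a 512-byte data block, this is why DIF-capable drives are usually formatted with 520-byte sectors: the extra 8 bytes ride along with the data from the host to the disk.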

In other words, DIF first uses a cyclic redundancy check (CRC) to make sure that the raw data is mathematically the same at the source and the destination. CRC has been used for years and, as most people in this business know, if the binary sequence (CRC code) calculated at the destination differs from the one calculated at the source, an error is deemed to have occurred and the data is usually written again. Where CRC falls short is that it only guarantees the data was written correctly. It doesn't guarantee the data can ever be found again. DIF addresses this weakness. DIF ensures not only that the data has been written accurately, through CRC, but also that it has been written to the correct logical unit and the correct block. DIF helps make certain that data can be retrieved from where it was supposed to go, which goes a long way towards mitigating the UBE problems of hard drive data recovery technology.
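The two kinds of check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular array implements it. The guard is computed with the CRC-16 polynomial 0x8BB7 associated with T10-DIF, and the reference tag is assumed to carry the low 32 bits of the target block address. The helper names `protect` and `check` are invented for the example.

```python
from typing import Tuple


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, zero initial value, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x8BB7) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def protect(block: bytes, lba: int) -> Tuple[int, int]:
    """Sender side: return (guard, reference tag) for one block."""
    return crc16_t10dif(block), lba & 0xFFFFFFFF


def check(block: bytes, tags: Tuple[int, int], lba_read: int) -> str:
    """Reader side: compare the stored tags with what was actually read."""
    guard, ref_tag = tags
    if crc16_t10dif(block) != guard:
        return "guard mismatch: data corrupted"
    if ref_tag != (lba_read & 0xFFFFFFFF):
        return "reference tag mismatch: block belongs elsewhere"
    return "ok"


block = bytes(range(256)) * 2                    # one 512-byte block of sample data
tags = protect(block, lba=1000)
flipped = bytes([block[0] ^ 0x01]) + block[1:]   # same block with a single bit flipped

print(check(block, tags, lba_read=1000))    # intact data at the intended address
print(check(flipped, tags, lba_read=1000))  # corruption caught by the guard
print(check(block, tags, lba_read=2000))    # intact data found at the wrong address
```

The third case is the one a plain CRC cannot catch: the data is bit-for-bit correct, but it is sitting at an address it was never meant for, and only the reference tag reveals that.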