May 30, 2015

vSphere Storage Terminologies - RAID


RAID - Redundant Array of Independent Disks

"In spite of the technological wonder of hard disks, they do fail—and fail predictably. RAID schemes address this by leveraging multiple disks together and using copies of data to support I/O until the drive can be replaced and the RAID protection can be rebuilt.

Each RAID configuration tends to have different performance characteristics and different capacity overhead impact."

The goal of RAID is to increase disk performance, disk redundancy, or both. "The performance increase is a function of striping: data is spread across multiple disks to allow reads and writes to use all the disks' IO queues simultaneously."

"It really is a technological miracle that magnetic disks work at all. What a disk does all day long is analogous to a pilot flying a 747 at 600 miles per hour 6 inches off the ground and reading pages in a book while doing it!"


RAID-0 (Block-level Striping with no parity)
RAID-0: when speed of access is more important than block-level data redundancy.

Note: RAID-0 should really be thought of as AID-0 as there is no "R"edundancy.

  • Has a good performance profile
  • Space efficiency is 100%
  • Is a Single-Point-Of-Failure (SPOF)
A good reason for using RAID-0 (striping) is to improve performance. The performance increase comes from read and write operations across multiple drives. In addition, striping allows for high data transfer rates because there will be no parity calculations.

RAID-0 also makes available 100% of the disk space to the system. I.e. it does not reserve any part of the disk group for array management. The total usable space is the sum of all available space in the array set.

The hardware or software array management software is responsible for making the array look like a single virtual disk drive. Striping takes portions of multiple physical disk drives and combines them into one virtual disk drive that is presented to the application.

The disadvantage of RAID-0 is that it offers no redundancy and no protection against drive failure. It has a higher aggregate risk than a single disk, as the failure of any single disk affects the whole RAID group.

The loss of one physical disk drive will result in the loss of all the data on all the striped disk drives.
This RAID type is usually not appropriate for production vSphere use because of the availability profile.

"RAID 0 takes your block of data, splits it up into as many pieces as you have disks (2 disks → 2 pieces, 3 disks → 3 pieces) and then writes each piece of the data to a separate disk.

This means that a single disk failure destroys the entire array (because you have Part 1 and Part 2, but no Part 3), but it provides very fast disk access."
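The round-robin split described above can be sketched in a few lines of Python. This is an illustrative toy, not a real controller: the 4 KB stripe unit and the function names are assumptions for the example.

```python
# Minimal sketch of RAID-0 striping (illustrative only).

STRIPE_UNIT = 4096  # bytes written to one disk before moving to the next (assumed)

def stripe(data: bytes, num_disks: int, unit: int = STRIPE_UNIT):
    """Split data round-robin across num_disks; returns one buffer per disk."""
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), unit):
        disks[(i // unit) % num_disks].extend(data[i:i + unit])
    return disks

def unstripe(disks, unit: int = STRIPE_UNIT) -> bytes:
    """Reassemble the original data by reading chunks round-robin."""
    out = bytearray()
    offsets = [0] * len(disks)
    disk = 0
    while offsets[disk] < len(disks[disk]):
        out.extend(disks[disk][offsets[disk]:offsets[disk] + unit])
        offsets[disk] += unit
        disk = (disk + 1) % len(disks)
    return bytes(out)
```

Losing any one element of the `disks` list leaves `unstripe` with unrecoverable gaps, which is exactly the "no Part 3" failure mode the quote describes.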



With RAID-01, first divide the array set into two groups of disks. Stripe the first group, then mirror the first group to the second group.

RAID-0 can be coupled with RAID-1 to form RAID 0+1, which mirrors a pair of stripe sets.

"It creates two RAID 0 arrays, and then puts a RAID 1 over the top." A single disk failure takes down its entire stripe set, so the array survives only while one complete set (A1, A2, A3, A4 or B1, B2, B3, B4) remains intact.
Instead of RAID-01, consider using RAID-10.


RAID 1 (Mirroring), 1+0, 0+1
RAID-1: When you need redundancy with a limited number of disks.

Note: RAID-1 is the only RAID level that supports data redundancy with fewer than three disks.

  • Is Fault-Tolerant
  • Space efficiency is 50%
  • Has a good performance profile
The primary reason for using mirroring is to provide a high level of availability or reliability. Mirroring provides data redundancy by recording multiple copies of the data on independent spindles.

In the event of a physical disk drive failure, the mirror on the failed disk drive becomes unavailable, but the system continues to operate using the unaffected mirror or mirrors.

Depending on the implementation, RAID-1 can improve read performance. Some implementations issue read requests to both disks, doubling read throughput; others take additional time to verify data integrity on every read, resulting in no performance increase; still others read from only one disk, again offering no read performance increase.

"The main limitation of using a RAID-1 mirrored structure is that mirroring uses twice as many disk drives to have multiple copies of the data. Doubling the number of drives essentially doubles the cost per Mbyte of storage space. Another limitation is that mirroring degrades write performance because the write will have to be done twice."

The total usable space is the size of one of the disks in the array set. If the array set is comprised of different sized disks, the total usable space is the size of the smallest disk in the set.

"These mirrored RAID levels offer high degrees of protection but at the cost of 50 percent loss of usable capacity. This is versus the raw aggregate capacity of the sum of the capacity of the drives. RAID 1 simply writes every I/O to two drives and can balance reads across both drives (because there are two copies).

This can be coupled with RAID 0 to form RAID 1+0 (or RAID 10), which stripes data across pairs of mirrors, or to form RAID 0+1, which mirrors a stripe set. This has the benefit of being able to withstand multiple drives failing, but only if the drives fail on different elements of a stripe on different mirrors, thus making RAID 1+0 more fault tolerant than RAID 0+1."
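A small sketch makes that fault-tolerance difference concrete. For a hypothetical 4-disk array, enumerate all two-disk failure combinations and count how many each layout survives (the disk grouping below is an assumption chosen for illustration, not a vendor layout):

```python
from itertools import combinations

# Hypothetical 4-disk layouts (disks numbered 0..3).
mirrored_pairs = [{0, 1}, {2, 3}]   # RAID 1+0: stripe across two mirrored pairs
stripe_sets    = [{0, 1}, {2, 3}]   # RAID 0+1: mirror of two stripe sets

def raid10_survives(failed):
    # Data survives as long as no mirrored pair loses BOTH of its disks.
    return all(not pair <= failed for pair in mirrored_pairs)

def raid01_survives(failed):
    # A stripe set dies if ANY of its disks fails, so at least one
    # whole stripe set must remain completely untouched.
    return any(not (s & failed) for s in stripe_sets)

two_disk_failures = [set(c) for c in combinations(range(4), 2)]
print("RAID 1+0 survives",
      sum(raid10_survives(f) for f in two_disk_failures), "of 6")  # 4 of 6
print("RAID 0+1 survives",
      sum(raid01_survives(f) for f in two_disk_failures), "of 6")  # 2 of 6
```

With four disks, RAID 1+0 survives 4 of the 6 possible two-disk failures, while RAID 0+1 survives only 2 of them, which is why the text recommends RAID-10 over RAID-01.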


RAID-10: When you need both redundancy and speed.
RAID 1 can be coupled with RAID 0 to form RAID 1+0 (or RAID 10), which stripes data across mirrored pairs.

  • Is also called RAID 1+0
  • Is also called a "stripe of mirrors"
  • Requires a minimum of 4 disks
RAID-10 is a combination of RAID 1 and RAID 0. With RAID-10, first you mirror the disks (i.e. disk 1+2, 3 + 4, 5 + 6, etc.), then you stripe across the array.

Since each group is mirrored, the array will survive the loss of a single disk from any one (or more) of the mirrored groups.

To setup RAID-10:
  • group the disks in pairs
  • within each group, mirror the disks
  • across each group, stripe the data.
I.e. for a 4 disk RAID set, there are two groups (A and B) of two disks each.

Within each group, the data is mirrored, i.e. data is written to both disks in each group. The data on Disk 1 is exactly the same as on Disk 2. The data on Disk 3 is exactly the same as on Disk 4. The disks within each group are mirrored.

Data is striped at the group level. I.e. stripe 1 (block A) is written to group A, stripe 2 (block B) is written to group B, stripe 3 (block C) is written to group A, stripe 4 (block D) is written to group B, etc.
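That mapping can be sketched as a toy placement function. The disk numbering follows the text (group A = disks 1+2, group B = disks 3+4); the function and constant names are invented for illustration.

```python
# Sketch: where does logical block N land in a 4-disk RAID-10?
GROUPS = {"A": (1, 2), "B": (3, 4)}  # each group is a mirrored pair

def place_block(n: int):
    """Blocks are striped across the groups, then mirrored onto both
    disks within the chosen group."""
    group = "A" if n % 2 == 0 else "B"
    return group, GROUPS[group]

for n in range(4):
    g, disks = place_block(n)
    print(f"block {chr(ord('A') + n)} -> group {g}, written to disks {disks}")
```

Running it reproduces the pattern above: blocks A and C land on group A (disks 1 and 2), blocks B and D on group B (disks 3 and 4).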

The total usable space is 50% of the sum of the total available space. If the array set is comprised of different sized disks, every disk is treated as the size of the smallest disk in the set, and the total usable space is 50% of that reduced total.

RAID-10 is an example of a nested RAID level; other nested levels include 5+0 (RAID-50).


RAID-5: When you need a balance of redundancy and disk space or have a mostly random read workload.

Note: RAID-5 requires a minimum of 3 disks. A RAID-5 set will tolerate the loss of a maximum of one drive.

  • Space efficiency is (N-1)/N
  • Has a good read performance profile
  • Requires a minimum of 3 disks
  • Tolerates one drive failure
RAID-5 (Striping with Distributed Parity)

A RAID-5 volume configuration is an attractive choice for read-intensive applications. RAID-5 uses the concept of bit-by-bit parity to protect against data loss. Parity is computed using Boolean Exclusive OR (XOR) and distributed across all the drives intermixed with the data.

An advantage of RAID-5 is that the plex requires only one additional drive to protect the data. This makes RAID-5 less expensive than mirroring all the data drives with RAID-1.

One of the limitations of RAID-5 is that you need a minimum of three disks to calculate parity. In addition, write performance will be poor because every write requires a recalculation of parity.

"It uses a simple XOR operation to calculate parity. Upon single drive failure, the information can be reconstructed from the remaining drives using the XOR operation on the known data."

"These RAID levels use a mathematical calculation (an XOR parity calculation) to represent the data across several drives. This tends to be a good compromise between the availability of RAID-1 and the capacity efficiency of RAID-0. RAID-5 calculates the parity across the drives in the set and writes the parity to another drive. This parity block calculation with RAID 5 is rotated among the drives in the RAID-5 set."

"One downside to RAID-5 is that only one drive can fail in the RAID set. If another drive fails before the failed drive is replaced and rebuilt using the parity data, data loss occurs. The period of exposure to data loss because of the second drive failing should be mitigated."

"One way to protect against data loss in the event of a single drive failure in a RAID-5 set is to use another parity calculation. This type of RAID is called RAID-6"
RAID-5 is bad when: You have a high random write workload or large drives.

"Unfortunately, in the event of a drive failure, the rebuilding process is very IO intensive. The larger the drives in the RAID, the longer the rebuild will take, and the higher the chance for a second drive failure." If you have larger/slower drives, consider RAID-6.

"The necessity of calculating checksums causes a lower write speed. RAID 5 is also expensive in the case of the array reconstruction."


See document(s): how-does-raid-5-work

XOR is sometimes written aut, the Latin word for "or, but not both".

If bits A and B are both True or both False, the XOR is False.
If bits A and B are both different, the XOR is True.

The Truth Table for XOR:

A | B | A XOR B
--+---+--------
0 | 0 |    0
0 | 1 |    1
1 | 0 |    1
1 | 1 |    0

"Now let us assume we have 3 drives with the following bits:
| 101 | 010 | 011 |

And we calculate XOR of those data and place it on 4th drive:
XOR (101, 010, 011) = 100     (XOR (101, 010) = 111, and then XOR (111, 011) = 100)

So the data on the four drives looks like this below:
| 101 | 010 | 011 | 100 |

Now let’s see how the XOR MAGIC works. Let’s assume the second drive has failed. When we calculate the XOR of all the remaining data, we get back exactly the data of the missing drive.
| 101 | 010 | 011 | 100 |
XOR (101, 011, 100) = 010

You can repeat this for any other missing drive: the XOR of the remaining data will always give you exactly the data of your missing drive.
| 101 | 010 | 011 | 100 |
XOR (101, 010, 100) = 011

What works here for 3 bits and 4 drives works for any number of bits and any number of drives. Real RAID 5 most commonly uses a stripe size of 64k (65536 * 8 = 524288 bits).

So the real XOR engine only needs to deal with 524288 bits and not 3 bits as in our exercise. This is why the RAID 5 needs a very efficient XOR engine in order to calculate it fast."
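The worked example above can be checked directly in Python, whose `^` operator is bitwise XOR:

```python
# The 3-data-drive + parity example from the text, in code.
drives = [0b101, 0b010, 0b011]          # data drives 1..3

parity = 0
for d in drives:
    parity ^= d                          # parity drive: XOR of all data
assert parity == 0b100                   # matches the 4th drive above

# Simulate losing drive 2 (value 0b010): XOR of everything that is
# left, including the parity drive, reproduces the missing data.
rebuilt = 0
for d in (drives[0], drives[2], parity):
    rebuilt ^= d
print(bin(rebuilt))                      # prints 0b10, i.e. 010
```

The same loop over a 524288-bit (64k) stripe, rather than 3 bits, is what a real RAID 5 XOR engine has to compute on every write and every rebuild.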


RAID-6: When you need to tolerate two drive failures or are using large/slow drives.
RAID-6 contains two independent checksums.

  • Space efficiency is (N-2)/N
  • Has a good read performance profile
  • Requires a minimum of 4 disks
  • Tolerates two drive failures
"RAID 6 is similar to RAID 5 but it uses two disks worth of parity instead of just one (the first is Exclusive OR - XOR, the second is a Linear Feedback Shift Register - LFSR), so you can lose two disks from the array with no data loss. The write penalty is higher than RAID 5 and you have one less disk of space."

"RAID 6 uses two different functions to calculate the parity."

"For a RAID6 it is not enough just to add one more XOR function. If two disks in a RAID6 array fail, it is not possible to determine data blocks location using the XOR function alone. Thus in addition to the XOR function, RAID6 arrays utilize Reed-Solomon code that produces different values depending on the location of the data blocks."
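A hedged sketch of that idea over GF(2^8): P is the plain XOR parity, and Q weights each drive by a power of the generator 2, so the two equations stay independent and two lost data drives can be solved for. The polynomial 0x11d and generator 2 are the values commonly used in Reed-Solomon-style RAID-6; everything else here is an illustrative toy, not a real controller implementation.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return p

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 254)   # a^254 = a^-1 in GF(2^8)

def parities(data):
    """P is plain XOR; Q weights drive i by 2^i."""
    P = Q = 0
    for i, d in enumerate(data):
        P ^= d
        Q ^= gf_mul(gf_pow(2, i), d)
    return P, Q

def recover_two(data, x, y, P, Q):
    """Rebuild data drives x and y (x < y) from the survivors plus P and Q."""
    Pxy, Qxy = P, Q
    for i, d in enumerate(data):
        if i not in (x, y):
            Pxy ^= d
            Qxy ^= gf_mul(gf_pow(2, i), d)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    Dx = gf_mul(Qxy ^ gf_mul(gy, Pxy), gf_inv(gx ^ gy))
    return Dx, Pxy ^ Dx                  # the two rebuilt bytes

data = [0x5a, 0x13, 0xc7, 0x88]          # one byte per data drive (made-up values)
P, Q = parities(data)
print(recover_two(data, 1, 3, P, Q))     # prints (19, 136), i.e. (0x13, 0x88)
```

With XOR alone both failure equations would be identical, which is why, as the quote notes, a second independent function is required.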

