This article is an introduction to (or basic review of) storage options utilizing multiple computer disk drives.
RAID (Redundant Array of Inexpensive Disks or, more often now, Redundant Array of Independent Disks) encompasses an industry-standard set of enhanced data storage technologies. RAID combines the storage resources of several physical disk drives into a single logical device recognized by the computer’s operating system. It comes in several standard implementations called levels, each of which has different cost/benefit trade-offs, including disk read/write performance and resiliency.
Common RAID Levels
A RAID 0 configuration joins multiple physical drives into a single logical drive with capacity equal to the sum of the constituent drives. It uses a process called striping: it writes a segment of data, or stripe, on the first disk, puts the next segment on the second disk, and so on, until it reaches the last disk in the array. The process repeats for all subsequent segments, laying them down in a round-robin fashion.
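The round-robin placement described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular controller or operating system implements striping; the function names stripe_location and stripe, and the tiny two-byte segment size, are hypothetical choices made for the example.

```python
def stripe_location(segment_index: int, num_disks: int) -> tuple[int, int]:
    """Return (disk, row) for a segment in a round-robin RAID 0 layout."""
    if num_disks < 2:
        raise ValueError("RAID 0 takes a minimum of two drives")
    disk = segment_index % num_disks   # which drive receives the segment
    row = segment_index // num_disks   # position of the segment on that drive
    return disk, row


def stripe(data: bytes, num_disks: int, segment_size: int) -> list[list[bytes]]:
    """Split data into segments and deal them out to the disks in turn."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), segment_size):
        disk, _ = stripe_location(offset // segment_size, num_disks)
        disks[disk].append(data[offset:offset + segment_size])
    return disks


layout = stripe(b"ABCDEFGHIJ", num_disks=3, segment_size=2)
# layout == [[b"AB", b"GH"], [b"CD", b"IJ"], [b"EF"]]
```

Because consecutive segments land on different drives, a large read or write can keep all drives busy at once, which is where the performance gain comes from.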
This configuration offers improved read/write performance over a single drive (or other RAID configurations), but it offers no data protection in case a drive crashes; in fact, the loss of any drive in the set results in the loss of the whole set. RAID 0 takes a minimum of two drives.
RAID 1 uses a process called mirroring to create a redundant copy of data on each drive that is a member of the array. Because RAID 1 duplicates data, a two-drive mirror has half the usable capacity of the same drives configured as RAID 0. So, for example, two 1 TB drives, configured as RAID 1, can store a total of only 1 TB. If one drive fails, however, you can still access your data from the remaining drive.
RAID 5 works similarly to RAID 0 striping, but it also creates an extra piece of data called parity that is mathematically derived from existing data on the other drives. This parity data, distributed evenly among all drives, allows for the recalculation of the original data if that data is not accessible, as in the case of a drive failure. Its resiliency is similar to that of RAID 1, in that the array can keep operating if one drive fails, while it offers some of the speed increase of RAID 0. RAID 5 requires at least three physical drives.
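To make the idea of parity concrete, here is a small sketch of how a lost block can be recalculated. It assumes parity is computed with bytewise XOR, which is the usual choice for RAID 5 but is not stated in the text above. The helper name xor_blocks and the sample byte values are hypothetical, and the sketch covers a single stripe only, so it does not show how parity rotates among the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread across three data drives (sample values, chosen arbitrarily).
data_blocks = [b"\x0f\xa0", b"\x33\x11", b"\xf0\x0c"]

# The parity block is derived from the data blocks in the same stripe.
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held data_blocks[1]: XORing the surviving
# data blocks with the parity block yields the missing block again.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

The rebuilt block equals the lost one because XORing a value with itself cancels out, leaving only the block that is absent from the survivors.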
Similar in many respects to level 5, RAID level 6 adds extra parity information, allowing up to two drives to fail without impacting system availability. RAID 6 requires a minimum of four physical drives.
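The capacity trade-offs of the four levels above can be compared with a short calculation. The sketch below assumes all drives in the array are the same size, that RAID 5 gives up one drive's worth of space to parity and RAID 6 two drives' worth, and that RAID 1 needs at least two drives; the function name usable_capacity_tb is hypothetical.

```python
# Minimum drive counts; the RAID 1 figure is an assumption of this sketch.
MINIMUM_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4}


def usable_capacity_tb(level: int, num_drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for an array of identical drives."""
    if level not in MINIMUM_DRIVES:
        raise ValueError(f"level {level} is not covered by this sketch")
    if num_drives < MINIMUM_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MINIMUM_DRIVES[level]} drives")
    if level == 0:
        return num_drives * drive_tb        # striping only, no redundancy
    if level == 1:
        return drive_tb                     # every drive holds a full copy
    if level == 5:
        return (num_drives - 1) * drive_tb  # one drive's worth of parity
    return (num_drives - 2) * drive_tb      # RAID 6: two drives' worth of parity


for level, drives in [(0, 2), (1, 2), (5, 3), (6, 4)]:
    print(f"RAID {level} with {drives} x 1 TB drives:",
          usable_capacity_tb(level, drives, 1.0), "TB usable")
```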
You can also combine RAID levels to obtain additional benefits. This technique, called nested RAID, merges physical drives with one RAID level and joins the resulting logical drives into another. Nested RAID levels are written as two- or three-digit numbers: the first digit is the “innermost” level that governs the physical drives, and the next digits denote how the logical drives are combined. The nested RAID levels listed below are frequently used examples, though several others are possible.
RAID level 10, also written as “1+0”, combines the techniques and benefits of levels 1 and 0. In RAID 10, you configure multiple RAID 1 mirrored disk sets, then join them into a single logical RAID 0 drive. For example, with four drives, you create two RAID 1 logical drives, each consisting of two physical drives. The two logical drives are then combined to create a single RAID 0 drive. RAID 10 has two main benefits: continued operation despite multiple drive failures, and fast I/O processing. Each mirrored RAID 1 set tolerates a single drive failure, but if both drives in one of the RAID 1 sets fail, the whole array fails.
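The failure rule for RAID 10 can be expressed as a simple check: the array keeps running as long as every mirrored set still has at least one working drive. The sketch below uses the four-drive example; the function name raid10_survives and the drive labels are hypothetical.

```python
def raid10_survives(mirror_sets, failed_drives):
    """True if every RAID 1 set still has at least one working drive."""
    return all(
        any(drive not in failed_drives for drive in mirror)
        for mirror in mirror_sets
    )


# Four drives arranged as two mirrored pairs, then striped together.
mirror_sets = [("a1", "a2"), ("b1", "b2")]

one_per_mirror = raid10_survives(mirror_sets, {"a1", "b2"})  # True
whole_mirror = raid10_survives(mirror_sets, {"a1", "a2"})    # False
```

Both scenarios involve two failed drives; what matters is whether the failures fall in the same mirrored set.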
Level 50 is a combination of levels 5 and 0. Here, several level 5 sets are elements of a single RAID 0 logical drive. Each of the level 5 sets can survive the failure of an individual drive. The total set can survive the failure of two or more drives, as long as none of the level 5 sets has more than one failed drive. For example, suppose you configure nine drives as three level 5 groups of three drives each. Each of these groups can continue despite a single-drive failure, so the whole set of nine can handle up to three drive failures, as long as no group loses more than one drive.
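The nine-drive example can be checked with a short sketch: the array survives only if no level 5 group contains more than one failed drive. The function name raid50_survives and the use of drive numbers 0 through 8 are hypothetical conveniences for the illustration.

```python
def raid50_survives(groups, failed_drives):
    """True if no RAID 5 group has more than one failed drive."""
    return all(
        sum(drive in failed_drives for drive in group) <= 1
        for group in groups
    )


# Nine drives configured as three level 5 groups of three drives each.
groups = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

spread_out = raid50_survives(groups, {0, 3, 6})  # True: one failure per group
same_group = raid50_survives(groups, {0, 1})     # False: two failures in one group
```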
RAID level 100, or 1+0+0, uses RAID 1 mirrored disks combined into two or more RAID 0 sets. The RAID 0 sets are themselves combined again with an outer RAID 0 into a single logical drive. Although expensive in terms of disk overhead, with the mirroring taking 50 percent of available space, it offers significant performance advantages over other techniques. RAID 100 is well suited to very large and highly active databases where speed and uptime are important. RAID 100 requires a minimum of eight drives: you begin by creating four RAID 1 drives, then merge the RAID 1 drives in pairs into two RAID 0 sets, and finally join the two RAID 0 sets with RAID 0 again into a single logical drive.
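The three construction steps for the minimum eight-drive RAID 100 layout can be modeled as nested tuples. This is only a sketch of the grouping, with hypothetical names (pair_up, build_raid100, disk0 through disk7); it does not configure any real storage.

```python
def pair_up(items):
    """Group a sequence into consecutive pairs."""
    return [tuple(items[i:i + 2]) for i in range(0, len(items), 2)]


def build_raid100(drives):
    """Nest eight drives as mirrors, then inner stripes, then an outer stripe."""
    if len(drives) != 8:
        raise ValueError("this sketch covers only the minimum eight-drive layout")
    mirrors = pair_up(drives)          # step 1: four RAID 1 sets of two drives
    inner_stripes = pair_up(mirrors)   # step 2: two RAID 0 sets of two mirrors
    return tuple(inner_stripes)        # step 3: outer RAID 0 joining both sets


drives = [f"disk{i}" for i in range(8)]
layout = build_raid100(drives)

# Each mirror contributes one drive's worth of usable space.
mirror_count = sum(len(inner) for inner in layout)
usable_fraction = mirror_count / len(drives)  # 0.5, matching the 50 percent overhead
```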
Other RAID Levels, Uncommon or Obsolete
RAID levels 2, 3, 4 and 7 also exist but are either not in common use or are obsolete. Level 2, for example, required complex synchronization of the drive mechanisms, increasing costs and leading to its virtual abandonment. Level 7 is a proprietary standard developed by Storage Computer Corporation, which has since gone out of business. Levels 3 and 4 are similar to level 5 but keep parity on a dedicated drive rather than distributing it, and they are less common.
Hardware and Software
Both hardware and software approaches exist for implementing RAID. Software methods rely largely on an operating system’s built-in disk management facilities, such as those offered by Microsoft’s Windows Server, Apple’s Mac OS X, and Linux. However, the software approach increases the server’s CPU workload, which can affect overall system performance.
The hardware approach to RAID uses an intelligent drive controller with its own CPU and memory. This approach places little to no extra burden on the main CPU but adds cost to the server hardware. When planning a RAID-based system, check your hardware and software to ensure they support the RAID level you want to implement.
RAID and SSD
RAID techniques work with either traditional hard disk drives (HDDs) or solid-state drives (SSDs). When RAID 0 is applied to multiple SSDs, I/O performance gains can be striking. The gains from level 5 on SSDs are less clear-cut, however. A RAID 5 array can be slower than a single SSD for write operations, as the drives are writing parity as well as user data.
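One way to see where the extra write cost comes from is to model a small update to a single block. The sketch below assumes XOR parity and a read-modify-write update, a common approach but not one described in this article; the names xor_bytes and small_write and the sample values are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Return the updated parity and the number of drive I/Os involved."""
    # Remove the old data's contribution from the parity, then add the new one.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    # Read old data, read old parity, write new data, write new parity.
    io_count = 4
    return new_parity, io_count


# One stripe on three data drives plus its parity block (sample values).
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0a\x0b"
parity = xor_bytes(xor_bytes(d0, d1), d2)

# Overwrite d1 with new contents and update the parity to match.
new_d1 = b"\xff\x00"
new_parity, io_count = small_write(d1, parity, new_d1)
```

Under these assumptions, a write that would cost one I/O on a single drive costs four on the array, which is one reason small writes can be slower than on a standalone SSD.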
RAID and Backups
RAID is not a substitute for regular data backups. Although most RAID levels reduce downtime and take the sting out of most drive failures, RAID cannot compensate for the loss of individual files, such as from human error or corruption, or for system-wide losses due to fire or other physical catastrophes. It pays to think of RAID not as a cure-all but as an additional tool for improving server reliability and availability.
RAID is a data storage technology that joins multiple physical drives (HDD or SSD) into a single unit. Depending on how RAID is implemented, it can offer markedly improved I/O speed, reduced downtime, or a combination of the two. Knowing what the various levels offer can help you to determine the implementation that works best for your data storage needs.