RAID

This is an old revision of this page, as edited by Iluvcapra (talk | contribs) at 05:33, 6 November 2003 (Added RAID 6). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The goal of a redundant array of independent disks (RAID), originally known as a redundant array of inexpensive disks, is to provide large, reliable virtual disks that can be much larger than commonly available disk drives.

There are 7 official levels: RAID 0 to RAID 6. RAID levels can also be combined; the most common combinations are RAID 10 and RAID 0+1.

RAID arrays are usually implemented with identically-sized disk drives.

Hardware vs. Software

Any of the RAID levels listed below can be implemented in hardware or software.

With a software implementation, the operating system itself manages the disks of the array through the normal drive controller (IDE, SCSI, FC). This option can be slow, but it does not require the purchase of extra hardware.

A hardware implementation of RAID requires (at a minimum) a special-purpose RAID controller card. This controller handles the management of the disks and performs the parity calculations needed by parity-based levels such as RAID 4 and 5. This option tends to provide better performance, and makes operating system support easier.

Hardware implementations also typically support hot swap, allowing failed drives to be replaced while the system is running.

RAID levels

RAID 0: Striped Disk Array without Fault Tolerance (Nonredundant)

RAID Level 0 requires a minimum of 2 drives to implement.

Characteristics and Advantages

RAID 0 implements a striped disk array: the data is broken down into blocks, and each block is written to a separate disk drive. I/O performance is greatly improved by spreading the I/O load across many channels and drives.

Best performance is achieved when data is striped across multiple controllers with only one drive per controller. No parity calculation overhead is involved. Very simple design, easy to implement.
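As a minimal sketch of the striping described above, the following maps a logical block number to a drive and a position on that drive. The round-robin placement and the function name locate_block are assumptions chosen for illustration; real implementations usually stripe in larger, configurable chunks.

```python
from typing import Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes simple round-robin striping with one block per stripe unit.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 drives")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    disk = logical_block % num_disks      # consecutive blocks go to consecutive disks
    offset = logical_block // num_disks   # how far down each disk the stripe sits
    return disk, offset


# Six consecutive logical blocks spread over three drives.
layout = [locate_block(b, 3) for b in range(6)]
print(layout)  # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Because consecutive blocks land on different drives, a large sequential transfer can keep all drives busy at once, which is where the bandwidth gain comes from.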

Disadvantages

Not a "true" RAID because it is not fault-tolerant. The failure of just one drive will result in all data in an array being lost. Should never be used in mission-critical environments that involve modification of data. (Some applications work with control information stored on a RAID 1 or 5 filesystem and multimedia data stored on RAID 0 and backed up to tape or optical media.)

Recommended Applications

Video Production and Editing
Image Editing
Pre-Press Applications
Any application requiring high bandwidth

RAID 1: Mirroring and Duplexing (Mirrored)

For highest performance, the controller must be able to perform two concurrent separate reads per mirrored pair or two duplicate writes per mirrored pair.

RAID Level 1 requires a minimum of 2 drives to implement.

Characteristics

One write or two reads possible per mirrored pair. Twice the read transaction rate of single disks, same write transaction rate as single disks. 100% redundancy of data means no rebuild is necessary in case of a disk failure, just a copy to the replacement disk.

Transfer rate per block is equal to that of a single disk. Under certain circumstances, RAID 1 can sustain multiple simultaneous drive failures.

Simplest RAID storage subsystem design.

Advantages

Since a disk of a mirrored pair has all the information, it can potentially be used without the RAID hardware/software.

Disadvantages

Highest disk overhead of all RAID types (100%), which makes it inefficient in its use of disk space.
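The 100% figure can be read as redundant capacity divided by usable capacity. The sketch below computes that ratio for a few levels, assuming equal-sized drives; the function name redundancy_overhead and the treatment of RAID 1 as an n-way mirror are assumptions made for this example.

```python
from fractions import Fraction


def redundancy_overhead(level: int, num_disks: int) -> Fraction:
    """Redundant capacity / usable capacity, in units of one drive."""
    if level == 0:
        usable, redundant = num_disks, 0        # striping only, no redundancy
    elif level == 1:
        usable, redundant = 1, num_disks - 1    # every drive holds the same data
    elif level == 5:
        usable, redundant = num_disks - 1, 1    # one drive's worth of parity
    else:
        raise ValueError("level not covered by this sketch")
    return Fraction(redundant, usable)


for level, disks in [(0, 2), (1, 2), (5, 3), (5, 5)]:
    ratio = redundancy_overhead(level, disks)
    print(f"RAID {level} with {disks} drives: {float(ratio):.0%} overhead")
```

With two drives RAID 1 gives a ratio of 1 (100%), while a five-drive RAID 5 array gives 1/4 (25%).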

Recommended Applications

Accounting
Payroll
Financial
Any application requiring very high availability

RAID 2: Error-Correcting Coding

The redundancy scheme in RAID Level 2 is a Hamming code, where the striping unit is a single bit. Striping at the bit level implies that, in a disk array with D data disks, the smallest unit of transfer for a read is a set of D blocks.

RAID level 2 is rarely implemented.
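To illustrate the kind of Hamming code referred to above, here is a small sketch using the classic Hamming(7,4) code, which protects 4 data bits with 3 check bits and can correct any single flipped bit. The choice of the (7,4) variant is an assumption for illustration; in a RAID 2 array each of the 7 bit positions would live on its own drive.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode bits [d1, d2, d3, d4] as positions 1..7: p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code: List[int]) -> Tuple[List[int], int]:
    """Return (corrected codeword, 1-based error position or 0 if none)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1
    return c, position


codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                        # flip the bit at position 5
repaired, where = hamming74_correct(damaged)
print(codeword, where, repaired == codeword)
```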

RAID 3: Bit-Interleaved Parity (Richard M. Price Parity)

RAID level 3 has a single check disk and only processes one I/O at a time.

RAID level 3 is rarely implemented.

RAID 4: Dedicated parity drive (Block-Interleaved Parity)

Characteristics

Disks are striped, as in RAID 0. Parity information for the stripe is calculated, and stored on a parity disk. If one of the data disks fails, the information is re-built on a spare disk using the parity information. If the parity disk fails, the parity information is recalculated on a spare disk.
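The parity calculation and rebuild described above can be sketched with byte-wise XOR, which is the usual choice for single-parity arrays. The block contents and the helper name xor_blocks are made up for this example.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data disks plus one dedicated parity disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Data disk 1 fails: XOR the surviving data blocks with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True

# Parity disk fails instead: simply recompute parity from the data disks.
recomputed_parity = xor_blocks(data)
print(recomputed_parity == parity)  # True
```

The same XOR serves both cases because any one block in the stripe equals the XOR of all the others.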

Disadvantages

The parity drive can be a bottleneck during write operations.

RAID 5: Independent Data disks with distributed parity blocks (Block Interleaved Distributed Parity)

Each entire data block is written on a data disk; parity for blocks in the same rank is generated on writes, recorded in a distributed location, and checked on reads.

RAID Level 5 requires a minimum of 3 drives to implement.
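A minimal sketch of how parity can be distributed: the parity block moves to a different drive on each successive stripe, so no single drive absorbs all parity writes. The particular rotation used here (parity starting on the last drive and moving left) and the name raid5_stripe_layout are assumptions; real implementations offer several layouts.

```python
from typing import List


def raid5_stripe_layout(stripe: int, num_disks: int) -> List[str]:
    """Return per-disk labels for one stripe: 'P' for parity, 'D<n>' for data."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    parity_disk = (num_disks - 1 - stripe) % num_disks   # rotate parity each stripe
    layout = []
    data_index = 0
    for disk in range(num_disks):
        if disk == parity_disk:
            layout.append("P")
        else:
            layout.append(f"D{data_index}")
            data_index += 1
    return layout


for stripe in range(4):
    print(stripe, raid5_stripe_layout(stripe, 4))
# 0 ['D0', 'D1', 'D2', 'P']
# 1 ['D0', 'D1', 'P', 'D2']
# 2 ['D0', 'P', 'D1', 'D2']
# 3 ['P', 'D0', 'D1', 'D2']
```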

Characteristics and Advantages

Highest read data transaction rate. Medium to poor write data transaction rate, especially when the host CPU performs software parity checking. Low ratio of ECC (Parity) disks to data disks means high efficiency. Good aggregate transfer rate.

Disadvantages

Disk failure has a medium impact on throughput. Most complex controller design. Difficult to rebuild in the event of a disk failure (as compared to RAID level 1). Individual block data transfer rate same as single disk. High overhead for small writes. To change 1 byte in a file, the old data block and the old parity block (or, in a simpler implementation, the entire stripe) must be read, the byte changed, the parity information recalculated, and both the data and the parity rewritten. However, the fact that file systems tend to address disks naturally in clusters partially hides this effect.
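The small-write cost can be sketched as a read-modify-write: with XOR parity, the new parity can be derived from the old data block, the new data block, and the old parity block, without touching the other data disks. The helper names below are hypothetical, and XOR parity is assumed.

```python
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks: Sequence[bytes]) -> bytes:
    """Parity computed from scratch over every data block in the stripe."""
    result = bytes(len(blocks[0]))
    for block in blocks:
        result = xor_bytes(result, block)
    return result


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = full_parity(stripe)

new_block = b"BxBB"   # one byte of the middle block changes
updated = small_write_parity(stripe[1], new_block, parity)

# The shortcut agrees with recomputing parity over the whole stripe.
print(updated == full_parity([stripe[0], new_block, stripe[2]]))  # True
```

Even with this shortcut, one logical write costs two reads and two writes, which is why small writes remain expensive on RAID 5.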

Recommended Applications

File and Application servers
Database servers
WWW, E-mail, and News servers
Intranet servers
Most versatile RAID level

RAID 6: Independent Data Disks with Double Parity

Each entire data block is written to a data disk; parity is generated and written to two distributed parity strips on two separate drives.

RAID level 6 requires a minimum of three drives. With two drives' worth of parity, four drives are required to match RAID 1 space efficiency (50%), and five to exceed it.

Characteristics

The most redundant parity array: very inefficient with a low drive count, but much more fault tolerant. Drives can be organized into orthogonal matrices, where rows of drives form parity groups, similar to RAID 5, while the columns also keep consistent parity data with each other. If a single drive fails, either its row or column parity may be used to rebuild it. Several drives on any one column or row may fail before the array is corrupt. Any group of non-coincident drives may fail before the array is corrupt.
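The row and column arrangement described above can be sketched with a small grid in which each cell stands for one drive's block (a single byte here, for brevity). XOR parity, the 3 by 3 grid, and the helper names are assumptions for illustration only.

```python
from typing import Iterable, List


def xor_all(values: Iterable[int]) -> int:
    """XOR together a sequence of one-byte blocks."""
    result = 0
    for value in values:
        result ^= value
    return result


def rebuild_from_row(grid: List[list], row_parity: List[int], r: int, c: int) -> int:
    """Rebuild cell (r, c) from the other cells in its row plus the row parity."""
    others = [grid[r][j] for j in range(len(grid[r])) if j != c]
    return xor_all(others) ^ row_parity[r]


def rebuild_from_column(grid: List[list], col_parity: List[int], r: int, c: int) -> int:
    """Rebuild cell (r, c) from the other cells in its column plus the column parity."""
    others = [grid[i][c] for i in range(len(grid)) if i != r]
    return xor_all(others) ^ col_parity[c]


original = [[0x11, 0x22, 0x33],
            [0x44, 0x55, 0x66],
            [0x77, 0x88, 0x99]]
row_parity = [xor_all(row) for row in original]
col_parity = [xor_all(col) for col in zip(*original)]

# Two drives in the same row fail: row parity alone cannot recover them,
# but each failed drive is the only failure in its column.
damaged = [list(row) for row in original]
damaged[0][0] = None
damaged[0][1] = None
damaged[0][0] = rebuild_from_column(damaged, col_parity, 0, 0)
damaged[0][1] = rebuild_from_column(damaged, col_parity, 0, 1)
print(damaged == original)  # True
```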

RAID 10: A Stripe of Mirrors

Multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over these. This is not one of the original RAID levels, but a combination of RAID 1 and 0, sometimes also called RAID 1+0.

Advantages

Can potentially handle multiple simultaneous disk failures, as long as at least one disk of each mirrored pair is working.

Same advantages and disadvantages as RAID 1.

RAID 0+1: A Mirror of Stripes

Two RAID 0 stripes are created, and a RAID 1 mirror is created over them. This is also not one of the original RAID levels.

Disadvantages

Not as robust as RAID 10. Cannot tolerate two simultaneous disk failures unless both are in the same stripe.
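The difference in failure tolerance between the two combined layouts can be checked by enumeration. The sketch below assumes four drives grouped as (0, 1) and (2, 3), treated as mirrored pairs for RAID 10 and as stripes for RAID 0+1; the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Set, Tuple

Groups = Sequence[Tuple[int, ...]]


def raid10_survives(mirrors: Groups, failed: Set[int]) -> bool:
    """A stripe of mirrors survives if every mirror has a working drive."""
    return all(any(d not in failed for d in pair) for pair in mirrors)


def raid01_survives(stripes: Groups, failed: Set[int]) -> bool:
    """A mirror of stripes survives if at least one stripe is fully intact."""
    return any(all(d not in failed for d in stripe) for stripe in stripes)


groups = [(0, 1), (2, 3)]
two_disk_failures = [set(pair) for pair in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(groups, f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(groups, f) for f in two_disk_failures)
print(f"RAID 10 survives {raid10_ok} of {len(two_disk_failures)} two-drive failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_disk_failures)} two-drive failures")
# RAID 10 survives 4 of 6 two-drive failures
# RAID 0+1 survives 2 of 6 two-drive failures
```

Under these assumptions both layouts survive any single drive failure, but RAID 10 survives twice as many of the possible double failures.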

History

RAID was first proposed in 1988 by David A. Patterson, Garth A. Gibson and Randy H. Katz in the paper, "A Case for Redundant Arrays of Inexpensive Disks (RAID)". This was published in the SIGMOD Conference 1988: pp 109-116. The term "RAID" started with this paper.

It was particularly ground-breaking work in that the concepts are "obvious". This paper spawned the entire disk array industry.

Also See

Storage Area Network