Many applications, particularly in business environments, have needs that no single hard disk can meet, regardless of its size, performance or quality level. Many businesses cannot afford to have their systems go down for even an hour in the event of a disk failure. They need large storage subsystems with capacities in the terabytes, and they want to insulate themselves from hardware failures to the greatest extent possible. Some people working with multimedia files need data transfer rates exceeding what current drives can deliver, without spending a fortune on specialty drives. These situations require that the traditional “one hard disk per system” model be set aside in favor of a technique called Redundant Arrays of Inexpensive Disks, or RAID. (“Inexpensive” is sometimes replaced with “Independent”, but the former is the term that was used when “RAID” was first coined by the researchers at the University of California at Berkeley, who first investigated the use of multiple-drive arrays in 1987. See D. Patterson, G. Gibson, and R. Katz, “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, Proceedings of ACM SIGMOD '88, pages 109-116, June 1988.)
The fundamental structure of RAID is the array. An array is a collection of drives that is configured, formatted and managed in a particular way. The number of drives in the array, and the way that data is split between them, is what determines the RAID level, the capacity of the array, and its overall performance and data protection characteristics.
An array appears to the operating system to be a single logical hard disk. RAID employs the technique of “striping”, which involves partitioning each drive's storage space into units ranging from a sector (512 bytes) up to several megabytes. The stripes of all the disks are interleaved and addressed in order.
In a single-user system where large records, such as medical or other scientific images, are stored, the stripes are typically set up to be relatively small (perhaps 64 kilobytes) so that a single record often spans all disks and can be accessed quickly by reading all disks at the same time.
In a multi-user system, better performance requires establishing a stripe wide enough to hold the typical or maximum size record. This allows overlapped disk I/O (Input/Output) across drives.
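The interleaved addressing described above can be sketched as a small mapping function. This is only an illustration under stated assumptions: the function name `locate_block` and its parameters are hypothetical, stripe units are assumed to be dealt to the disks in simple round-robin order, and no particular controller is claimed to use exactly this layout.

```python
def locate_block(logical_block, num_disks, stripe_unit_blocks):
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes stripe units are interleaved round-robin across the disks.
    All names here are hypothetical and chosen for illustration.
    """
    stripe_unit = logical_block // stripe_unit_blocks    # which stripe unit, counted across the array
    offset_in_unit = logical_block % stripe_unit_blocks  # position inside that unit
    disk = stripe_unit % num_disks                       # units are dealt to disks in turn
    row = stripe_unit // num_disks                       # full rows of units that precede it
    return disk, row * stripe_unit_blocks + offset_in_unit


# A 64-kilobyte stripe unit holds 128 sectors of 512 bytes.
print(locate_block(130, num_disks=4, stripe_unit_blocks=128))  # (1, 2)
```

With a small stripe unit, consecutive units of one large record land on different disks and can be read in parallel. With a stripe unit at least as large as a typical record, each record tends to sit on one disk, so requests from different users can proceed on different drives at the same time.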
Most modern, mid-range to high-end disk storage systems are arranged as RAID configurations.
One description of RAID types can be found at http://searchstorage.techtarget.com/sDefinition/0,,sid5_gci214332,00.html.
A number of RAID levels are known. JBOD stands for Just a Bunch of Drives. The controller treats one or more disks or unused space on a disk as a single array. JBOD provides the ability to concatenate storage from various drives regardless of the size of the space on those drives. JBOD is useful in scavenging space on drives unused by other arrays. JBOD does not provide any performance or data redundancy benefits.
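The concatenation performed by JBOD can be contrasted with striping using a short sketch. The function name `locate_in_jbod` and the list of extent sizes are hypothetical, and the sketch assumes that the extents are simply laid end to end in the order given.

```python
def locate_in_jbod(logical_block, extent_sizes):
    """Map a logical block to (extent index, block offset inside that extent).

    extent_sizes lists the size, in blocks, of each piece of drive space
    that was concatenated. The sizes may differ. Names are hypothetical.
    """
    for index, size in enumerate(extent_sizes):
        if logical_block < size:
            return index, logical_block
        logical_block -= size  # skip past this extent and try the next one
    raise ValueError("logical block lies beyond the end of the JBOD array")


# Three leftover regions of 100, 50 and 200 blocks form one 350-block array.
print(locate_in_jbod(149, [100, 50, 200]))  # (1, 49)
```

Because each extent is filled before the next one is used, a sequential transfer is normally served by one drive at a time, which is consistent with JBOD offering no performance benefit.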
RAID0, or striping, provides the highest performance but no data redundancy. Data in the array is striped (i.e. distributed) across several physical drives. RAID0 arrays are useful for holding information such as the operating system paging file where performance is extremely important but redundancy is not.
RAID1, or mirroring, mirrors the data stored in one physical drive to another. RAID1 is useful when there are only a small number of drives available and data integrity is more important than storage capacity.
RAID1n, or n-way mirroring, mirrors the data stored in one hard drive to several hard drives. This array type provides superior data redundancy because there are three or more copies of the data, and it is useful when creating backup copies of an array. It is, however, expensive in both performance and the amount of disk space needed to create the array.
RAID10, also known as RAID(1+0) or striped mirror sets (and sometimes loosely labeled RAID(0+1)), combines mirrors and stripe sets. RAID10 tolerates multiple drive failures, up to one failure in each mirror that has been striped. This array type offers better performance than a simple mirror because of the extra drives. RAID10 requires twice the disk space of RAID0 in order to offer redundancy.
RAID10n stripes multiple n-way mirror sets. RAID10n allows multiple drive failures per mirror set, up to n−1 failures in each mirror set that has been striped, where n is the number of drives in each mirror set. This array type is useful in creating exact copies of an array's data using the split command. This array type offers better random read performance than a RAID10 array, but uses more disk space.
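The failure tolerance of striped mirror sets can be expressed as a simple rule: the array remains readable as long as every mirror set retains at least one working drive. The sketch below illustrates that rule only. The function name `striped_mirrors_survive` and the drive labels are hypothetical.

```python
def striped_mirrors_survive(mirror_sets, failed_drives):
    """Return True if every mirror set still has at least one working drive.

    mirror_sets is a list of lists of drive labels, and failed_drives is
    any iterable of labels. All names are hypothetical.
    """
    failed = set(failed_drives)
    return all(
        any(drive not in failed for drive in mirror_set)
        for mirror_set in mirror_sets
    )


raid10 = [["a", "b"], ["c", "d"]]             # two 2-way mirrors, striped
print(striped_mirrors_survive(raid10, ["a", "c"]))  # True: one failure per mirror
print(striped_mirrors_survive(raid10, ["a", "b"]))  # False: a whole mirror is lost

raid10n = [["a", "b", "c"], ["d", "e", "f"]]  # two 3-way mirrors, striped
print(striped_mirrors_survive(raid10n, ["a", "b", "d", "e"]))  # True: n-1 failures per set
```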
RAID5, also known as a stripe with parity, stripes data as well as parity across all drives in the array. Parity information is interspersed across the drive array. In the event of a failure, the controller can rebuild the lost data of the failed drive from the surviving drives. This array type offers exceptional read performance as well as redundancy. In general, write performance is not an issue because operating systems tend to perform many more reads than writes. This array type requires only one extra disk to offer redundancy. For most systems with four or more disks, it is usually the appropriate array type.
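The rebuild step can be illustrated with bytewise exclusive-OR (XOR) parity, which is the usual way RAID5 parity is computed. The sketch below is a simplified model of a single stripe held in memory, not a description of any particular controller. The helper name `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails. XOR-ing the surviving data
# blocks with the parity block reproduces the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

The same property explains why only one extra disk is needed: a single parity block per stripe is enough to recover any one missing block of that stripe.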
RAID50 is also known as striped RAID5 sets. Parity information is interspersed across each RAID5 set in the array. This array type offers good read performance as well as redundancy. A 6-drive array will provide the user with 2 striped 3-drive RAID5 sets. Generally, RAID50 is useful in very large arrays, that is, arrays with 10 or more drives.
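The disk-space costs mentioned for the array types above can be compared with a small calculation. The sketch assumes that all drives are the same size and counts usable capacity in whole drives. The function name `usable_drives` and its parameters `copies` and `raid5_sets` are hypothetical.

```python
def usable_drives(level, total_drives, copies=2, raid5_sets=2):
    """Return the usable capacity of an array, measured in drives.

    Assumes equally sized drives. 'copies' is the number of drives in each
    n-way mirror set and 'raid5_sets' is the number of striped RAID5 sets.
    All names are hypothetical.
    """
    if level == "RAID0":
        return total_drives               # no redundancy, all space is usable
    if level in ("RAID1", "RAID1n"):
        return 1                          # every drive holds the same data
    if level == "RAID10":
        return total_drives // 2          # each mirror stores its data twice
    if level == "RAID10n":
        return total_drives // copies     # each mirror set stores 'copies' copies
    if level == "RAID5":
        return total_drives - 1           # one drive's worth of parity
    if level == "RAID50":
        return total_drives - raid5_sets  # one drive's worth of parity per set
    raise ValueError(f"unknown RAID level: {level}")


# The 6-drive RAID50 example: two 3-drive RAID5 sets leave 4 drives of space.
print(usable_drives("RAID50", 6, raid5_sets=2))  # 4
```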
Thus a RAID, or Redundant Array of Independent Disks, is simply several disks that are grouped together in various organizations to improve either the performance or the reliability of a computer's storage system. These disks are grouped and organized by a RAID controller.
Each conventional RAID controller has its own way of laying out the disks and storing the configuration information. A system controlled by a common operating system, on the other hand, has a known format. When users add a RAID controller to their system, the most important task is to migrate the existing data disks to the RAID-controlled system. The configuration format that a common operating system uses to control and communicate with a disk in the system is referred to as “metadata”. This OS metadata differs from the RAID controller's own configuration format, which is also referred to as “metadata”. Because the two metadata formats differ, there is a conflict in recognizing them. Hence, the common method of migrating existing user data is to back it up and then restore it. This, however, requires system downtime during which the user has no access to the data (sometimes up to a day, depending on the volume of data being migrated).
What is required is a method that obviates the need to back up and restore existing data and eliminates the system downtime needed to migrate existing user data to a RAID system.