As microprocessor performance and semiconductor memory technology advance, there is a need for improved data storage systems with comparable performance enhancements. Additionally, as the performance of data storage systems improves, there is a corresponding need for improved reliability of the stored data.
In 1988, in a paper entitled “A Case for Redundant Arrays of Inexpensive Disks (RAID),” a research group at the University of California at Berkeley presented a storage system utilizing a redundant array of storage disk drives that would not only improve performance (e.g., faster data transfer and data I/O), but would also provide higher reliability at a lower cost. RAID improves performance by disk striping, which interleaves bytes or groups of bytes across the array of disks, so that more than one disk can be reading and writing simultaneously. Fault tolerance is achieved by mirroring or parity. Mirroring is the complete duplication of the data on two drives. Parity data is generated by taking the contents of all storage units subject to failure and “Exclusive OR'ing” (XOR'ing) them. The resulting parity data is stored with the original data, and may be used to recover data lost due to a disk failure.
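The parity mechanism described above can be illustrated with a minimal sketch. The function names `xor_parity` and `recover_block` and the sample byte values are hypothetical and chosen only for illustration; the sketch assumes equally sized blocks and the loss of exactly one block.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Three data blocks, as if stored on three drives (illustrative values).
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xff\x00\x0f"]
parity = xor_parity(data)

# Simulate losing the second drive and rebuilding its block.
recovered = recover_block([data[0], data[2]], parity)
```

Because XOR is its own inverse, XOR'ing the surviving blocks with the parity block yields the missing block, which is why a single parity block can protect against the loss of any one drive.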
A host, such as a computer system, typically views a RAID as a single disk although RAID includes an array of disks. A RAID controller may be a hardware and/or software tool for providing an interface between the host and the array of disks. The RAID controller manages the array of disks for storage and retrieval and can view the disks of the RAID separately. The manner in which the controller manages the disks is defined by a particular “RAID level.” The RAID level defines how the data is distributed across the disk drives and how error correction is accomplished. For instance, RAID level 0 provides disk striping only, which improves performance, but provides no reliability or fault tolerance. RAID level 5 (RAID 5), on the other hand, is characterized by striping data across three or more drives and generating parity data which is distributed across all disks. Thus, RAID 5 offers high performance as well as high reliability.
As stated above, RAID storage systems improve performance by striping data over all of the data disks in the array. The portion of a stripe of data in one disk is known as a “stripe unit.” Thus, the size of a “stripe unit” will be the size of the stripe divided by the number of disks in the array. The “stripe unit” is further broken down into a plurality of “sectors,” where all sectors are an equivalent predefined size.
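The relationship between stripe, stripe unit, and sector can be worked through in a short sketch. The function names are hypothetical, and the 512-byte sector size and 256 KiB stripe size are assumptions made only for the example; the description above does not specify either value.

```python
def stripe_unit_size(stripe_size, num_disks):
    """Size of the portion of one stripe held by a single disk, in bytes."""
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    if stripe_size % num_disks:
        raise ValueError("stripe size must divide evenly across the disks")
    return stripe_size // num_disks


def sectors_per_stripe_unit(unit_size, sector_size=512):
    """Number of fixed-size sectors in one stripe unit (sector size assumed)."""
    if unit_size % sector_size:
        raise ValueError("stripe unit must be a whole number of sectors")
    return unit_size // sector_size


# Assumed example: a 256 KiB stripe spread over 4 disks.
unit = stripe_unit_size(256 * 1024, 4)   # bytes per stripe unit
sectors = sectors_per_stripe_unit(unit)  # sectors per stripe unit
```

With these assumed values, each stripe unit is 64 KiB and contains 128 sectors.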
Disk arrays are preferably configured to include logical drives that divide the physical drives into logical components, which may be viewed by the host as separate drives. In other words, from the perspective of the host, the logical drive is a single storage unit, while in reality, the logical drive represents an array of physical drives. The logical drive is divided into a plurality of storage blocks, each block being identified by a logical address. When the host issues commands, e.g. READ or WRITE, to its logical drive, the commands will designate the logical address of the data, and not the physical drive.
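One way a controller might translate a logical address into a physical location is sketched below for simple striping without parity. The function name `locate_block`, the round-robin placement, and the parameter values are assumptions for illustration only and are not taken from the description above.

```python
def locate_block(logical_block, num_disks, blocks_per_stripe_unit):
    """Map a logical block address to (disk index, block offset on that disk).

    Assumes plain round-robin striping: consecutive stripe units are placed
    on consecutive disks, wrapping around after the last disk.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    # Which stripe unit the block falls in, and its position inside that unit.
    unit_index, offset_in_unit = divmod(logical_block, blocks_per_stripe_unit)
    # Which stripe (row) and which disk (column) hold that stripe unit.
    stripe, disk = divmod(unit_index, num_disks)
    return disk, stripe * blocks_per_stripe_unit + offset_in_unit


# Assumed example: 4 disks, 8 blocks per stripe unit.
first = locate_block(0, 4, 8)    # first block of the first disk
second = locate_block(8, 4, 8)   # first block of the second disk
later = locate_block(35, 4, 8)   # wraps around to the first disk again
```

The host sees only the logical block number; the disk index and offset are computed by the controller and remain hidden from the host.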
Each logical drive includes a cross section of each of the physical drives. So, for example, FIG. 4 shows one logical drive 160 spanning across four physical drives 152, 154, 156, 158 of a RAID 5 array. The host assigned to that logical drive 160 will have access to data stored in stripes 164, 166, 168 in the logical drive 160. In addition, each logical drive is assigned a RAID level. Thus, as is seen in FIG. 4, the data in each stripe 164, 166, 168 in the logical drive 160 of the RAID 5 array is interleaved across the cross sections of three of the four physical drives and parity data is stored in the cross section of the fourth drive.
The discussion above generally describes a single-tier, or non-hierarchical, RAID system. FIG. 1A is a schematic diagram illustrating a non-hierarchical RAID 0 system 10. In such a system, the logical drive is hard-coded to assume that the physical drives 30 are the lowest component making up the RAID 0 array 20, and that the highest level is the host operating system 11. A hierarchical RAID system would conceptually comprise multiple tiers of different RAID levels. For example, in FIG. 1B, a RAID 50 logical drive 10′ would comprise a RAID 0 array 20′ of m RAID 5 arrays 40. The two-level relationship of hierarchical RAID is hard-coded to assume that the top level is the host 11′, which leads to the RAID 0 array 20′, which in turn breaks into m sub-RAID levels 40, each of which in turn breaks into n physical drives 30′.
A hierarchical approach to a RAID array is advantageous because the storage system can be expanded to a large number of physical storage devices, such as disk drives, while maintaining a high performance level. For instance, as shown in FIG. 2A, in a non-hierarchical RAID 5 array having 45 drives, data for a stripe is distributed across 44 drives, and the remaining drive receives the parity data for the stripe. Data from other stripes are similarly distributed, except that the drive receiving the parity data rotates, such that all parity data is not contained on a single drive. Thus, the stripe data and the parity data are distributed across all drives. If one drive fails, the system must read data from the other 44 drives in order to rebuild the degraded drive. Thus, as the number of drives increases, the ability to rebuild a failed drive efficiently deteriorates.
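The rotation of the parity drive from stripe to stripe can be sketched as follows. The specific rotation rule used here (parity starting on the last drive and moving one drive to the left with each stripe) is an assumption for illustration; other rotation schemes exist, and the description above does not specify one.

```python
def raid5_stripe_layout(stripe, num_disks):
    """Return a list marking each drive as data ("D") or parity ("P").

    Assumed rotation: stripe 0 keeps parity on the last drive, and each
    subsequent stripe moves parity one drive to the left, wrapping around.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 requires three or more drives")
    parity_disk = (num_disks - 1 - stripe) % num_disks
    return ["P" if disk == parity_disk else "D" for disk in range(num_disks)]


# Print the layout of the first five stripes of a four-drive array.
for stripe in range(5):
    print(stripe, " ".join(raid5_stripe_layout(stripe, 4)))
```

Over any run of consecutive stripes equal in length to the number of drives, each drive holds parity exactly once, so no single drive carries all of the parity data.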
In FIG. 2B, a hierarchical RAID 50 array is illustrated. Here, the RAID 5 array having 45 drives of FIG. 2A can be represented as a hierarchical RAID 50 array comprising a RAID 0 array 50 of 3 RAID 5 arrays (60a, 60b, 60c), each having a set of 15 drives (70a, 70b, 70c). The same 45 drives are available, but now, rebuilding a failed drive, for example drive 2 in set 70a, entails reading only from the other 14 drives in the set 70a, as opposed to all 44 remaining drives. Thus, rebuild performance is improved.
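The rebuild arithmetic for the flat and hierarchical arrangements can be checked with a short sketch. The function name `drives_read_for_rebuild` is hypothetical; the sketch assumes that the drives are split evenly among the RAID 5 sets and that a rebuild reads every surviving drive in the affected set.

```python
def drives_read_for_rebuild(total_drives, num_sets=1):
    """Number of drives read to rebuild one failed drive.

    num_sets=1 models a single non-hierarchical RAID 5 array; a larger value
    models a RAID 0 array striped over that many equally sized RAID 5 sets.
    """
    if num_sets <= 0 or total_drives % num_sets:
        raise ValueError("drives must divide evenly among the sets")
    drives_per_set = total_drives // num_sets
    if drives_per_set < 3:
        raise ValueError("each RAID 5 set requires three or more drives")
    # Only the surviving members of the failed drive's own set are read.
    return drives_per_set - 1


flat = drives_read_for_rebuild(45)            # single 45-drive RAID 5 array
hierarchical = drives_read_for_rebuild(45, 3)  # RAID 50: 3 sets of 15 drives
```

For the 45-drive example, the flat array reads 44 drives during a rebuild, while the three-set RAID 50 arrangement reads 14.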
While a hierarchical approach to RAID arrays is desirable, implementation of such a system is tedious and time consuming. For example, as stated above, to implement a RAID 50 array, the developer is required to hard-code every operation, where the input is presumed to come from a host and the output is presumed to be sent directly to a disk. The code would change for every hierarchical configuration. Thus, the hard-coded implementation of a hierarchical RAID 50 array would be different from that of a RAID 51 array. If the hierarchy went beyond two levels, the code would become much more complicated and development costs would be prohibitive. Testing and debugging would also be cumbersome.
Accordingly, a need exists for a more efficient and less complex way of implementing a hierarchical RAID system with minimal or no additional development effort, i.e., no new code. The present invention addresses such a need.