1. Field of the Invention
This invention relates in general to storage systems, and more particularly to a method, apparatus and program storage device for providing intelligent rebuild order selection in a storage array.
2. Description of Related Art
Computer systems are constantly improving in terms of speed, reliability, and processing capability. As a result, computers are able to handle more complex and sophisticated applications. As computers improve, performance demands placed on mass storage and input/output (I/O) devices increase. There is a continuing need to design mass storage systems that keep pace in terms of performance with evolving computer systems.
A disk array data storage system has multiple storage disk drive devices, which are arranged and coordinated to form a single mass storage system. There are three primary design criteria for mass storage systems: cost, performance, and availability. It is most desirable to produce memory devices that have a low cost per megabyte, high input/output performance, and high data availability. “Availability” is the ability to access data stored in the storage system and the ability to ensure continued operation in the event of some failure. Typically, data availability is provided through the use of redundancy wherein data, or relationships among data, are stored in multiple locations.
There are two common methods of storing redundant data. According to the first or “mirror” method, data is duplicated and stored in two separate areas of the storage system. For example, in a disk array, the identical data is provided on two separate disks in the disk array. The mirror method has the advantages of high performance and high data availability due to the duplex storing technique. However, the mirror method is also relatively expensive as it effectively doubles the cost of storing data.
In the second or “parity” method, a portion of the storage area is used to store redundant data, but the size of the redundant storage area is less than the storage space used to store the original data. For example, in a disk array having five disks, four disks might be used to store data with the fifth disk being dedicated to storing redundant data. The parity method is advantageous because it is less costly than the mirror method, but it also has lower performance and availability characteristics in comparison to the mirror method.
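The five-disk example above can be illustrated with a minimal sketch, assuming the redundant data is a byte-wise XOR of the four data blocks (a common parity choice, though the text does not name one). The block contents and the helper name `xor_blocks` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four data blocks, one per data disk; the contents are made up for illustration.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]

# The fifth disk holds the parity block computed over the four data blocks.
parity = xor_blocks(data)

# Simulate losing one data disk and rebuild its block from the survivors
# plus the parity block.
lost_index = 2
survivors = [blk for i, blk in enumerate(data) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[lost_index])
```

The redundant area here is one quarter the size of the data it protects, which is where the cost advantage over mirroring comes from; the price is that a rebuild must read every surviving disk.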
In a virtual storage system, both the mirror and the parity methods have the same usage costs in terms of disk space overhead as they do in a non-virtual storage system, but the granularity is such that each physical disk drive in the system can have one or more RAID arrays striped on it, and can hold data stored under both the mirror and parity methods simultaneously. As such, a single physical disk drive may have data segments of some virtual disks on it as well as parity segments of other virtual disks and both data and mirrored segments of still other virtual disks.
These two redundant storage methods provide automated recovery from many common failures within the storage subsystem itself due to the use of data redundancy, error codes, and so-called “hot spares” (extra storage modules which may be activated to replace a failed, previously active storage module). These subsystems are typically referred to as redundant arrays of inexpensive (or independent) disks (or more commonly by the acronym RAID). The 1987 publication by David A. Patterson, et al., of the University of California at Berkeley, entitled A Case for Redundant Arrays of Inexpensive Disks (RAID), reviews the fundamental concepts of RAID technology.
There are five “levels” of standard geometries defined in the Patterson publication. The simplest array, a RAID 1 system, comprises one or more disks for storing data and a number of additional “mirror” disks for storing copies of the information written to the data disks. The remaining RAID levels, identified as RAID 2, 3, 4 and 5 systems, segment the data into portions for storage across several data disks. One or more additional disks are utilized to store error check or parity information. Additional RAID levels have since been developed. For example, RAID 6 is RAID 5 with double parity (or “P+Q Redundancy”). Thus, RAID 6 is an extension of RAID 5 that uses a second independent distributed parity scheme. Data is striped on a block level across a set of drives, and then a second set of parity is calculated and written across all of the drives. This configuration provides extremely high fault tolerance and can sustain two simultaneous drive failures, but it requires an “n+2” number of drives and a very complicated controller design. RAID 10 combines RAID 0 and RAID 1 by striping data across multiple drives without parity and mirroring the entire array to a second set of drives. This process delivers fast data access (like RAID 0) and single drive fault tolerance (like RAID 1), but cuts the usable drive space in half. RAID 10, which requires a minimum of four equally sized drives in a non-virtual disk environment and three drives of any size in a virtual disk storage system, is the most expensive RAID solution and offers limited scalability in a non-virtual disk environment.
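The space overheads mentioned above can be compared with a short sketch. It assumes equally sized drives, a two-way mirror for RAID 1 and RAID 10, one drive's worth of parity for RAID 5, and two drives' worth for RAID 6 (the “n+2” arrangement). The function name `usable_fraction` and the level labels are hypothetical.

```python
from fractions import Fraction

def usable_fraction(level, drives):
    """Fraction of raw capacity left for user data (illustrative assumptions only)."""
    if level in ("RAID1", "RAID10"):
        # Two-way mirroring: every block is stored twice.
        return Fraction(1, 2)
    if level == "RAID5":
        # One drive's worth of capacity holds parity.
        return Fraction(drives - 1, drives)
    if level == "RAID6":
        # Two drives' worth of capacity hold the two parity sets.
        return Fraction(drives - 2, drives)
    raise ValueError(f"unsupported level: {level}")

for level, drives in [("RAID10", 4), ("RAID5", 5), ("RAID6", 6)]:
    print(level, drives, usable_fraction(level, drives))
```

Under these assumptions a four-drive RAID 10 keeps half of its raw capacity, while a five-drive RAID 5 keeps four fifths, which matches the cost comparison between the mirror and parity methods.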
A computing system typically does not require knowledge of the number of storage devices that are being utilized to store the data because another device, the storage subsystem controller, is utilized to control the transfer of data between the computing system and the storage devices. The storage subsystem controller and the storage devices are typically called a storage subsystem and the computing system is usually called the host because the computing system initiates requests for data from the storage devices. The storage controller directs data traffic from the host system to one or more non-volatile storage devices. The storage controller may or may not have an intermediate cache to stage data between the non-volatile storage device and the host system.
In a computer system employing a drive array, it is desirable that the drive array remain on-line should a physical drive of the drive array fail. If a main physical drive should fail, drive arrays currently have the capability of allowing a spare physical replacement drive to be rebuilt without having to take the entire drive array off-line. Furthermore, intelligent drive array subsystems currently exist which can rebuild the replacement drive transparently to the computer system and while the drive array is still otherwise operational.
When a disk in a RAID redundancy group fails, the array attempts to rebuild data on the surviving disks of the redundancy group (assuming space is available) in such a way that after the rebuild is finished, the redundancy group can once again withstand a disk failure without data loss. Depending upon system design, the rebuild may be automated or may require user input.
After detecting a disk or component failure and during a rebuild of data, regardless of rebuild design, the system remains subject to yet further disk or component failures before the rebuild is complete. In any RAID system, this is significant because the vulnerability to data loss is dependent upon the RAID architecture.
When multiple disk drives require a rebuild of their data from redundant drives, it is possible that another drive containing the redundant data can be lost during the rebuild, causing the loss of user data. The risk of losing data when a subsequent drive is lost is related to the manner in which the redundant data is arranged on the drives. Each drive and each RAID on the drive being rebuilt will have a different risk associated with the loss of another drive.
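To make the idea of differing risk concrete, the following is a rough sketch of one possible way to compare degraded RAIDs; it is not a method described in this section. It assumes each RAID is known by the set of physical drives it spans and by the number of drive losses it tolerates when healthy. The names `RaidGroup`, `remaining_tolerance`, and `rebuild_order`, and the tie-breaking rule, are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RaidGroup:
    name: str
    drives: frozenset  # physical drives holding this RAID's segments
    tolerance: int     # drive losses survivable when healthy (assumed known)

def remaining_tolerance(group, failed):
    """Further drive losses the group can absorb without losing data."""
    return group.tolerance - len(group.drives & failed)

def rebuild_order(groups, failed):
    """Order degraded groups from most to least exposed (illustrative heuristic)."""
    degraded = [g for g in groups if g.drives & failed]
    # Fewest remaining tolerated failures first; among ties, the group with
    # more surviving drives goes first, since more drives can trigger a loss.
    return sorted(
        degraded,
        key=lambda g: (remaining_tolerance(g, failed), -len(g.drives - failed)),
    )

groups = [
    RaidGroup("r6", frozenset({0, 1, 2, 3, 4, 5}), 2),
    RaidGroup("r5", frozenset({0, 1, 2, 3, 4}), 1),
    RaidGroup("r1", frozenset({0, 6}), 1),
    RaidGroup("healthy", frozenset({6, 7}), 1),
]
failed = frozenset({0})

print([g.name for g in rebuild_order(groups, failed)])
```

In this made-up layout, the loss of drive 0 leaves the single-parity and mirrored RAIDs with no remaining tolerance, while the double-parity RAID can still absorb one more failure, so the three RAIDs sharing the failed drive carry different risk.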
It can be seen then that there is a need for a method, apparatus and program storage device for providing intelligent rebuild order selection in a storage array.