1. Technical Field
The present invention generally relates to disk array data storage systems, and in particular to a method and system for reconstructing data in response to a disk failure within a disk array data storage system. More particularly, the present invention relates to an efficient mechanism for allocating processing resources during data reconstruction operations.
2. Description of the Related Art
Disk array data storage systems are characterized as having multiple storage disk drives that are arranged and coordinated to form a single mass storage system. Three fundamental design criteria for such mass storage systems include: performance, cost, and data access availability. It is most desirable to produce data storage systems that have a low cost per storage unit, a high input/output performance, and a high data access availability. As utilized herein, "data access availability" is a measure of the ease with which data stored within a mass storage system is accessed and the level of assurance of continued operation in the event of some system failure (e.g., a disk drive failure). Typically, data access availability is provided through the use of redundancy wherein data, or relationships among data, are stored in multiple locations.
Two of the most common means of implementing redundancy in a disk array storage system are "mirroring" and "parity striping." According to the mirror method, data is duplicated and stored on separate areas of the mass storage system. In a disk array, for example, an identical data set is provided on two physically distinct disk drives. The mirror method has the advantages of high performance and high data access availability due to its duplicative storage technique. However, the mirror method is also relatively expensive as it effectively doubles the cost of storing data.
In the second, or "parity striping" method, a portion of the storage area is utilized to store redundant data, but the size of the redundant storage area is less than the 1:1 ratio required for disk mirroring. For example, in a disk array having ten disks, parity striping may permit nine of the disks to be utilized for storing data with the remaining one being dedicated to storing redundant data. The parity striping method is advantageous because it is less expensive than the mirroring method, but it has lower performance and availability characteristics relative to mirroring.
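By way of illustration only, the ten-disk example above may be sketched as follows. The sketch assumes bytewise XOR parity, which is a common choice but is not specified in this description; the function names `xor_blocks` and `rebuild_block`, the block size, and the block contents are hypothetical.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_block(surviving_data_blocks, parity_block):
    """Recover the single missing data block from the survivors and the parity."""
    return xor_blocks(surviving_data_blocks + [parity_block])


# Nine data disks (one 4-byte block each) plus one parity disk.
# Assumption: XOR parity; the contents are arbitrary test data.
data_blocks = [bytes([i, i + 1, i + 2, i + 3]) for i in range(1, 10)]
parity_block = xor_blocks(data_blocks)

# Simulate the loss of one data disk and rebuild its block.
failed_index = 3
survivors = [b for i, b in enumerate(data_blocks) if i != failed_index]
rebuilt = rebuild_block(survivors, parity_block)
```

Under this assumption, rebuilding a single block requires reading every surviving disk in the stripe, which suggests why reconstruction competes with host requests for processing resources.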
For disk arrays employing mirroring, parity striping, or both, there exists the need to reconstruct data in the case of a disk failure. One such data reconstruction technique is set forth by Morita in U.S. Pat. No. 5,848,229, in which data reconstruction in response to a failed disk unit is described with respect to a parity striped system. The data reconstruction technique described by Morita, like other conventional data reconstruction techniques, fails to address the issue of allocation of processing resources that are shared between a host processing system and the devices and programs utilized to implement data reconstruction (referred to hereinafter as reconstruction agents).
Given the nature of the mirror and parity striped redundancy techniques traditionally utilized to provide data access availability, the speed of a data reconstruction operation is important. The faster a data reconstruction operation is performed, the less likely it is that an interim failure on yet another disk will expose the disk array to a complete failure. Consequently, there exists a need to maximize the processing resources provided to reconstruction agents while maintaining the disk array on-line. The present invention addresses such a need.
A method, apparatus, and program product applicable within a multi-drive data storage system for adaptively allocating data reconstruction resources are disclosed herein. In accordance with the method of the present invention, responsive to a detected drive failure, a resource allocation manager periodically determines the number of pending host system processing requests. The determined number of pending host system processing requests is then compared to a predetermined threshold value. Finally, a number of processing resources are allocated to data reconstruction in accordance with the results of the comparison of the number of pending host system processing requests to the predetermined threshold.
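The determine, compare, and allocate steps summarized above may be sketched as follows, by way of example only. The function names, the two-level allocation policy, and all numeric values are hypothetical; the description states only that the allocation follows from comparing the number of pending host requests against a predetermined threshold.

```python
def reconstruction_resources(pending_host_requests, threshold,
                             light_load_count, heavy_load_count):
    """Return the number of processing resources to give to reconstruction.

    Assumed policy: when pending host requests exceed the threshold, the host
    is busy and reconstruction receives the smaller count; otherwise it
    receives the larger count.
    """
    if pending_host_requests > threshold:
        return heavy_load_count
    return light_load_count


def allocation_trace(pending_samples, threshold=8,
                     light_load_count=6, heavy_load_count=2):
    """Apply the comparison to a series of periodic samples of pending requests."""
    return [
        reconstruction_resources(p, threshold, light_load_count, heavy_load_count)
        for p in pending_samples
    ]


# Periodic samples taken after a drive failure is detected (hypothetical values).
trace = allocation_trace([0, 8, 9, 20])
```

In this sketch a single threshold selects between two fixed counts. A graduated policy with several thresholds would follow the same comparison pattern.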
All objects, features, and advantages of the present invention will become apparent in the following detailed written description.