1. Field of the Invention
The present invention is directed to a method for reducing latency in read operations of a disk drive array while still ensuring that valid data is provided therefrom. More particularly, the present invention is directed to a method of transferring data from less than all of the disk drives of the array to a stage buffer memory, checking the integrity thereof, and reconstructing the data not transferred if the transferred data is valid. Still further, the present invention takes advantage of a dual parity generation engine's fault tolerance for a loss of valid data from at least two of the plurality of disk drives. Data from N−1 of N disk storage channels is transferred and integrity checked, which avoids the latency that would result if the memory array had to wait for a lagging Nth disk drive to finish its individual read operation. The dual parity generation engine is able to identify invalid data present in the N−1 disk storage channels and, if the data is valid, to reconstruct the data that was not transferred from the Nth disk drive. The valid data reconstructed by the dual parity generation engine is transferred thereby to a stage buffer memory for subsequent transfer to a processor requesting the data to complete the read operation.
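By way of illustration only, the N−1 read described above may be sketched as follows. The sketch assumes a P+Q code over GF(2^8) with generator 2 and polynomial 0x11D; the text does not specify the code used by the dual parity generation engine, and the names `read_n_minus_1`, `q_parity` and `xor_blocks` are hypothetical. One channel (marked `None`) has not arrived; it is rebuilt if needed, and the remaining parity is used to check the result.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo 0x11D (assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def xor_blocks(blocks):
    """P parity: byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for j, byte in enumerate(block):
            out[j] ^= byte
    return bytes(out)

def q_parity(data):
    """Q parity: XOR of g^i * D_i, with g = 2 (assumed generator)."""
    out = bytearray(len(data[0]))
    coeff = 1
    for block in data:
        for j, byte in enumerate(block):
            out[j] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)
    return bytes(out)

def read_n_minus_1(channels):
    """channels = data blocks + [P, Q], with exactly one entry set to None.

    Returns (data_blocks, valid). The lagging channel is never waited for.
    """
    *data, p, q = channels
    missing = channels.index(None)
    k = len(data)
    if missing < k:
        # A data channel lagged: rebuild it from P, then check against Q.
        others = [block for block in data if block is not None]
        data[missing] = xor_blocks(others + [p])
        valid = q_parity(data) == q
    elif missing == k:
        # P lagged: all data arrived, check it against Q.
        valid = q_parity(data) == q
    else:
        # Q lagged: all data arrived, check it against P.
        valid = xor_blocks(data) == p
    return data, valid

# Hypothetical stripe: three data channels plus P and Q (N = 5).
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data), q_parity(data)]

arrived = list(stripe)
arrived[1] = None                      # channel 1 is the lagging drive
rebuilt, valid = read_n_minus_1(arrived)

corrupt = list(arrived)
corrupt[0] = b"AAAZ"                   # silent corruption on channel 0
_, valid_corrupt = read_n_minus_1(corrupt)
```

In this sketch a corrupted byte on one of the N−1 transferred channels causes the check against the remaining parity to fail, so invalid data is flagged rather than passed on.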
2. Prior Art
Computer systems often employ disk drive devices for storage and retrieval of large amounts of data. In order to increase capacity of the disk memory systems and provide some measure of reliability, the disk drive devices are formed in an array where the data, including parity data, is byte striped across multiple disk drives. To improve the reliability of the disk drive array, the storage system is arranged as a redundant array of disk drives. Redundant arrays of inexpensive disks (RAID), also referred to as redundant arrays of independent disks, have grown in usage. Of the five originally proposed levels of RAID systems, RAID-5 systems have gained great popularity for use in local area networks and independent personal computer systems, such as for media database systems. In RAID-5, data is interleaved by stripe units across the various disk drives of the array along with error correcting parity information. Unlike RAID-3, wherein data and parity information are stored in dedicated physical disk drives, RAID-5 distributes the data and parity information across all of the disk drives in an interleaved fashion, the data and parity information being stored in logical disk drives. The parity data in a RAID-5 system provides the ability to correct only for a failure of valid data from a single disk drive of the array.
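The single-failure limit of such parity can be illustrated with a minimal sketch, assuming plain byte-wise XOR parity over equal-length blocks. The names `xor_blocks` and `rebuild_missing` and the block contents are hypothetical, and the rotation of parity across drives used by RAID-5 is omitted.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])

# Hypothetical stripe of three data blocks and its parity block.
stripe = [b"disk0___", b"disk1___", b"disk2___"]
parity = xor_blocks(stripe)

# The block from drive 1 is lost; the other two blocks and parity remain.
rebuilt = rebuild_missing([stripe[0], stripe[2]], parity)
```

Because there is only one parity equation per byte position, a second missing block leaves two unknowns and cannot be recovered in this scheme.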
RAID-6 systems have since been developed for data storage systems requiring a greater fault tolerance. In RAID-6, data is interleaved in striped units distributed with parity information across all of the disk drives, as in the RAID-5 system. However, to overcome the disadvantage of RAID-5's inability to correct for faulty data being retrieved from more than one disk drive, the RAID-6 system utilizes a redundancy scheme that can recover from the receipt of invalid data from any two of the disk drives. Although this scheme also uses logical disk drives, an additional disk drive device is added to the array to account for the additional storage required for the second level of parity data. The RAID-6 parity scheme typically utilizes either a two-dimensional XOR algorithm or a Reed-Solomon code in a P+Q redundancy scheme. Thus, utilizing a RAID-6 architecture, multiple disk data errors in a single redundancy group can be detected, and single disk data errors in the redundancy group can be corrected.
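A P+Q scheme of the Reed-Solomon type can be sketched as follows, assuming arithmetic over GF(2^8) with polynomial 0x11D and generator 2, a common choice that the text above does not itself specify. The names `pq_parity` and `recover_two` are hypothetical, and only the case of two lost data blocks is shown; losses involving P or Q are simpler and omitted.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo 0x11D (assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Inverse of a nonzero byte: a^254, since a^255 = 1 in GF(2^8)."""
    return gf_pow(a, 254)

def pq_parity(data):
    """Return (P, Q) with P = XOR of D_i and Q = XOR of 2^i * D_i."""
    p = bytearray(len(data[0]))
    q = bytearray(len(data[0]))
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)

def recover_two(data, x, y, p, q):
    """Rebuild data blocks x and y (both None in data) from survivors, P and Q."""
    zero = bytes(len(p))
    partial = [zero if block is None else block for block in data]
    p_xy, q_xy = pq_parity(partial)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx = bytearray(len(p))
    dy = bytearray(len(p))
    for j in range(len(p)):
        a = p[j] ^ p_xy[j]      # equals D_x + D_y
        b = q[j] ^ q_xy[j]      # equals 2^x * D_x + 2^y * D_y
        dx[j] = gf_mul(denom_inv, b ^ gf_mul(gy, a))
        dy[j] = a ^ dx[j]
    return bytes(dx), bytes(dy)

# Hypothetical stripe of four data blocks; blocks 1 and 3 are then lost.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
p, q = pq_parity(data)
damaged = [data[0], None, data[2], None]
dx, dy = recover_two(damaged, 1, 3, p, q)
```

The two parity blocks give two independent equations per byte position, which is why two unknown blocks can be solved for where single XOR parity allows only one.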
In order to provide large data capacity, a large number of disk drives are often arrayed, and the additional disk drives required for two or more levels of parity data further increase the total number of disk drives in the array. As these systems send the same command to all of the disk drives, and then wait for all of the disks to finish a command before a new command is sent thereto, the data transfer rate of the memory array is limited by the “slowest” disk drive of the array. That characteristic can be particularly limiting since disk drives often exhibit unduly long access times as they begin a failure process in which their performance degrades, sometimes long before they are identified as having failed by the memory system or the drive itself.
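The effect of a single lagging drive can be shown with a short sketch. The per-channel completion times below are hypothetical figures chosen for illustration, not measurements, and the function names are likewise assumptions. Waiting for every drive makes the read as slow as the slowest channel, whereas a read that could complete with N−1 of N channels would finish with the second-slowest.

```python
def latency_wait_for_all(times_ms):
    """Read completes only when every drive has finished."""
    return max(times_ms)

def latency_skip_slowest(times_ms):
    """Read completes once all but the slowest drive have finished."""
    return sorted(times_ms)[-2]

# Hypothetical completion times; the last drive is degrading.
channel_times_ms = [8.1, 7.9, 8.4, 8.0, 250.0]

wait_all = latency_wait_for_all(channel_times_ms)
skip_one = latency_skip_slowest(channel_times_ms)
```

With these assumed figures the read takes 250.0 ms when all drives are awaited and 8.4 ms when the slowest is not.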
Current RAID-3 systems have tried to overcome this latency problem by starting data transfers early, before all of the disk drives have completed a read command, so long as the data needed is already in the cache memory or can be reconstructed utilizing parity data. However, RAID-3 systems employing such techniques are unable to verify the integrity of the data being transferred to the initiator when that latency reduction technique is utilized. This method of improving latency comes at the cost of data integrity, which is not an acceptable trade-off. Thus, there is a need to provide a method for reducing latency while still preserving the integrity of the data provided by the memory system.