1. Field of the Invention
This invention relates to digital storage systems in which storage is provided by an array of storage devices.
2. Description of the Prior Art
The use of arrays of disk drives or other Direct Access Storage Devices (DASD) is known, and has provided larger storage capacities and higher reliability at lower cost than have been achieved with single disk drives.
U.S. Pat. No. 4,870,643 teaches an array of standard five and one-quarter inch disk drives mounted in a rack and panel frame using the Small Computer System Interface (SCSI). Data words are each divided into n segments, and each segment is transferred to one of n different drives in parallel, which speeds up the word transfer rate.
At least one disk drive stores parity check information, which is used to regenerate the data on any one disk drive that may have failed. When a drive fails, an operator unplugs the failing drive from the frame and substitutes a working drive. The regenerated data is then written on the replacement drive. The drives are operated in synchronism with a signal from the master controller, rather than being synchronized to one of the disks.
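The segmenting and regeneration described above can be sketched as follows. This is only an illustrative sketch, not the method of the cited patent: it assumes the parity check information is a bytewise Exclusive OR of the n segments, and the names `split_word`, `xor_blocks`, and `regenerate` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise Exclusive OR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_word(word, n):
    """Divide a data word into n equal segments, one per data drive."""
    if len(word) % n != 0:
        raise ValueError("word length must be a multiple of n")
    size = len(word) // n
    return [word[i * size:(i + 1) * size] for i in range(n)]


def regenerate(segments, parity, failed):
    """Rebuild the segment of the failed drive from the surviving segments and parity."""
    survivors = [s for i, s in enumerate(segments) if i != failed]
    return xor_blocks(survivors + [parity])


# Assumed example: an 8-byte word spread over n = 4 data drives plus one parity drive.
word = b"ABCDEFGH"
segments = split_word(word, 4)
parity = xor_blocks(segments)

# Drive 2 fails; its segment is recovered without reading the failed drive.
rebuilt = regenerate(segments, parity, failed=2)
```

Because the Exclusive OR of all n segments and the parity is zero, the Exclusive OR of the survivors and the parity yields the missing segment, which is what would be written to the replacement drive.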
U.S. Pat. No. 4,989,206 teaches an array of the type described in U.S. Pat. No. 4,870,643 which includes more drives in the array than are needed to store the data and parity. When a drive fails, the system replaces the failing drive with a working drive by reconnecting the drives through a cross point switch.
The system includes control modules having a processor and cache memory. Each control module divides the data word that it receives from the computer into the n segments to be written on n drives, and generates the parity segment to be written on the parity drive.
IBM Technical Disclosure Bulletin, Volume 32, Number 7, Dec. 1989, page 5, teaches an improvement in DASD array systems used with the IBM System 38 and IBM System 370.
In these systems, the checksum for corresponding DASD blocks in n drives was calculated in the CPU. In this teaching, the checksum is calculated in the I/O subsystem channel and cache in order to reduce CPU time needed to prepare the checksum record. Also, in this disclosure, data is not spread among the n drives, but is written to one of the drives. The checksum is still calculated across all of the n drives to provide error recovery for all of the drives. The checksum is not calculated from data in all of the drives each time one of the drives is written with a block of data.
Instead, the checksum is updated by Exclusive ORing it with the old block of data in the drive to be written, to remove that block's effect, and then Exclusive ORing the result with the new data to obtain the new checksum. When these Exclusive OR operations are done by the CPU, it often works through its cache memory, thereby filling the cache with long operands that are used only once. Calculations by the I/O subsystem are done directly to memory and, therefore, cache is preserved and CPU time is reduced.
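The incremental update just described can be illustrated with a minimal sketch. The function names and block contents are hypothetical, and blocks are assumed to be equal-length byte strings.

```python
def xor_bytes(a, b):
    """Bytewise Exclusive OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_checksum(old_checksum, old_block, new_block):
    """Update the checksum without reading the other drives."""
    # Exclusive ORing with the old block removes its effect from the checksum.
    without_old = xor_bytes(old_checksum, old_block)
    # Exclusive ORing with the new block folds the new data in.
    return xor_bytes(without_old, new_block)


def full_checksum(blocks):
    """Checksum recomputed from corresponding blocks on every drive."""
    result = bytes(len(blocks[0]))
    for block in blocks:
        result = xor_bytes(result, block)
    return result


# Assumed example: corresponding blocks on three data drives.
blocks = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
checksum = full_checksum(blocks)

# Drive 1 is rewritten; only its old block and the old checksum are needed.
new_block = b"\xaa\x55"
updated = update_checksum(checksum, blocks[1], new_block)
recomputed = full_checksum([blocks[0], new_block, blocks[2]])
```

The updated checksum equals the one recomputed across all drives, while touching only the drive being written and the checksum record.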
IBM Technical Disclosure Bulletin, Volume 32, Number 6B, November 1989, page 48, teaches distributing the checksum information across each of the disk drives instead of storing it on one of the drives. This has the effect of improving reliability. Because the checksum must be rewritten whenever any block of data on any drive is written, a separate checksum drive, if provided, operates whenever any of the other drives is written. It may therefore become backlogged with new parity blocks to be written, and it will also wear out sooner than the others.
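The difference in checksum write load between the two arrangements can be sketched as follows. The cited bulletin's actual layout is not given here, so the simple rotation used below (the checksum for stripe s is placed on drive s mod n) is an assumption made purely for illustration, as are all function names.

```python
from collections import Counter


def checksum_drive_dedicated(stripe, n_drives):
    """Dedicated arrangement: the last drive always holds the checksum."""
    return n_drives - 1


def checksum_drive_rotating(stripe, n_drives):
    """Assumed distributed arrangement: the checksum location rotates per stripe."""
    return stripe % n_drives


def checksum_writes(placement, n_stripes, n_drives):
    """Count checksum writes per drive when each stripe is updated once."""
    return Counter(placement(s, n_drives) for s in range(n_stripes))


# Assumed example: 100 stripes on a 5-drive array, one update per stripe.
dedicated = checksum_writes(checksum_drive_dedicated, 100, 5)
rotating = checksum_writes(checksum_drive_rotating, 100, 5)
```

Under these assumptions the dedicated drive absorbs all 100 checksum writes, whereas with rotation each of the five drives absorbs 20.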
The above-described prior art has improved the reliability of low cost disk arrays, but has also created some problems. As mentioned in the December 1989 IBM Technical Disclosure Bulletin, the I/O subsystem is recognized as a potential source of an I/O bottleneck.
This problem is further recognized by the Redundant Array of Inexpensive Disks (RAID) Advisory Board, Inc. of Lino Lakes, Minn. They have published the "RAIDBook", which is incorporated herein by reference, as a source of information for RAID technology. In it, they describe RAID Level 5 as a partial solution to the write parity bottleneck that may occur in RAID Level 4. RAID Level 4 is analogous to the teachings of the above December 1989 IBM publication, and RAID Level 5 is analogous to the November 1989 IBM publication.
The array is controlled by Array Management Software operating in the host computer. The known alternative of providing an I/O controller is more expensive and, as described in the prior art above, may itself become a bottleneck. The bottleneck is accentuated by the fact that the host must either send write commands to all of the disks, as in RAID Levels 2 and 3, or must read the old data and the old parity in order to generate new parity for new data and then write the new data and new parity, as in RAID Levels 4 and 5.
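The two command patterns that load the host can be sketched as follows. This is a simplified illustration that counts only the commands issued for a single write; the function names and disk numbering are hypothetical.

```python
def commands_all_disks(n_disks):
    """RAID Level 2 and 3 style: one write command to every disk in the array."""
    return [("write", disk) for disk in range(n_disks)]


def commands_read_modify_write(data_disk, parity_disk):
    """RAID Level 4 and 5 style: read old data and old parity, then write both."""
    return [
        ("read", data_disk),     # old data, needed to remove its effect
        ("read", parity_disk),   # old parity, to be updated
        ("write", data_disk),    # new data
        ("write", parity_disk),  # new parity
    ]


# Assumed example: a 5-disk array, writing one block held on disk 1
# with its parity on disk 4.
striped = commands_all_disks(5)
small_write = commands_read_modify_write(data_disk=1, parity_disk=4)
```

In either case the host issues several commands for one logical write: five in the first pattern for this array size, and four in the second regardless of array size.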