Patterson et al., in "A Case for Redundant Arrays of Inexpensive Disks (RAID)", ACM SIGMOD Conference, Chicago, Ill., Jun. 1-3, 1988, pages 109-116, indicate that the capacity of single, large, expensive magnetic disk drives has grown rapidly, but that their performance improvements have been modest. They suggest that redundant arrays of inexpensive disks (RAID) offer an attractive alternative to large expensive disks, and promise performance improvements of an order of magnitude. The drawback of disk drive arrays is that the mean time to failure of any single disk drive in an array is such as to render the array subject to failure at intervals unacceptable to today's user. In fact, they conclude that without fault tolerance, large arrays of inexpensive disks are so unreliable as to be useless.
As a solution to the reliability problem, Patterson et al. postulate five levels of RAID system redundancy that either partially or largely overcome it. Each of the five proposed systems enables a failed disk drive to be rapidly replaced and the data that it contained to be reproduced on the replacement disk drive. In addition, the data on the failed drive is made available to the host processor even before it is written to the replacement disk drive.
In the ensuing description of various levels of RAID systems, the concept of "striping" will be referred to. In essence, striping refers to the interleaving of data across a plurality of disks. The interleaving may be by bit, byte, word, or block, with succeeding data elements placed upon succeeding disk drives in a "stripe" arrangement. For example, assuming a four disk drive array, word one would be placed upon disk drive one, word two on disk drive two, words three and four on disk drives three and four, word five on disk drive one, etc. In the case of RAID systems that employ parity redundancy, an extra disk drive (or drives) contains parity information for each respective stripe. The loss of a disk drive does not prevent recovery of its data, since that data can be reconstructed from the parity data and the data on the surviving drives.
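The word placement and parity recovery just described can be illustrated with a minimal Python sketch. It assumes simple round-robin placement and bytewise exclusive-OR (XOR) parity, which is the usual form of parity redundancy but is not spelled out above. The function names disk_for_element, stripe_parity and recover_lost_unit are hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Sequence


def disk_for_element(element_number: int, num_disks: int) -> int:
    """Return the 1-based disk drive holding a 1-based data element.

    Succeeding elements go to succeeding drives and wrap around, so in a
    four-drive array element 5 lands on drive 1 again.
    """
    return (element_number - 1) % num_disks + 1


def stripe_parity(units: Sequence[int]) -> int:
    """XOR all data units of one stripe together (assumed parity scheme)."""
    return reduce(lambda a, b: a ^ b, units, 0)


def recover_lost_unit(surviving_units: Sequence[int], parity: int) -> int:
    """Rebuild the unit of a failed drive from the survivors and the parity."""
    return stripe_parity(surviving_units) ^ parity


# One stripe of four one-byte data units, one unit per data drive.
stripe = [0x12, 0x34, 0x56, 0x78]
parity = stripe_parity(stripe)

# Suppose the third drive fails; its unit is rebuilt from the other three.
survivors = stripe[:2] + stripe[3:]
rebuilt = recover_lost_unit(survivors, parity)
```

Because XOR is its own inverse, combining the parity with every surviving unit leaves exactly the missing unit, which is why a single drive loss per stripe is recoverable under this assumption.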
Patterson et al. postulated the following levels of RAID:
first level RAID: mirrored disks. This system structure is the traditional approach wherein all disk drives are duplicated and every write to a disk is also written to a check disk.
second level RAID: this arrangement contemplates bit-interleaving the data across the disks of a group of disk drives and then adding enough check disks to detect and correct a single bit error on any disk.
third level RAID: data is striped across a plurality of disks and a redundant parity disk drive is provided. Information on a failed disk can be reconstructed by calculating the parity of the remaining good disks and then comparing, bit by bit, to the parity calculated for the original full group of disk drives. Third level RAID brings the reliability overhead cost to its lowest level and provides excellent performance characteristics. Another aspect of RAID levels two and three is that all disks with data are involved in each data transfer.
fourth level RAID: this system arrangement suggests keeping larger individual data units (e.g., blocks of data) on a single disk, writing sequential blocks across a plurality of disks, calculating parity for the blocks and placing the parity block on an additional parity disk drive. RAID level four enables a certain level of parallelism to be achieved in the reading actions on the various disks.
fifth level RAID: no single disk drive is assigned as the repository for check or parity characters in this system configuration. Parity is distributed amongst all of the disks in the array. Thus, assuming a five disk array, the first stripe starts on disk one and the parity for stripe one is placed on disk five. The parity for stripe two is placed on disk one, and its first data segment commences on disk two, etc. This level also enables parallel reading and writing to the disks.
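The fifth level parity placement in the five disk example can be sketched as follows. The text gives only the first two stripes; the sketch assumes that the same rotation simply continues for later stripes, which is one common layout and not necessarily the only one. The function name raid5_stripe_layout is hypothetical.

```python
from typing import List, Tuple


def raid5_stripe_layout(stripe_number: int, num_disks: int) -> Tuple[int, List[int]]:
    """Return (parity_disk, data_disks) for a 1-based stripe number.

    Assumed rotation: stripe 1 keeps parity on the last disk, stripe 2 on
    disk 1, stripe 3 on disk 2, and so on. The data segments start on the
    disk after the parity disk and wrap around the array.
    """
    parity_disk = (stripe_number - 2) % num_disks + 1
    data_disks = [
        (parity_disk + offset - 1) % num_disks + 1
        for offset in range(1, num_disks)
    ]
    return parity_disk, data_disks


# Print the layout of the first five stripes of a five-disk array.
for stripe in range(1, 6):
    parity_disk, data_disks = raid5_stripe_layout(stripe, 5)
    print(f"stripe {stripe}: parity on disk {parity_disk}, data on disks {data_disks}")
```

Under this assumed rotation every disk holds parity for one stripe in five, so no single drive becomes the parity bottleneck that the dedicated parity disk of levels three and four represents.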
Subsequent to the Patterson et al. paper, a RAID zero level has been postulated wherein data is striped across a plurality of disk drives, with no provision for insertion of parity. Of the described RAID levels, levels zero, one, three and five have become the most widely used. However, prior art array controller structures, because of the complex data arrangements required by RAID levels three and five, have exhibited substantial inflexibility with regard to any change in the arrangement of connected disk drives. Furthermore, array controllers for such RAID arrangements have generally provided one device port per disk drive, increasing the cost of such controllers.
Prior art array controllers operate in either a buffered or a non-buffered mode. The ADP-92 disk array controller marketed by the NCR Corporation, Dayton, Ohio, does not buffer data between a host processor and the disk drive array. The controller distributes incoming data to each of the drives as the data is received from the host. The ADP-92 includes a device port controller module for each disk drive, which adds expense to the controller's structure.
Other array controllers employ a single-ported buffer that temporarily stores data being transferred between disks and a host computer. The advantage of the buffer is that the host port is able to transfer data at its maximum rate in block-striped modes, whereas a controller without a buffer cannot. Likewise, the buffer enables the controller to take advantage of overlaps and latencies of devices in the host channel. Such controllers are, however, by their nature more expensive to implement and, further, the single-port structure of the buffer constrains data transfers.
Accordingly, it is an object of this invention to provide a controller for an array of disk drives which has programmable flexibility to handle variable numbers of disk drives.
It is another object of this invention to provide a programmable array controller that is capable of implementing various RAID level arrangements, while maintaining a minimal number of device ports.
It is yet another object of this invention to provide a buffered disk array controller that avoids the performance penalties that arise from the use of a single-port buffer.