1. Field of the Invention
The present invention relates generally to data storage systems employing mirrored data storage for redundancy and improved access speed.
2. Description of the Related Art
The ever-increasing speed of central processing units has created an increasing demand for high-speed, high-capacity data storage. Fortunately, improvements in data storage and cache memory technology have kept pace with the improvements in central processor technology. Users, however, are also demanding a higher degree of reliability and availability of data storage access.
Redundant data storage is a common technique for providing a desired degree of reliability and availability of data storage access. Currently, for most applications, the preferred data storage technology is magnetic disk technology that has been developed for the personal computer market. A sufficient number of commodity magnetic disk drives are organized in an array and interfaced to a storage controller and semiconductor cache memory to provide sufficient capacity and data storage access speed for general-purpose computing and database applications. Redundant arrays of such inexpensive disks, known as "RAID," can provide a high degree of reliability and availability of data storage access.
The least complex method of providing redundancy in an array of storage devices is to double the number of storage devices so that the storage devices are organized in pairs, and data on each storage device in each pair is a copy or mirror image of data on the other storage device in that pair. If a storage device fails, then the data processing system can access the redundant copy until the failed storage device is replaced.
The mirroring approach is compared and contrasted with more complex redundancy schemes in Patterson et al., "Introduction to Redundant Arrays of Inexpensive Disks (RAID)," COMPCON 89 Proceedings, Feb. 27-Mar. 3, 1989, IEEE Computer Society pp. 112-117. Mirrored RAIDs have the highest cost for a given storage capacity, but performance versus a nonredundant disk array depends on the mix of reads and writes. The user must double the number of disks for the same amount of data or, conversely, use only half the real storage capacity of the disks. If the arms and spindles of a pair were synchronized, then the performance of mirroring versus nonredundant disks would be the same. This is not commonly how the mirroring is implemented, and a write results in independent writes to two disks. The writes can be overlapped, but in general one will have longer seek and/or rotational delay. On the other hand, the independence of the disks can improve performance of reads. The system might look at the pair of disks that have the data; if only one is busy, it chooses the other. If both are idle, it picks the disk that has shortest seek.
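The read policy described above for a mirrored pair can be illustrated with a minimal Python sketch. The `Disk` class, its `busy` and `head_track` fields, and the function name `choose_read_disk` are hypothetical names introduced only for illustration. Seek cost is approximated here as the distance in tracks between the current arm position and the target track, which is an assumption; the text also does not say what happens when both disks are busy, so the sketch simply falls back to the shortest-seek rule in that case.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    busy: bool
    head_track: int  # current arm position (hypothetical field)


def choose_read_disk(a: Disk, b: Disk, target_track: int) -> Disk:
    """Pick which disk of a mirrored pair should service a read."""
    # If exactly one disk is busy, use the idle one.
    if a.busy != b.busy:
        return b if a.busy else a
    # Both idle: pick the shortest seek (ties go to the first disk).
    # Both busy is not covered by the text; the same rule is assumed here.
    seek_a = abs(a.head_track - target_track)
    seek_b = abs(b.head_track - target_track)
    return a if seek_a <= seek_b else b
```

A write, by contrast, must go to both disks of the pair, so no such choice is available for writes.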
More complex RAID techniques reduce the number of disks by computing and storing parity of data across a number of disks. Failure of a disk can be detected by a disk controller, and data of a failed disk can be computed using the parity. By calculating and storing parity of a group of disks on a bit per disk basis, for example, any single disk failure can be corrected simply by reading the rest of the disks in the group to determine what bit value on the failed disk would give the proper parity. Such an N+1 RAID scheme can lose data only if there is a second failure in the group before the failed drive is replaced. This scheme has much lower cost and overhead, with the customer deciding how much overhead he wants to pay by increasing the number of disks in the parity group. Performance depends not only on the mix of reads and writes, but also on the size of the access. Since there is ECC information on each sector, read performance is essentially the same as non-redundant disk arrays. For "large" writes (writing at least a sector to every disk in the parity group), the only performance hit is 1/N more writes to write the parity information. Writes to data on a single disk, on the other hand, require four disk accesses, including a read and a write to the parity information. To avoid a bottleneck that would be caused by the additional access to the parity information, the parity is spread over several disks. (Patterson et al., p. 113.)
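The parity reconstruction described above can be illustrated with a short Python sketch using bytewise exclusive-OR, the usual way such parity is computed. The block contents and the helper names `xor_blocks` and `small_write_parity` are made up for illustration. The small-write update (new parity = old parity XOR old data XOR new data) is the standard technique and is an assumption here; the text above states only that four disk accesses are needed.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def small_write_parity(old_data, new_data, old_parity):
    """Parity after rewriting one data block (read old data and old parity,
    then write new data and new parity: four accesses in total)."""
    return xor_blocks([old_parity, old_data, new_data])


# N = 3 data disks plus one parity disk (illustrative contents).
data = [b"\x0f\xa0", b"\x33\x5c", b"\xf0\x01"]
parity = xor_blocks(data)

# Suppose disk 1 fails: XOR of the surviving data and the parity restores it.
rebuilt = xor_blocks([data[0], data[2], parity])

# Small write to disk 1 without reading the other data disks.
new_block = b"\xaa\x55"
new_parity = small_write_parity(data[1], new_block, parity)
```

Here `rebuilt` equals the lost block `data[1]`, and `new_parity` matches the parity that would be obtained by recomputing over all three data blocks after the update.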
The inventors have recognized that the conventional mirroring of data storage devices has a failure load problem. For continuous throughput, data is often striped over more than one pair of mirrored disk drives, and the disk drives are accessed substantially simultaneously for continuous throughput. If the disk drives were mirrored in pairs, a failed disk drive would become a bottleneck to simultaneous access.
This failure load problem is solved by an asymmetrical striping of the mirrored data over the mirrored arrays of data storage devices so that the mirrored data contained in a failed storage device in one of the arrays can be accessed by accessing respective shares of this mirrored data in a plurality of the data storage devices in the other array. In addition, the asymmetrical striping reduces the so-called "rebuild" time for copying this mirrored data to a replacement for the failed storage device. The mirrored data can be copied to the failed data storage device from more than one other data storage device without substantial interruption of any continuous throughput.
For disk storage devices, the mirrored data can be arranged in the first and second arrays so that the mirrored data is contained at the same disk track radius in both arrays in order to equalize seek time for write access or sequential read access to both arrays. Alternatively, the mirrored data can be arranged so that mirrored data at the minimum and maximum track radius in one array is contained at the mean track radius of the other array in order to minimize seek time for random read access.
In accordance with yet another aspect, the invention provides a data storage system including a first array of data storage devices, a second array of data storage devices, and a storage controller. The storage controller is coupled to the first array of storage devices and the second array of storage devices for accessing mirrored data contained in the first array of data storage devices and also contained in the second array of data storage devices. The storage controller is programmed to respond to a request to access a specified portion of the mirrored data by accessing the specified portion of the mirrored data in the first array of data storage devices when the specified portion of the mirrored data cannot be accessed in the second array of data storage devices, and by accessing the specified portion of the mirrored data in the second array of data storage devices when the specified portion of the mirrored data cannot be accessed in the first array of data storage devices. Each data storage device in the first array of data storage devices contains a respective share of the mirrored data. Each data storage device in the second array of data storage devices contains a respective share of the mirrored data. Each data storage device in the second array of data storage devices contains a respective share of the mirrored data contained in each data storage device in the first array of data storage devices. Moreover, each data storage device in the first array of data storage devices contains a respective share of the mirrored data contained in each data storage device in the second array of data storage devices.
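The fallback behavior of the storage controller can be sketched in Python as follows. This is only a sketch under simplifying assumptions: each array is modeled as a dictionary mapping a block identifier to its contents, with `None` standing for a portion that cannot be accessed. The function name `read_mirrored` and the `prefer_first` parameter are hypothetical and do not come from the text.

```python
def read_mirrored(block_id, first_array, second_array, prefer_first=True):
    """Return a block from whichever array can still supply it."""
    if prefer_first:
        order = (first_array, second_array)
    else:
        order = (second_array, first_array)
    for array in order:
        data = array.get(block_id)
        if data is not None:  # None models an inaccessible portion
            return data
    raise IOError(f"block {block_id!r} is unavailable in both arrays")


# Block 7 cannot be accessed in the first array, so the second array serves it.
first = {7: None, 8: b"bbbb"}
second = {7: b"aaaa", 8: b"bbbb"}
block = read_mirrored(7, first, second)
```

The same call with `prefer_first=False` covers the symmetric case, in which the second array is tried first and the first array acts as the fallback.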
In accordance with still another aspect of the invention, the mirrored data contained in the first array of data storage devices and also contained in the second array of data storage devices is subdivided into respective data blocks. Each data storage device in the first array of data storage devices contains the same number of the data blocks so that the data blocks are cells in a first matrix. Each data storage device in the second array of data storage devices contains the same number of the data blocks so that the data blocks are cells in a second matrix, and the second matrix is the transpose of the first matrix.
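The transposed arrangement can be illustrated with a small Python sketch. It assumes, for simplicity, a square layout of n devices per array with n blocks per device, and a particular block numbering; neither assumption is required by the text. Each matrix is stored as a list of rows, so that column d lists the blocks held by device d. The helper names are hypothetical.

```python
def first_layout(n):
    """matrix[row][device] = block number for the first array."""
    return [[row * n + dev for dev in range(n)] for row in range(n)]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def device_blocks(matrix, dev):
    """Blocks stored on one device (one column of the matrix)."""
    return [row[dev] for row in matrix]


def mirror_devices(second, blocks):
    """Map each block to the second-array device holding its copy."""
    where = {blk: dev for row in second for dev, blk in enumerate(row)}
    return {blk: where[blk] for blk in blocks}


first = first_layout(4)
second = transpose(first)

# Blocks on device 1 of the first array, and where their mirrors reside.
lost = device_blocks(first, 1)          # [1, 5, 9, 13]
spread = mirror_devices(second, lost)   # {1: 0, 5: 1, 9: 2, 13: 3}
```

In this 4 x 4 example, the four blocks on device 1 of the first array have their copies on four different devices of the second array, so the load of a failed device is shared rather than falling on a single mirror partner.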
In a preferred implementation, the data storage devices in the first and second arrays of data storage devices contain rotating disks, and the mirrored data is contained at different radii on the rotating disks. Each of the data blocks contained in each of the data storage devices is contained at radii over a respective range of radii on each of the data storage devices. The data blocks in each row of the first matrix are contained in the data storage devices in the first array of data storage devices at the same range of radii. Moreover, the data blocks in each row of the second matrix are contained in the data storage devices in the second array of data storage devices at the same range of radii.
In a preferred implementation, the data storage devices in the first and second arrays of data storage devices contain rotating disks, and the mirrored data is stored in even and odd numbered circular tracks on planar surfaces of the rotating disks. The storage controller is further programmed for a read access to the specified portion of the mirrored data by issuing a read command to one of the data storage devices in the first array of data storage devices for reading one half of the specified portion of the mirrored data from odd numbered tracks and by issuing a read command to one of the data storage devices in the second array of data storage devices for concurrently reading another half of the specified portion of the mirrored data from even numbered tracks.
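The division of a read between odd and even numbered tracks can be sketched in Python as follows. The sketch assumes that the specified portion of the mirrored data is identified by a sequence of track numbers, and it shows only how the work is divided; issuing the two read commands concurrently is left out. The function name `split_read` and the dictionary keys are hypothetical.

```python
def split_read(tracks):
    """Split a requested set of tracks into two read commands."""
    tracks = list(tracks)
    odd = [t for t in tracks if t % 2 == 1]
    even = [t for t in tracks if t % 2 == 0]
    # Odd tracks are read from the first array, even tracks from the second.
    return {"first_array": odd, "second_array": even}


commands = split_read(range(10, 16))
# commands == {"first_array": [11, 13, 15], "second_array": [10, 12, 14]}
```

For a contiguous run of tracks, each device reads about half of the data, which is the basis for the concurrent access described above.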