In 1987, David A. Patterson, et al. reported a technique for saving redundant data in storage devices. See David A. Patterson, “A Case for Redundant Arrays of Inexpensive Disks (RAID),” University of California at Berkeley, Computer Science Division Report UCB:CSD-87-391 (December 1987); see also David A. Patterson, “A Case for Redundant Arrays of Inexpensive Disks (RAID),” ACM SIGMOD conference proceeding, Chicago, Ill., Jun. 1–3, 1988, pp. 109–116. This technique is based on a generalized method of concatenating multiple physical storage devices into one logical storage unit. When a high level device (e.g., host computer or dedicated controller) writes user data to this type of logical storage unit, the data may be divided into a number of parts corresponding to the number of physical storage devices. At the same time, redundant data may also be generated and stored among the several physical storage devices. In the event that one of the physical storage devices fails, the stored redundant data can facilitate the recovery of stored user data. The Patterson document described the following two methods for generating redundant data.
The first method for generating redundant data, referred to herein as the “method of read and modify,” generates redundant data from the combination of three sources: (1) write data from a high level device, (2) previous data stored in a storage device where the write data is to be stored, and (3) previous redundant data stored in a storage device where newly generated redundant data is to be stored.
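The combination of the three sources can be illustrated with a minimal sketch. It assumes the redundant data is a bytewise XOR parity, which is a common choice but is not stated in the text above; the function names and block values are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def read_modify_parity(write_data: bytes, old_data: bytes, old_parity: bytes) -> bytes:
    """Assumed XOR parity: new parity = old parity XOR old data XOR write data."""
    return xor_bytes(xor_bytes(old_parity, old_data), write_data)


# Three data blocks and their parity (hypothetical values).
d0, d1, d2 = b"\x0f\x10", b"\xf0\x01", b"\x33\x44"
parity = xor_bytes(xor_bytes(d0, d1), d2)

# Overwrite d1 and update the parity without reading d0 or d2.
new_d1 = b"\xaa\x55"
new_parity = read_modify_parity(new_d1, d1, parity)
```

Under the XOR assumption, the updated parity equals the parity that would be obtained by recomputing it from all three current data blocks, even though only the overwritten block and the old parity were read.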
Assuming that the write data is divided into α partitions, this method of generating redundant data requires α read operations to access previously stored data, α write operations to store the updated write data, one read operation to retrieve the previous redundant data, and one write operation to store the updated redundant data. To generate and store redundant data in this example, (α+1) read operations and (α+1) write operations are required, totaling (2α+2) input and output operations. If redundant data were not used, only α write operations would be required to store the data. This means that, using this method of generating redundant data, an additional (α+2) input and output operations are required.
The second method for generating redundant data, referred to herein as the “method of all stripes,” generates redundant data from partitioned write data received from the high level device and from previous data read from the storage devices. In this method, however, the devices which store redundant data do not read previously-stored redundant data.
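A minimal sketch of this second method follows, again assuming a bytewise XOR parity (an assumption, not something the text specifies). The helper `read_block` and all block values are hypothetical stand-ins for reads from the data storage devices.

```python
from functools import reduce
from typing import Callable, Dict


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def all_stripes_parity(new_blocks: Dict[int, bytes],
                       read_block: Callable[[int], bytes],
                       beta: int) -> bytes:
    """Parity over the whole stripe: use the new write data where it is
    supplied and read the previous data only for the other devices.
    The previous redundant data is never read."""
    blocks = [new_blocks[i] if i in new_blocks else read_block(i)
              for i in range(beta)]
    return reduce(xor_bytes, blocks)


# Hypothetical stripe across beta = 4 data devices.
stripe = [b"\x01", b"\x02", b"\x04", b"\x08"]
reads = []  # records which devices were actually read


def read_block(i: int) -> bytes:
    reads.append(i)
    return stripe[i]


# alpha = 3 partitions of write data arrive for devices 0, 1, and 2.
new_blocks = {0: b"\x10", 1: b"\x20", 2: b"\x40"}
parity = all_stripes_parity(new_blocks, read_block, beta=4)
```

In this example only device 3, the one data device not being overwritten, is read.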
With this method, assume again that the write data is divided into α partitions. Assume also that the number of storage devices, except those for saving redundant data, is β, and further that α<=β. In this method, then, the total number of I/O operations on the storage devices is (β+1), wherein the number of read operations is (β−α) and the number of write operations, including the one that stores the redundant data, is (α+1). If redundant data were not used, only α write operations would be required to store the data. This means that the additional number of input and output operations required is (β+1−α) when redundant data is generated by this second “method of all stripes.”
Apart from the foregoing methods, a method for generating redundant data in storage devices has been disclosed in U.S. Pat. No. 5,613,088, wherein the storage device uses two heads for simultaneous reading and writing. Specifically, the read head and the write head are fixed on a common actuator. During the data update process, the read head reads existing parity data and then the write head follows, updating the same area with new parity data generated, in part, from the old parity data.
The foregoing two methods for generating redundant data, that is, the “method of read and modify” and the “method of all stripes,” increase the number of input and output (I/O) operations associated with storing data on a disk. This means that the disk control device with redundancy is inferior in performance to the disk control device without redundancy. Hence, according to the present invention, a conventional disk control device with redundancy is made to selectively employ whichever redundant data generation method results in the smaller number of I/O operations to and from the storage device. This selection makes it possible to reduce the burden on the storage device and thereby improve the processing speed. Specifically, comparing the (2α+2) operations of the “method of read and modify” with the (β+1) operations of the “method of all stripes,” in the case of (α>=(β−1)/2) the “method of all stripes” will use no more storage device I/O operations than the “method of read and modify” (the two counts being equal when α=(β−1)/2), while in the case of (α<(β−1)/2), the “method of read and modify” will use a smaller number. Therefore, if the length of the write data received from the high level device is in the range of (α<(β−1)/2), for example, in the case of a short transaction process, a disk control device that is configured to use the present invention will select the “method of read and modify” to generate redundant data.
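The selection rule can be sketched as follows, using the operation counts given earlier, (2α+2) for the “method of read and modify” and (β+1) for the “method of all stripes.” The function names are hypothetical, and the sketch is only an illustration of the comparison, not the control device's actual logic. The two counts are equal when α=(β−1)/2; the sketch follows the text in choosing the “method of all stripes” in that case.

```python
def io_read_and_modify(alpha: int) -> int:
    """(alpha + 1) reads plus (alpha + 1) writes."""
    return 2 * alpha + 2


def io_all_stripes(beta: int) -> int:
    """(beta - alpha) reads plus (alpha + 1) writes."""
    return beta + 1


def select_method(alpha: int, beta: int) -> str:
    """Pick the method with the fewer storage device I/O operations.

    alpha: number of partitions of write data.
    beta:  number of storage devices, excluding those for redundant data.
    """
    if not 1 <= alpha <= beta:
        raise ValueError("expected 1 <= alpha <= beta")
    # alpha >= (beta - 1) / 2, written with integers only.
    if 2 * alpha >= beta - 1:
        return "all stripes"
    return "read and modify"


# With beta = 8 the all-stripes count is always 9.
for alpha in range(1, 9):
    print(alpha, io_read_and_modify(alpha), io_all_stripes(8),
          select_method(alpha, 8))
```

For β=8, a short write with α=1 costs 4 operations under the “method of read and modify” against 9 under the “method of all stripes,” while from α=4 upward the “method of all stripes” is the cheaper of the two.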
The number of I/O operations using the “method of read and modify” is minimized at four (4) when α=1. This means that when α=1, performance cannot be improved further unless the method of processing is reconsidered. The problem with the “method of read and modify” is essentially based on the fact that two I/O operations must be issued to the storage device for each partition of data. With each I/O operation there is significant overhead associated with such factors as movement of the head and rotational latency. This mechanical overhead is a great bottleneck on disk control devices.
The method disclosed in U.S. Pat. No. 5,613,088 makes it possible to generate redundant data in a storage device configured with a read head and a write head mounted on a single actuator. Expanding this method to a general storage device provided with a single read-write head, the resulting method, referred to herein as the method of “generation in a drive,” employs the following steps. First, the data to be written to the disk drive device (the “write data”) and the existing data that will eventually be updated with the “write data” (the “data before update”) are transferred to the actual physical storage device that is responsible for generating and storing the redundant data. Within this redundant data physical storage device, the existing redundant data that will be updated (the “redundant data before update”) is read, and the new redundant data is generated from the combination of the “write data,” the “data before update,” and the “redundant data before update.” In this method of “generation in a drive,” the head is positioned to read the “redundant data before update,” and the updated redundant data is calculated, and when the disk reaches the next writing position, the write operation is started and the updated redundant data is stored. This operation makes it possible to avoid the spinning on standby that normally occurs during the interval between reading and writing, and merely requires one movement of the head and one standby spin. As a result, if the length of the data from the high level device is short, the processing speed of the control device can be improved further.
However, the method of “generation in a drive” cannot always calculate and store redundant data in the most efficient manner. If the length of the generated redundant data is longer than can be stored within one spin of the disk, the method of “generation in a drive” will require the disk to spin on standby during the interval between reading the “redundant data before update” and writing the updated redundant data. This additional spinning on standby increases the time required by the drive to save the updated redundant data, and thereby increases the response time of the disk array device. Therefore, if the length of the redundant data is longer than can be stored within one spin of the disk, the method of “generation in a drive” will have a greater response time than if the redundant data could be stored within one spin of the disk.
With the method of “generation in a drive,” the volume of “data before update” that must be read from the component disk drives increases as the number of partitions of “write data” received from the high level device increases, thereby increasing the load placed on the data storage devices. Hence, if the number of partitions of “write data” is great, the method of “generation in a drive” disadvantageously lowers the throughput of the disk array device.
With the method of “generation in a drive,” the redundant data disk drive may be busy for a longer time during each spin of the disk than with the “method of read and modify.” This increases the burden placed on the redundant data disk drive in a highly multiplexed and high load environment. Hence, the method of “generation in a drive” may increase the probability that the redundant data disk drive will be in use, thereby lowering the throughput of the drive.
When write data is transferred from the high level device to the disk control device, together with an explicit specification of consecutive pieces of data, the method of “generation in a drive” operates to immediately generate redundant data for the transferred write data. As a result, when the succeeding write data is transferred from the high level device, the chance of using the “method of all stripes” to generate redundant data covering the first write data may already be lost. Hence, where the use of “generation in a drive” precludes the “method of all stripes,” the efficiency of generating redundant data is disadvantageously lowered, thereby degrading the throughput of the disk array device.
Finally, with the method of “generation in a drive,” the generation of redundant data may become unsuccessful upon the occurrence of any drive-related failure, such as the inability to read the “redundant data before update.” If this kind of failure occurs, the redundancy of the Error Correcting Code (ECC) group of the component disk drives of the disk array device may be lost at once.