One particular disk array system which provides a high degree of availability of the disks thereof is often referred to as a Redundant Array of Inexpensive Disks (RAID). Such a system uses an intelligent input-output (I/O) processor for accessing one or more data storage disks of the array in response to the needs of a host computer, each data storage disk of the array being driven by disk drive circuitry operating via the I/O processor in a manner that effectively appears to the host computer as a single disk drive. A data storage disk module comprises, for example, a disk, disk driver circuitry, and power/control circuitry. Alternatively, in some implementations of such systems, an I/O processor need not be used and the host computer may communicate directly with the disk modules which form an array.
In a particular RAID-5 context, for example, which comprises an array of five data storage disks, each disk has a plurality of data storage sectors, corresponding sectors in each of the five disks being referred to as a sector group or "stripe" of sectors. Each stripe includes one sector for holding redundant, or parity, data. The remaining sectors in the stripe store user data. The use of such redundancy allows for the reconstruction of user data in the event of a failure of a user data sector in the stripe.
When a user data disk module fails, the redundant or parity entry in the parity sector of a stripe, together with the data in the non-failed user data sectors of that stripe, can be used to reconstruct the user data that was in the sector of the failed disk. The system can thus remain operative using such reconstructed data even when the user data of that sector of the failed disk cannot be accessed.
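The reconstruction described above can be illustrated with a minimal sketch, assuming the parity sector holds the bytewise XOR of the user data sectors in the stripe. The function names (`xor_bytes`, `compute_parity`, `reconstruct`) and the 8-byte sector size are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def compute_parity(data_sectors):
    """Parity sector = XOR of all user data sectors in the stripe."""
    return reduce(xor_bytes, data_sectors)


def reconstruct(surviving_sectors, parity):
    """Rebuild the missing data sector from the parity and the survivors."""
    return reduce(xor_bytes, surviving_sectors, parity)


# A five-disk stripe: four user data sectors plus one parity sector.
data = [bytes([v] * 8) for v in (0x11, 0x22, 0x33, 0x44)]
parity = compute_parity(data)

# Suppose the disk holding data[2] fails.
failed = 2
surviving = data[:failed] + data[failed + 1:]
rebuilt = reconstruct(surviving, parity)
```

Because XOR is its own inverse, XORing the parity with every surviving data sector leaves exactly the contents of the missing sector.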
Certain kinds of failures, however, can leave the array in an incoherent or effectively unusable state. For example, power to the I/O processor (IOP) may fail, the I/O processor itself may fail due to a hardware defect, or power to the disk drives themselves may fail. A further problem can arise if such a failure results in the need to use a new IOP to replace a failed one and, after the new IOP has replaced the old IOP, there is no way to identify where a write operation to a sector of the array was taking place.
Techniques have been devised for handling such power failure situations that cannot be handled by RAID-5 systems as originally designed. Using conventional disk drives to perform a RAID-5 style write typically requires several distinct commands and associated data transfers. A RAID-5 write begins with a command to read old data from the target data drive. Then, new data is written to the data drive. A partial product is obtained by XORing the old and new data. A read command is issued to read the old parity from the disk drive containing the parity sector. The partial product is XORed with the old parity. Finally, the result of the XOR is written into the parity sector. If a write is interrupted by a failure, a mechanism is needed to identify where the interrupted write may have been started.
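The command sequence of a conventional RAID-5 write can be sketched as follows. This is a simplified in-memory model, not an actual controller implementation: each disk is assumed to be a list of sectors, and the names `raid5_small_write`, `data_idx`, and `parity_idx` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(disks, data_idx, parity_idx, sector, new_data):
    """Update one data sector and its parity using the steps in the text."""
    old_data = disks[data_idx][sector]           # read old data (pre-read)
    disks[data_idx][sector] = new_data           # write new data
    partial = xor_bytes(old_data, new_data)      # partial product
    old_parity = disks[parity_idx][sector]       # read old parity
    new_parity = xor_bytes(partial, old_parity)  # XOR partial product with old parity
    disks[parity_idx][sector] = new_parity       # write new parity


# Four data disks with one sector each, plus a parity disk at index 4.
disks = [[bytes([v] * 4)] for v in (1, 2, 3, 4)]
disks.append([reduce(xor_bytes, (d[0] for d in disks))])

raid5_small_write(disks, data_idx=1, parity_idx=4, sector=0,
                  new_data=bytes([9] * 4))

# The parity sector still equals the XOR of all data sectors.
expected_parity = reduce(xor_bytes, (d[0] for d in disks[:4]))
```

A failure between the write of the new data and the write of the new parity would leave the stripe inconsistent, which is the situation the validation fields discussed below are meant to detect.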
In order to handle this situation, Data General Corporation, the assignee of the present application, has used, in prior art systems, additional bytes at the end of a sector to include what have been called a checksum, a shed stamp, a time stamp and a write stamp. Typically a sector includes 512 bytes of host data and 8 bytes of validation data. The checksum verifies that the host data is correct and is on the correct disk sector. The shed stamp is a series of bits used to identify whether the parity sector of a particular sector group contains parity data or shed data. Shed data is used when the array is being operated in a degraded mode, in other words when one of the disk drives is not operating. Shed data is described in U.S. Pat. No. 5,305,326 (Solomon et al.), the disclosure of which is hereby incorporated by reference herein.
A time stamp is used to provide a check against data sector corruption from the last major update. Each time there is a major stripe update (i.e., a write command that causes all sectors in a stripe to be updated), the time stamp is set in each of the sectors that belongs to the particular stripe. The time stamp is a unique random number that is written into each of the sectors in the stripe at the time of the major stripe update. Thereafter, any change to the data in that sector causes the time stamp to be invalidated. The time stamp, thus, provides a validation mechanism for sectors that have not been updated since the last major stripe update.
The write stamp is a series of bits, one for each data storage disk in a sector group. During a major stripe update, the write stamps are all set to zero. Each time a write is performed, the bit corresponding to the disk drive being updated is flipped in the drive being updated. The corresponding bit is also flipped in the parity drive upon completion of the write. In order to flip the write stamp bit corresponding to the disk drive being written into, a pre-read must take place so as to know the old value. Only then is it known what the new value should be. The pre-read is performed anyway during a RAID-5 write into a conventional disk drive, since the old data must be retrieved in order to compute the partial product for writing the parity sector. The write stamp is used to provide a validation mechanism for sectors that have been updated since the last major stripe update.
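The write stamp bookkeeping can be sketched as below. The text states how the bits are set and flipped; the consistency check in `inconsistent_disks`, which compares each data drive's own bit against the matching bit on the parity drive, is an assumption about how the stamps might be validated and is not taken from the text. The class and method names are hypothetical.

```python
class StripeWriteStamps:
    """Write stamp bits for one stripe: one bit per data disk."""

    def __init__(self, n_data_disks: int):
        self.data_stamps = [0] * n_data_disks  # stamp stored on each data drive
        self.parity_stamp = 0                  # stamp stored on the parity drive

    def major_stripe_update(self):
        """All write stamps are set to zero."""
        self.data_stamps = [0] * len(self.data_stamps)
        self.parity_stamp = 0

    def write_data(self, disk: int):
        """Flip this disk's bit on the data drive; needs the old value (pre-read)."""
        old_stamp = self.data_stamps[disk]
        self.data_stamps[disk] = old_stamp ^ (1 << disk)

    def write_parity(self, disk: int):
        """Flip the corresponding bit on the parity drive when the write completes."""
        self.parity_stamp ^= 1 << disk

    def inconsistent_disks(self):
        """Assumed check: disks whose own bit disagrees with the parity drive's bit."""
        return [i for i, stamp in enumerate(self.data_stamps)
                if ((stamp ^ self.parity_stamp) >> i) & 1]


stamps = StripeWriteStamps(4)
stamps.write_data(2)                          # data written, parity not yet updated
interrupted = stamps.inconsistent_disks()
stamps.write_parity(2)                        # parity update completes
completed = stamps.inconsistent_disks()
```

Under this assumed check, a write interrupted between the data update and the parity update shows up as a mismatch in exactly one bit position.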
Data storage disk drives with the ability to coordinate parity updates through drive-based XORing will soon be available on the market. When such a disk drive is operated in the drive-based XOR mode, a host simply issues an XOR-WRITE command to the disk drive. This command causes the data drive to read the old data from the disk, write the new data to the disk, compute the partial product, issue a partial product XOR-WRITE to the parity drive and then return status to the host when the entire update is complete. Thus, the host only has to issue a single command. Parity is updated by the disk drives and there is no need for the host to issue any read command. Implementing the prior art write stamp validation mechanism on a disk drive performing drive-based XOR would reintroduce a pre-read and thereby slow system operation. Thus, it has become desirable to develop a new validation system that does not involve pre-reads as was the case for the write stamp.
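The division of labor in the drive-based XOR mode can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Drive` class, its method names, and the `"GOOD"` status string are hypothetical and do not represent an actual drive command set.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


class Drive:
    """Toy model of a drive that supports drive-based XOR."""

    def __init__(self, sectors):
        self.sectors = dict(sectors)

    def partial_product_xor_write(self, lba, partial):
        """Parity drive: XOR the received partial product into the stored parity."""
        self.sectors[lba] = xor_bytes(self.sectors[lba], partial)

    def xor_write(self, lba, new_data, parity_drive):
        """Data drive: perform the whole update in response to one host command."""
        old_data = self.sectors[lba]                           # drive reads old data
        self.sectors[lba] = new_data                           # drive writes new data
        partial = xor_bytes(old_data, new_data)                # drive computes partial product
        parity_drive.partial_product_xor_write(lba, partial)   # drive updates parity
        return "GOOD"                                          # status returned to the host


d0 = Drive({0: b"\x01\x01"})
d1 = Drive({0: b"\x02\x02"})
parity = Drive({0: xor_bytes(d0.sectors[0], d1.sectors[0])})

# The host issues a single command; it performs no reads of its own.
status = d0.xor_write(0, b"\x05\x05", parity)
```

In this model the host never sees the old contents of the sector, which is why a validation field that depends on a host pre-read does not fit this mode of operation.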