1. Field of the Invention
This invention relates to efficient management and storage of data in a RAID disk array device or a RAID disk array in a computing system. More particularly, this invention relates to optimization of invalidation of data and parity information in a prewrite area of a RAID disk array.
2. Description of the Background
In computing systems designed for large data processing and data storage applications, redundant storage devices are provided to enhance the integrity of data maintained on the system in the event of a failure of a storage device.
For example, RAID (Redundant Array of Independent Disks) technology, such as RAID-1, RAID-4, and RAID-5, utilizes an array of disk drives that hold data together with redundant information: a mirrored copy in RAID-1, and parity information in RAID-4 and RAID-5. The parity information is additional information stored on the disks and can be used to reconstruct data contained on any of the drives of the array in the event of a single drive failure. In this manner, these RAID disk arrays can improve the data integrity of the computing system by providing for data recovery despite the failure of a single disk drive. However, because of the redundancy of information stored in the device, these RAID devices have been characterized by slow processing times for a single logical "write" of data to the RAID device.
RAID architectures can include a RAID device which is a standalone self-contained storage unit having multiple disk drives included therein arranged in a RAID array. The RAID information processing is performed internally to the device and is transparent to the computing system attached thereto. Alternatively, a computing system may have an array of disks and perform the RAID information processing within the processor of the computing system. Throughout this application, these architectures are referred to interchangeably, and the terms RAID device and RAID disk array are used interchangeably.
Regardless of the RAID architecture employed, data and parity information must be synchronously maintained in order to prevent corruption of data. There is a chance that parity and data for a region of a disk may get out of synchronization due to a system failure or crash. When this happens, there is no indication of the problem until a disk drive fails, at which point the data returned by the RAID device will be incorrect.
In order to keep parity and data synchronized at all times, all write operations can be first placed in a "prewrite" area, having numerous prewrite slots, for temporary persistent storage, and then written to the actual logical blocks of the disk. This guarantees that if the host computer fails or crashes, or if the RAID device crashes, the data and parity can be kept in synchronization. The prewrite process uses the following steps:
1) write the data and parity to a prewrite area;
2) write the data and parity to actual logical blocks of the disks;
3) invalidate the data and parity in the prewrite area.
The "invalidation" step three is required to prevent data in the prewrite area from being erroneously backed up and corrupted by a system crash. Invalidation is defined as marking the prewrite data/parity as invalid or non-usable, preventing the information from being replayed upon initialization of the RAID array. Invalidation is performed after the parity and data stored in the prewrite slots have been physically written to their proper physical location on the RAID disk array. For example, a tag can be placed over each prewrite slot after the data/parity has been written to the disk indicating that the data/parity in the prewrite area is no longer valid and should not be used.
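The three prewrite steps, and the replay of still-valid slots on initialization, can be illustrated with a minimal in-memory sketch. It is not the implementation of any particular RAID device: the names `PrewriteSlot`, `ToyArray`, `logical_write`, `replay`, and the `crash_point` parameter are hypothetical, and a Python dictionary stands in for the disks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PrewriteSlot:
    block: int          # target logical block number (hypothetical layout)
    payload: bytes      # data or parity destined for that block
    valid: bool = True  # cleared by the invalidation step


@dataclass
class ToyArray:
    blocks: dict = field(default_factory=dict)    # logical block -> contents
    prewrite: list = field(default_factory=list)  # stand-in for the prewrite area

    def logical_write(self, data_blk: int, data: bytes,
                      parity_blk: int, parity: bytes,
                      crash_point: Optional[str] = None) -> None:
        # Step 1: write the data and parity to the prewrite area.
        slots = [PrewriteSlot(data_blk, data), PrewriteSlot(parity_blk, parity)]
        self.prewrite.extend(slots)
        if crash_point == "after_prewrite":
            return  # simulated crash: nothing has reached the logical blocks
        # Step 2: write the data and parity to the actual logical blocks.
        self.blocks[data_blk] = data
        if crash_point == "mid_write":
            return  # simulated crash: data written, parity still stale
        self.blocks[parity_blk] = parity
        # Step 3: invalidate the prewrite slots so they are not replayed.
        for slot in slots:
            slot.valid = False

    def replay(self) -> int:
        """On initialization, reapply every slot still marked valid."""
        replayed = 0
        for slot in self.prewrite:
            if slot.valid:
                self.blocks[slot.block] = slot.payload
                slot.valid = False
                replayed += 1
        return replayed
```

In this sketch, a crash simulated with `crash_point="mid_write"` leaves the data block updated and the parity block stale; calling `replay()` afterwards rewrites both from the prewrite slots, which restores synchronization. A write that completes all three steps leaves nothing for `replay()` to do.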
This unconditional invalidation step is expensive in time and performance, as it requires a separate disk write operation. The performance cost can be up to approximately 10 milliseconds per logical write operation to the RAID disk array.
FIG. 1 shows the steps disclosed in the co-pending application, "HOST-BASED RAID-5 AND NV-RAM INTEGRATION", referenced above, for performing a single "logical" write of new data in a RAID-5 device.
Operation 20 reads the old data from the disk, while operation 22 reads the old parity from the disk. Operations 20 and 22 are needed to calculate the new parity information. Operation 24 generates the new parity information by first removing the old data from the parity information, which can be achieved by an exclusive-OR operation. The new parity information is then generated by including the new data into the parity information, which can also be achieved using an exclusive-OR calculation.
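The parity update described for operation 24 can be sketched as two exclusive-OR passes over equal-length blocks. The function names `xor_blocks`, `update_parity`, and `full_parity` are illustrative assumptions and do not come from the referenced application.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise exclusive-OR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Compute new parity from old parity, old data, and new data."""
    without_old = xor_blocks(old_parity, old_data)  # remove the old data
    return xor_blocks(without_old, new_data)        # include the new data


def full_parity(data_blocks: list) -> bytes:
    """Parity of a whole stripe, used here only to cross-check the update."""
    return reduce(xor_blocks, data_blocks)
```

Because exclusive-OR is its own inverse, the result of `update_parity` should match the parity recomputed from every data block in the stripe, while needing only the two reads performed by operations 20 and 22.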
Having calculated the new parity information corresponding to the new data, operation 26 records or "prewrites" the new data and the new parity to a prewrite region of the disk. In this manner, if the computing system is interrupted or if a single disk in the RAID array fails before the new data and new parity are both completely written to the disk, the new parity/new data information will always be synchronized. As previously explained, synchronization between data and parity is needed to correctly reconstruct data stored on a failed disk drive.
Having permanently recorded the new data and new parity in the prewrite area of the disk, this information can now be transferred to the respective storage locations on the disk drives. Operation 28 writes the new data to the disk, and operation 30 writes the new parity information to the disk. In this manner, both the new data and the new parity are now synchronously maintained on the disk drive.
Operation 32 marks the logical write operation to the RAID device as complete. This operation would include invalidating the data and parity information stored by operation 26 in the prewrite area of the disk. Upon a system failure, the data and parity information which are stored in the prewrite area can be used to restore data if that prewrite data/parity has not been marked invalid.
The invalidation step requires two write operations--one write operation to mark the prewrite data as invalid, and one write operation to mark the prewrite parity as invalid. This is in addition to the six disk input/output operations previously described. Hence, one logical write of new data to the RAID device would require eight physical disk input/output operations to the RAID device, a costly process.
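The count of eight physical operations can be checked with a short tally keyed by the FIG. 1 operation numbers. The dictionary `FIG1_IO` and the helper `count_io` are hypothetical names used only for this illustration; the per-operation counts are taken from the description above.

```python
# Physical disk I/O operations per logical write, per the FIG. 1 description.
FIG1_IO = {
    "operation 20: read old data": 1,
    "operation 22: read old parity": 1,
    "operation 26: prewrite new data and new parity": 2,
    "operation 28: write new data": 1,
    "operation 30: write new parity": 1,
    "operation 32: invalidate prewrite data and parity": 2,
}


def count_io(ops: dict, include_invalidation: bool = True) -> int:
    """Sum the physical I/O operations, optionally leaving out invalidation."""
    return sum(
        count for name, count in ops.items()
        if include_invalidation or "invalidate" not in name
    )


total = count_io(FIG1_IO)                                      # all operations
before_invalidation = count_io(FIG1_IO, include_invalidation=False)
invalidation_only = total - before_invalidation
```

Under these counts, the two invalidating writes account for a quarter of the physical I/O issued for each logical write.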
What is needed is a device and method which is capable of minimizing the number of invalidating write operations while simultaneously ensuring the synchronization between parity and data on the RAID device.