Certain disk drive storage systems are configured in a log-structured manner wherein data is recorded in a compressed record format. A log-structured disk controller does not perform data record writes in place, but instead writes each data record to a new disk location that was previously empty. Thus, each write or update of data causes the data to be written to new physical locations. The previous physical locations of the data are subsequently "garbage collected" and reused for future writes. In a log-structured disk controller, a directory is maintained to map the addresses used by the system to the physical addresses at which the data is actually stored.
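The write behavior described above can be illustrated with a minimal sketch. The class name `LogStructuredStore`, its attributes, and the use of an in-memory list in place of physical disk locations are assumptions made for illustration only and do not come from any particular controller.

```python
class LogStructuredStore:
    """Toy model of a log-structured controller (hypothetical names)."""

    def __init__(self):
        self.log = []        # append-only list standing in for physical locations
        self.directory = {}  # system (logical) address -> physical index
        self.garbage = set() # superseded physical locations awaiting reuse

    def write(self, logical_addr, data):
        # Never update in place: always take a new, previously empty location.
        physical = len(self.log)
        self.log.append(data)
        old = self.directory.get(logical_addr)
        if old is not None:
            # The previous location becomes garbage to be collected later.
            self.garbage.add(old)
        self.directory[logical_addr] = physical
        return physical

    def read(self, logical_addr):
        # The directory resolves the system address to the current location.
        return self.log[self.directory[logical_addr]]
```

In this sketch, writing the same logical address twice yields two different physical indices, with the first index recorded in `garbage` and the directory pointing at the second.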
In general, when a record is read from a disk drive in a log-structured system in which each track holds compressed data records, the entire track is read, and the addressed record is accessed, decompressed and buffered for use by a host processor. In the decompression/selection action, a CRC character that was originally appended to the record is retrieved and a CRC character is calculated from the decompressed record. If the two CRC values match, the procedure continues with the knowledge that the record data was not corrupted.
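The per-record check described above may be sketched as follows. The choice of CRC-32 and of `zlib` compression, the 4-byte trailer layout, and the function names `store_record` and `read_record` are all assumptions for illustration; the text does not specify the CRC polynomial, the compression scheme, or the record layout.

```python
import zlib


def store_record(data: bytes) -> bytes:
    # Compute the CRC over the uncompressed record, then append it
    # to the compressed payload (assumed layout: payload + 4-byte CRC).
    crc = zlib.crc32(data)
    return zlib.compress(data) + crc.to_bytes(4, "big")


def read_record(stored: bytes) -> bytes:
    # Retrieve the CRC that was appended when the record was written.
    payload = stored[:-4]
    stored_crc = int.from_bytes(stored[-4:], "big")
    # Decompress, then recompute the CRC from the decompressed record.
    data = zlib.decompress(payload)
    if zlib.crc32(data) != stored_crc:
        raise ValueError("CRC mismatch: record data may be corrupted")
    return data
```

Note that the comparison is only possible after decompression, which is why a transfer that never decompresses the record has no opportunity to perform it.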
In general, there is no CRC value stored which corresponds to a full track of data. Accordingly, log-structured disk controllers are generally set to only check for individual record CRC values.
In order to maintain data integrity in the event of a malfunction, the prior art includes a number of methods for enabling data recovery. One such method is termed "mirroring", wherein a second copy of updated data is written to a backup disk. Many installations use on-the-fly creation of backup copies for critical databases. The backup copies are often physically removed from the primary disk drive. This process is also referred to as "extended distance dual copy".
The implementation of a mirroring system, such as an extended distance dual copy, requires substantial amounts of data communication between the primary and backup disk drives, even when the data on the primary disk drive is stored in compressed form. For instance, in U.S. Pat. No. 5,630,092 to Carreiro et al., assigned to the same Assignee as this application, a system and method are described wherein data records are mirrored between first and second disk drive systems. In both of the disk drive systems, the data is stored in compressed format and is transferred in compressed format therebetween. However, to accommodate a situation wherein compression actually creates an increase in the size of a data record, certain data is maintained in non-compressed form. To enable identification of the compressed/non-compressed states of the individual data records, meta-data values are attached to the records which indicate their compressed/non-compressed state. Thus, Carreiro et al. are able to implement a mirrored disk drive system wherein backup data transfer times are minimized through the transfer of minimal size data records.
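The idea of keeping a record in non-compressed form when compression would enlarge it, and of tagging each record with its state, can be sketched as below. This is not the method of the cited patent; the one-byte flag, the constants `COMPRESSED` and `RAW`, the function names, and the use of `zlib` are hypothetical choices made only to illustrate the general technique.

```python
import zlib

# Hypothetical meta-data values marking the state of each record.
COMPRESSED = 1
RAW = 0


def encode_record(data: bytes) -> bytes:
    packed = zlib.compress(data)
    # Keep the compressed form only when it is actually smaller.
    if len(packed) < len(data):
        return bytes([COMPRESSED]) + packed
    # Otherwise store the record as-is and flag it as non-compressed.
    return bytes([RAW]) + data


def decode_record(record: bytes) -> bytes:
    # The leading flag byte tells the reader how to interpret the body.
    flag, body = record[0], record[1:]
    return zlib.decompress(body) if flag == COMPRESSED else body
```

A highly repetitive record would be stored compressed under this sketch, whereas a very short record, for which the compression overhead exceeds any saving, would be stored raw; in both cases the smaller form is the one transferred.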
When performing a mirroring action, such as taught by Carreiro et al., a data integrity issue is raised because full tracks are accessed and transferred without decompression. Accordingly, there is no opportunity to check the individual CRC values stored with the compressed records or to otherwise check the integrity of the overall track during the succeeding transfers between mirroring processing actions.
Accordingly, it is an object of this invention to provide an improved method and apparatus for assuring data integrity during a mirroring operation.
It is another object of this invention to provide a method and apparatus which enables data integrity checks when compressed tracks are handled during a mirroring transfer.