1. Field of the Invention
The present invention generally relates to mass storage as used in computer systems. In particular, the present invention relates to verifying the integrity of data from a storage device in a computer system.
2. Description of the Related Art
Computer systems frequently store voluminous quantities of data on mass storage devices. A hard drive or a disk drive is one form of mass storage. Popular interface formats that are used for hard drives include the various versions of the small computer system interface (SCSI) and the AT attachment (ATA) interface standards.
Those in the art have sought to use low cost drives, such as ATA drives, in relatively high reliability applications to save cost. However, an end user of an off-the-shelf hard drive often has no convenient or timely way of identifying whether the drive selected is reliable or unreliable.
A disk drive typically includes an internal firmware driven controller, which can be prone to firmware bugs. One example of a firmware bug that results in corrupt data is a firmware bug in code that is responsible for caching hard disk data to a memory buffer. In addition, a drive can occasionally seek to an incorrect location on a hard disk platter. For example, when a host computer or controller requests data from logical block address A, the drive can unexpectedly return data from logical block address B instead of logical block address A.
In a conventional drive, the erroneous seek occurs without warning or indication and the host system is unaware that the drive has erroneously provided data from a wrong location. Although error checking protocols exist, the error checking in a conventional drive is limited to the verification of data transmitted from the drive to a host on an interconnect system, e.g., error checking within an ATA interface. Where the data in the drive is already corrupted by, for example, seeking to the wrong physical location, conventional error checking schemes may fail to detect the error.
One conventional approach to improving the reliability of a drive embeds error checking information in a non-standard size sector. For example, one conventional approach uses special hard drives that store error checking information in two or more bytes of the non-standard size sector. A standard sector contains 512 bytes. By contrast, the special hard drives store error checking information in the extra two or more bytes of the larger than standard size sectors. A disadvantage of the special drives is a loss in economies of scale, as the special drives differ from standard off-the-shelf drives and are produced in much smaller quantities.
Embodiments of the present invention overcome the disadvantages of current systems by providing techniques that allow ordinary disk drives to verify that a seek to a track has been properly commanded by verifying that the desired sector has been accessed. The techniques can apply to single disk drives or to multiple disk drive systems such as in a redundant array of inexpensive disks (RAID). One embodiment maintains a reference to the logical block address of a cluster in an extra sector of the cluster, which allows the embodiment to verify that the seek had been properly executed.
Other embodiments according to the present invention advantageously maintain an error detection code, such as a Cyclic Redundancy Check (CRC) checksum, in an extra sector of the cluster that can be used to verify the integrity of the remainder of the data in the cluster. In yet another embodiment, both the reference to the logical block address and the error detection code are stored in the extra sector.
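By way of illustration only, an error detection code of this kind can be sketched as follows. The sketch assumes a 16-bit CRC using the CCITT polynomial (0x1021) with an initial value of 0xFFFF, and assumes a hypothetical layout in which one 2-byte CRC per data sector is packed into the start of the extra sector. The function names, the initial value, and the layout are assumptions made for the example and are not taken from the embodiments described herein.

```python
import struct

SECTOR_SIZE = 512  # standard sector size in bytes


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with the CCITT polynomial x^16 + x^12 + x^5 + 1."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def build_extra_sector(data_sectors: list) -> bytes:
    """Pack one big-endian CRC per data sector into a zero-padded extra sector.

    The per-sector layout is a hypothetical choice for this sketch.
    """
    crcs = b"".join(struct.pack(">H", crc16_ccitt(s)) for s in data_sectors)
    return crcs.ljust(SECTOR_SIZE, b"\x00")


def verify_cluster(data_sectors: list, extra_sector: bytes) -> bool:
    """Recompute the CRCs on read and compare with the stored extra sector."""
    return build_extra_sector(data_sectors) == extra_sector
```

On a read, the stored extra sector is compared against a freshly computed one; a mismatch indicates that at least one data sector no longer matches what was written.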
One embodiment of the present invention groups sectors in the disk drive into clusters of sectors. The cluster referenced herein is different than the cluster used in a file allocation table (FAT). The cluster of sectors according to an embodiment of the present invention includes multiple input/output data sectors and at least one “extra” sector. The extra sector maintains error checking information that can be used to verify the data in the data sectors, to verify that a read/write head has performed a seek to the correct track, and the like. The error checking information is recalculated upon extraction of the data from the cluster and compared with the previously stored calculation. In one example, a reference to a logical block address of a sector in the cluster is stored in the extra sector. In another example, the data verification portion of the error checking information conforms to a CRC-CCITT polynomial. The extra sectors occupy a portion of the storage space of the disk drive and the logical block addresses used by a host computer system are translated to new logical block addresses used by the disk drive. A number of sectors requested for transfer can also be translated to compensate for the sectors occupied by the extra sectors.
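The address translation described above can be sketched as follows, purely as an illustration. The sketch assumes a hypothetical cluster of eight data sectors followed by one extra sector, and assumes that a transfer is widened to whole clusters so that each extra sector is read along with its data. The cluster size, the placement of the extra sector, and all function names are assumptions made for the example.

```python
DATA_SECTORS_PER_CLUSTER = 8  # assumed number of data sectors per cluster
CLUSTER_SIZE = DATA_SECTORS_PER_CLUSTER + 1  # plus one extra sector


def host_to_drive_lba(host_lba: int) -> int:
    """Translate a host LBA to the drive LBA of the same data sector."""
    cluster, offset = divmod(host_lba, DATA_SECTORS_PER_CLUSTER)
    return cluster * CLUSTER_SIZE + offset


def extra_sector_lba(host_lba: int) -> int:
    """Drive LBA of the extra sector of the cluster containing host_lba."""
    cluster = host_lba // DATA_SECTORS_PER_CLUSTER
    return cluster * CLUSTER_SIZE + DATA_SECTORS_PER_CLUSTER


def translate_request(host_lba: int, count: int) -> tuple:
    """Widen a host request to whole clusters on the drive.

    Returns (drive_start_lba, drive_sector_count), covering every data
    sector and extra sector of each cluster the request touches.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    first = host_lba // DATA_SECTORS_PER_CLUSTER
    last = (host_lba + count - 1) // DATA_SECTORS_PER_CLUSTER
    return first * CLUSTER_SIZE, (last - first + 1) * CLUSTER_SIZE
```

Under these assumptions, a host request for four sectors starting at host LBA 6 spans two clusters, so the drive transfer starts at drive LBA 0 and covers 18 sectors.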
According to one embodiment of the present invention, to perform a write operation to the disk drive, the old data from the data group of the cluster is first read from the disk drive, then modified with the new data, and then written back to the disk drive. The read-modify-write process allows a computation of the error checking information to be performed quickly and efficiently. A memory buffer can also be used to temporarily store the data to be written to the disk drive.
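The read-modify-write sequence can be sketched as follows, for illustration only. The sketch models the disk drive as an in-memory bytearray, assumes a hypothetical cluster of eight data sectors followed by one extra sector, and assumes a single CRC computed over the whole data group using Python's `binascii.crc_hqx`, which implements the CRC-CCITT polynomial. All names and the layout are assumptions made for the example.

```python
import binascii

SECTOR_SIZE = 512
DATA_SECTORS = 8                    # assumed data sectors per cluster
CLUSTER_SECTORS = DATA_SECTORS + 1  # plus one extra sector
DATA_LEN = DATA_SECTORS * SECTOR_SIZE


def make_extra_sector(data: bytes) -> bytes:
    """Store a CRC over the whole data group at the start of the extra sector."""
    crc = binascii.crc_hqx(data, 0xFFFF)
    return crc.to_bytes(2, "big").ljust(SECTOR_SIZE, b"\x00")


def write_sectors(disk: bytearray, cluster: int, offset: int, new_data: bytes) -> None:
    """Read-modify-write new_data into a cluster at data-sector `offset`."""
    start = offset * SECTOR_SIZE
    if len(new_data) % SECTOR_SIZE or start + len(new_data) > DATA_LEN:
        raise ValueError("write must be whole sectors within one cluster")
    base = cluster * CLUSTER_SECTORS * SECTOR_SIZE
    # 1. Read the old data group into a memory buffer.
    buffer = bytearray(disk[base:base + DATA_LEN])
    # 2. Modify the buffer with the new data.
    buffer[start:start + len(new_data)] = new_data
    # 3. Recompute the error checking information and write everything back.
    disk[base:base + DATA_LEN] = buffer
    disk[base + DATA_LEN:base + DATA_LEN + SECTOR_SIZE] = make_extra_sector(bytes(buffer))


def cluster_is_valid(disk: bytearray, cluster: int) -> bool:
    """Recompute the CRC of a cluster and compare it with the stored value."""
    base = cluster * CLUSTER_SECTORS * SECTOR_SIZE
    data = bytes(disk[base:base + DATA_LEN])
    extra = bytes(disk[base + DATA_LEN:base + DATA_LEN + SECTOR_SIZE])
    return extra == make_extra_sector(data)
```

In this sketch the read step is needed because the CRC covers the entire data group, so a partial write cannot update the extra sector without the unchanged sectors being available in the buffer.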
In another embodiment of the present invention, an indicator of a location of a cluster of sectors is stored in an extra sector of the disk drive. In one embodiment, the indicator corresponds to the logical block address (LBA) of the first data sector of the cluster of sectors. By maintaining a reference to the physical location of the accessed sector, the embodiment can detect whether the correct sector has been accessed. If an erroneous seek occurred, another seek can be commanded to the hard drive, by, for example, setting an interrupt to a firmware controller. In one embodiment, the other seek can include a command to move the read/write head to other tracks, a command to flush a memory cache, and the like.
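The seek verification can be sketched as follows, for illustration only. The sketch assumes the LBA of the first data sector is stored as an 8-byte big-endian integer at the start of the extra sector, and it models the recovery action as a simple repeated read. The `FlakyDrive` class is a hypothetical stand-in for a drive that returns a wrong cluster a fixed number of times; an actual recovery could instead command a seek to other tracks or flush a memory cache, as described above.

```python
import struct

SECTOR_SIZE = 512
DATA_SECTORS = 8  # assumed data sectors per cluster


def make_extra_sector(first_data_lba: int) -> bytes:
    """Store the LBA of the cluster's first data sector in the extra sector."""
    return struct.pack(">Q", first_data_lba).ljust(SECTOR_SIZE, b"\x00")


def stored_lba(extra_sector: bytes) -> int:
    """Extract the LBA recorded in an extra sector."""
    return struct.unpack_from(">Q", extra_sector)[0]


def read_verified(read_cluster, expected_lba: int, max_retries: int = 2):
    """Read a cluster and retry if the recorded LBA shows a wrong location."""
    for _ in range(max_retries + 1):
        data, extra = read_cluster(expected_lba)
        if stored_lba(extra) == expected_lba:
            return data, extra
    raise OSError(f"cluster at LBA {expected_lba} could not be verified")


class FlakyDrive:
    """Hypothetical drive that returns a neighboring cluster `bad_reads` times."""

    def __init__(self, bad_reads: int):
        self.bad_reads = bad_reads
        self.reads = 0

    def read_cluster(self, lba: int):
        self.reads += 1
        # Simulate an erroneous seek by returning the next cluster's contents.
        actual = lba + DATA_SECTORS + 1 if self.reads <= self.bad_reads else lba
        return b"\x00" * (DATA_SECTORS * SECTOR_SIZE), make_extra_sector(actual)
```

A caller would pass the drive's read routine and the expected address, for example `read_verified(FlakyDrive(1).read_cluster, 18)`, which succeeds on the second attempt in this sketch.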