1. Field of Invention
This invention relates to data storage systems.
2. Related Art
Many computer applications need to store and retrieve information. Information can be stored on hard disks, floppy disks, CD-ROMs, semiconductor random-access memory (RAM) and similar storage devices. Many of these storage systems are susceptible to data loss of various forms, including disk failures. A solution to the problem of disk failure involves use of a RAID (redundant array of independent disks) system. One style of RAID system uses multiple hard drives and stores parity data generated from the data drives, either on a separate drive (known as the parity disk) or spread out among the multiple drives. The use of multiple hard drives makes it possible to replace faulty hard drives without going off-line; data contained on a drive can be rebuilt using the other data disks and the redundant data contained in the parity disk. If a hard drive fails, a new hard drive can be inserted by “hot-swapping” drives while on-line. The RAID system can then rebuild the data on the new disk using the remaining data disks and the redundant data of the parity disk. The performance of a RAID system is improved by disk striping, which interleaves bytes or groups of bytes across multiple drives, so that more than one disk is reading and writing simultaneously. Files are broken into chunks of data known as file blocks, and these file blocks are stored in one or more physical sectors of one or more hard disks. Each file block is of a given size, such as 4,096 bytes, which occupies eight 512-byte sectors.
A first known problem with storage devices is that they are susceptible to data corruption, including bit flips, misdirected I/O, lost I/O, sector shifts, and block shifts. One style of RAID uses parity data to determine whether some data included in a disk stripe has been corrupted. Parity is computed by taking the exclusive-OR (henceforth “XOR”) of the blocks in the data stripe. Parity is checked by comparing the parity value stored on disk against the parity value computed in memory. If the stored and computed values of parity are not the same, the data may be corrupt. If a single disk block is known to be incorrect, the RAID system includes enough data to restore that block by recalculating it from the parity data and the remaining data in the data stripe. However, such RAID systems cannot determine from parity values alone which disk includes the corrupt data. Thus, although parity data is useful in determining whether corruption has occurred, parity data by itself does not include enough information to restore the corrupted data, because it is unclear which data has been corrupted.
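The parity behavior described above can be illustrated with a minimal sketch. The sketch assumes a hypothetical stripe of three 4-byte data blocks and one parity block; the names `xor_blocks`, `stripe_is_consistent` and `rebuild` are illustrative only and are not taken from any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, stored_parity):
    """Compare the stored parity against parity recomputed from the data."""
    return xor_blocks(data_blocks) == stored_parity


def rebuild(data_blocks, stored_parity, missing_index):
    """Recompute one block from the parity and the remaining data blocks.

    The caller must already know which block is missing or bad.
    """
    survivors = [b for i, b in enumerate(data_blocks) if i != missing_index]
    return xor_blocks(survivors + [stored_parity])


# Hypothetical stripe contents (assumed for illustration).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# A known-failed block (index 1) can be rebuilt from the rest of the stripe.
rebuilt = rebuild(data, parity, 1)

# Silent corruption of block 1: parity detects that something is wrong...
corrupted = list(data)
corrupted[1] = b"BxBB"
mismatch = not stripe_is_consistent(corrupted, parity)

# ...but "repairing" any one of the blocks makes the stripe consistent again,
# so parity alone cannot say which block is actually the corrupt one.
candidates = [rebuild(corrupted, parity, i) for i in range(len(corrupted))]
```

In this sketch every entry of `candidates` yields a self-consistent stripe when substituted at its index, which is why the mismatch alone does not identify the corrupt block.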
Checksums are another form of redundant data that can be written to individual disks. The combination of parity bits across the disks along with checksums and their associated information may include enough information so that the corrupted data can be restored in RAID and other redundant systems.
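One way the combination of checksums and parity might be used is sketched below. The sketch assumes one checksum per data block, uses CRC-32 (via `zlib.crc32`) purely as a stand-in because no checksum algorithm is specified here, and assumes a single parity block per stripe; `repair_stripe` is a hypothetical helper name.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_stripe(data_blocks, stored_checksums, stored_parity):
    """Locate a corrupt block by checksum, then restore it from parity.

    Returns (blocks, bad_index); bad_index is None when every block
    matches its stored checksum.
    """
    bad = [
        i
        for i, (block, checksum) in enumerate(zip(data_blocks, stored_checksums))
        if zlib.crc32(block) != checksum
    ]
    if not bad:
        return list(data_blocks), None
    if len(bad) > 1:
        # A single parity block can restore only one block per stripe.
        raise ValueError("more than one block failed its checksum")

    index = bad[0]
    survivors = [b for i, b in enumerate(data_blocks) if i != index]
    repaired = list(data_blocks)
    repaired[index] = xor_blocks(survivors + [stored_parity])

    # Confirm the restored block against its stored checksum.
    if zlib.crc32(repaired[index]) != stored_checksums[index]:
        raise ValueError("restored block does not match its stored checksum")
    return repaired, index


# Hypothetical stripe contents (assumed for illustration).
original = [b"AAAA", b"BBBB", b"CCCC"]
checksums = [zlib.crc32(block) for block in original]
parity = xor_blocks(original)

damaged = list(original)
damaged[2] = b"CCxC"  # silent corruption of the third block
repaired, bad_index = repair_stripe(damaged, checksums, parity)
```

Here the checksum supplies the information that parity lacks (which block is wrong), and parity supplies the information that the checksum lacks (what the block should contain).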
A second known problem involves using a sector checksum for each sector of data. A sector checksum is generated for each collection of data that can fill a sector. The data is stored in a disk sector, along with the associated sector checksum. Some known systems reformat a collection of hard disks from a standard sector size, such as 512 bytes, to a larger sector size, such as 520 bytes, so that each sector can include its sector checksum. Data corruption within the disk sector can then be detected by using the sector checksum, because the stored checksum would not match a computed checksum. However, data corruption such as sector slides, misdirected reads and writes, and lost sectors would not be detected at the disk sector level. For this type of corruption, the data and the checksum in a sector remain consistent with each other, so a checksum computed from the sector data would match the stored checksum.
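The limitation of per-sector checksums can be illustrated with a small model. The sketch assumes a disk represented as a dictionary from sector number to a (data, checksum) pair, uses CRC-32 only as an example checksum, and simulates a misdirected write with a hypothetical `misdirected_write` helper; none of these names come from an actual disk interface.

```python
import zlib

SECTOR_DATA_BYTES = 512  # data portion of a sector; the checksum is stored alongside


def make_sector(data):
    """Store data together with its sector checksum."""
    return (data, zlib.crc32(data))


def sector_checksum_ok(sector):
    """Check a sector's data against the checksum stored in the same sector."""
    data, checksum = sector
    return zlib.crc32(data) == checksum


def misdirected_write(disk, intended, actual, data):
    """Simulate a write meant for `intended` that lands on `actual` instead."""
    disk[actual] = make_sector(data)


# Hypothetical 16-sector disk; sector n is initially filled with byte value n.
disk = {n: make_sector(bytes([n]) * SECTOR_DATA_BYTES) for n in range(16)}

# Case 1: a bit flip inside sector 3 is caught by the sector checksum.
flipped = bytearray(disk[3][0])
flipped[0] ^= 0x01
disk[3] = (bytes(flipped), disk[3][1])

# Case 2: a write intended for sector 5 lands on sector 9.
new_data = b"\xff" * SECTOR_DATA_BYTES
misdirected_write(disk, intended=5, actual=9, data=new_data)

bit_flip_detected = not sector_checksum_ok(disk[3])
stale_sector_passes = sector_checksum_ok(disk[5])      # old data, valid checksum
wrong_place_sector_passes = sector_checksum_ok(disk[9])  # new data, wrong location
```

In this model both sector 5 (which still holds stale data) and sector 9 (which holds data belonging elsewhere) pass their sector checksums, so neither the lost update nor the misplaced data is detected at the sector level.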
A third known problem is storing checksums in reserved locations separate from the associated data. A separate read or write operation of the checksum is required for every read or write operation of the associated data. This can result in performance loss in some workloads.
Accordingly, it would be advantageous to provide an improved technique for the error checking and correction of data storage systems. This is achieved in an embodiment of the invention that is not subject to the drawbacks of the related art.