In data storage devices and systems, such as hard disk drives (HDDs), the combination of poor write/read conditions and low signal-to-noise ratio (SNR) data detection is likely to cause a mixed error mode of long error bursts and random errors in the data sectors stored on the disk. Typically, byte-alphabet Reed-Solomon (RS) codes are used to format the stored sector data bytes into codewords protected by redundant check bytes, which are used to locate and correct byte errors in the codewords.
Long codewords protect more efficiently against long bursts of errors because the redundant check-byte overhead is averaged over a long data block. In data storage devices, however, long codewords cannot be used without a read-modify-write (RMW) process, because the present logical unit data sector is 512 bytes long and present host operating systems assume a 512-byte sector logical unit.
Each RMW process costs a revolution of the data storage medium, and lost revolutions lower input/output (I/O) command throughput. Frequent use of the RMW process is therefore prohibitive.
The combination of low-SNR detection and poor write/read conditions makes this mixed error mode of random errors and long bursts of byte errors increasingly likely at the high areal densities and low flying heights toward which the HDD industry is trending. Such mixed-mode combinations of random and burst errors are likely to cause the 512-byte-sector interleaved on-the-fly (OTF) ECC to fail, resulting in more frequent use of a data recovery procedure (DRP) that involves rereads, moving the head, and the like.
These DRP operations lose disk revolutions and thus lower I/O throughput. This performance loss is unacceptable in many applications, such as audio-visual (AV) data transfer, which will not tolerate frequent interruptions of video data streams. On the other hand, uniformly protecting every single sector against both random and burst errors at the 512-byte logical unit sector format would require excessive and unacceptable check-byte overhead. Such check-byte overhead also increases the soft error rate because it increases the linear density of the data.
A long-block data ECC, such as a 4-Kbyte physical block comprising eight sectors, for example, could be a solution for some applications, but it would require a change in the operating system standard unless read-modify-write (RMW) is accepted when writing single 512-byte sectors. Present operating systems are all based on a 512-byte sector logical unit, and RMW is required to update the long physical block check bytes: when a single 512-byte sector is written, the other sectors in the long block must be read, the long-block check bytes recalculated, and the whole long block rewritten. The resulting I/O throughput loss is generally unacceptable for typical HDD operation.
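The RMW cycle just described can be sketched as follows. This is a hypothetical illustration, not the actual drive firmware: the 4-Kbyte block of eight 512-byte sectors is from the example above, while the use of a simple byte-wise XOR to stand in for the real Reed-Solomon check bytes is an assumption made purely to keep the sketch short.

```python
from functools import reduce

SECTOR_SIZE = 512        # 512-byte logical unit sector, per the text
SECTORS_PER_BLOCK = 8    # eight sectors per 4-Kbyte physical block

def block_checks(sectors):
    """Toy block-level redundancy: byte-wise XOR of all data sectors.
    (A real design would use Reed-Solomon check bytes instead.)"""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), sectors)

def read_modify_write(block, sector_index, new_sector):
    """Update one 512-byte sector inside a long block.

    All the other sectors must be read, the check bytes recalculated
    over the whole block, and the whole block rewritten -- the I/O
    cost the text attributes to RMW.
    """
    sectors = list(block["sectors"])         # 1. read the other sectors
    sectors[sector_index] = new_sector       # 2. modify the target sector
    block["sectors"] = sectors
    block["checks"] = block_checks(sectors)  # 3. recalculate check bytes
    return block                             # 4. rewrite the whole block
```

The point of the sketch is that step 1 forces a read of sectors the host never asked for, which is what costs the extra disk revolution.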
Therefore, it would be desirable to have an ECC format for a data storage device that has a low sector failure rate under the mixed error mode of random and burst errors, that avoids frequent DRP or RMW use, and that has an acceptable check-byte overhead. Accordingly, there is a need for a multiple-level (ML), integrated sector format (ISF), error correction code (ECC) encoding and decoding process for data storage devices and systems or communication devices and systems. A system and associated method that satisfy this need are disclosed in U.S. patent application Ser. No. 10/040,115, supra.
A problem specifically associated with the present invention is the possible loss of part or all of a data cluster (for example, one 8-sector data cluster) within a cluster block; as an illustration, the cluster block may comprise 16 data clusters. The problem with replacing an unreadable or erased sector (or sectors) using a parity sector in a parity cluster, as in an on-drive RAID-5 system, is the lack of verification that the parity sectors are consistent with the data clusters in the cluster block. What is still needed, therefore, is a function capable of checking the reliability of parity-sector correction across a cluster block against miscorrection.
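A minimal sketch of the on-drive RAID-5-style recovery, and of the hazard noted above, follows. All names are illustrative assumptions; the XOR parity scheme is the standard RAID-5 construction, not a detail taken from the cited application.

```python
def xor_sectors(sectors):
    """Byte-wise XOR of a list of equal-length sectors."""
    out = bytearray(len(sectors[0]))
    for s in sectors:
        for i, b in enumerate(s):
            out[i] ^= b
    return bytes(out)

def rebuild_erased(data_sectors, parity_sector, erased_index):
    """Recover one erased sector from the parity sector plus the
    surviving data sectors in the cluster block.

    This is exactly the hazard the text points out: if the parity
    sector is stale or inconsistent with the data clusters, this
    'recovery' silently produces a wrong sector. Nothing here
    verifies consistency, which is why an independent miscorrection
    check is needed.
    """
    survivors = [s for i, s in enumerate(data_sectors) if i != erased_index]
    return xor_sectors(survivors + [parity_sector])
```

With a consistent parity sector the rebuilt sector matches the original; with a stale one it does not, and without a cross-check the drive cannot tell the two cases apart.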
Another specific problem relates to the odd-boundary-write operation: a write operation that does not begin at the first logical block address (LBA) of a physical 8-sector data cluster, or that does not end at the last LBA of such a cluster. Completing the second- and third-level (C2/C3) encoding for an 8-sector data cluster in the presence of an odd-boundary-write operation requires a read-modify-write (RMW) operation.
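The odd-boundary condition defined above reduces to simple LBA arithmetic. The sketch below is an assumed formalization of that definition, using the 8-sector cluster size from the text.

```python
CLUSTER_SECTORS = 8  # sectors per physical data cluster, per the text

def is_odd_boundary(start_lba, sector_count):
    """True when a write does not begin at the first LBA of a cluster
    or does not end at the last LBA of a cluster."""
    starts_aligned = start_lba % CLUSTER_SECTORS == 0
    ends_aligned = (start_lba + sector_count) % CLUSTER_SECTORS == 0
    return not (starts_aligned and ends_aligned)
```

A write of exactly one or more whole clusters is boundary-aligned; any write that begins or ends mid-cluster leaves a fragment whose C2/C3 checks cannot be completed without reading the rest of the cluster.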
For sequential writes, the RMW operation would be required infrequently, so completing the C2/C3 protection may not pose a considerable problem. For random writes, however, completing the C2/C3 protection may be unacceptable from a performance viewpoint. There is therefore a need to avoid the RMW operation while concurrently completing the C2/C3 protection in an 8-sector data cluster whose ISF protection is fragmented by an odd-boundary write.
Yet another specific problem is the occurrence of a data erasure, a "jami," that can wipe out a sector written inside a data cluster for which C3 encoding has been completed. There is therefore a need to introduce a readability state of the sectors within a data cluster: a "virtual" byte that is not actually written on the disk but is encoded into the C3 checks. The readability state of the sectors should be updated and re-encoded into the C3 checks during the drive scrub operation, so that the C3 checks serve as miscorrection checks (i.e., a CRC) for a higher-level protection.