In conventional recording of data to tape media, the smallest unit written to tape is the Data Set. The Data Set contains two types of data: user data and administrative information about the Data Set, the latter being in the Data Set Information Table (DSIT). All data is protected by an error correction code (ECC) to minimize data loss due to errors or defects. The Data Set comprises a number of Sub Data Sets, each containing data arranged in rows. A Sub Data Set row may contain user data or contain the DSIT. As illustrated in FIG. 1, each row consists of two interleaved byte sequences. A first level ECC (C1 ECC) is computed separately for the even bytes and for the odd bytes for each row. The resulting C1 ECC even and odd parity bytes are appended to the corresponding row, also in an interleaved fashion. The ECC protected row is the Codeword Pair (CWP). The even bytes form the even C1 Codeword while the odd bytes form the odd C1 Codeword. A second level ECC (C2 ECC) is computed for each column and the resulting C2 ECC parity bytes are appended to the corresponding columns. The ECC protected column is a C2 Codeword.
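The row layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_parity` is a hypothetical stand-in (a strided XOR checksum) for the C1 ECC, which is not specified here, and the row length and parity count are arbitrary. The sketch shows only how the even and odd byte sequences are separated, protected independently, and re-interleaved into a Codeword Pair.

```python
from functools import reduce


def toy_parity(data: bytes, n_parity: int) -> bytes:
    """Hypothetical stand-in for the C1 ECC; NOT a real error correction code.

    Parity byte i is the XOR of data[i], data[i + n_parity], ...
    """
    return bytes(
        reduce(lambda a, b: a ^ b, data[i::n_parity], 0) for i in range(n_parity)
    )


def interleave(even: bytes, odd: bytes) -> bytes:
    """Merge two equal-length sequences as even[0], odd[0], even[1], odd[1], ..."""
    if len(even) != len(odd):
        raise ValueError("sequences must have equal length")
    out = bytearray(len(even) + len(odd))
    out[0::2] = even
    out[1::2] = odd
    return bytes(out)


def make_codeword_pair(row: bytes, n_parity: int = 2) -> bytes:
    """Append interleaved even/odd parity to a row, giving a Codeword Pair."""
    if len(row) % 2:
        raise ValueError("row must contain an even number of bytes")
    even, odd = row[0::2], row[1::2]          # the two interleaved sequences
    even_parity = toy_parity(even, n_parity)  # computed over even bytes only
    odd_parity = toy_parity(odd, n_parity)    # computed over odd bytes only
    return row + interleave(even_parity, odd_parity)


row = bytes(range(8))
cwp = make_codeword_pair(row)
even_codeword = cwp[0::2]  # even data bytes followed by even parity
odd_codeword = cwp[1::2]   # odd data bytes followed by odd parity
```

Because the parity bytes are interleaved in the same way as the data, taking every second byte of the Codeword Pair recovers each C1 Codeword whole.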
The Sub Data Set, when so protected by C1 and C2 ECC, is the smallest ECC-protected unit written to tape. Each Sub Data Set is independent with respect to ECC; that is, errors in a Sub Data Set affect only that Sub Data Set. The power of any ECC algorithm depends upon the number of parity bytes and is stated in terms of its correction capability. For a given number of C1 ECC parity bytes computed for a C1 Codeword, up to K1 errors may be corrected in that C1 Codeword. Similarly, for a given number of C2 ECC parity bytes computed for a C2 Codeword, up to K2 errors may be corrected in that C2 Codeword.
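As a worked illustration of how correction capability follows from the parity count, the sketch below assumes a Reed-Solomon-style code, for which the number of byte errors correctable at unknown positions is half the number of parity bytes, rounded down. The code family is an assumption, since it is not named here, and the parity counts in the example are arbitrary values chosen for illustration.

```python
def max_correctable_errors(n_parity_bytes: int) -> int:
    """Correctable byte errors per codeword, ASSUMING a Reed-Solomon-style code.

    Such a code with p parity bytes can correct up to p // 2 byte errors
    whose positions are not known in advance.
    """
    if n_parity_bytes < 0:
        raise ValueError("parity byte count cannot be negative")
    return n_parity_bytes // 2


# Arbitrary example parity counts, not taken from any tape format.
k1 = max_correctable_errors(6)
k2 = max_correctable_errors(10)
```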
It will be appreciated that multiple errors in the same Sub Data Set can exceed the C1 or the C2 correction capability, with the result that an uncorrectable error occurs when the data is read. Errors may be caused by very small events such as small particles or small media defects. Errors may also be caused by larger events such as scratches, tracking errors or mechanical causes.
To mitigate the possibility that a single large error will affect multiple Codewords in a single Sub Data Set, some methods of writing place Codewords from each Sub Data Set as far apart as possible along and across the tape surface. A single error would therefore have to affect multiple Codewords from the same Sub Data Set before the ECC correction capability is overwhelmed. Spatial separation of Codewords from the same Sub Data Set reduces the risk and is accomplished in the following manner for a multi-track recording format. For each track of a set of tracks being recorded simultaneously, a Codeword Quad (CQ) is formed by combining a Codeword Pair from one Sub Data Set with a Codeword Pair from a different Sub Data Set. The resulting CQ is written on one of the multiple recorded tracks. In like manner, CQs are formed for all remaining tracks by combining Codeword Pairs, all Codeword Pairs being from differing Sub Data Sets. The group of CQs written simultaneously is called a CQ Set.
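The formation of a CQ Set can be sketched as follows. The round-robin assignment of Sub Data Sets to tracks used here is an assumption made purely for illustration; an actual format may rotate or permute the assignment differently. The function name `form_cq_set` and its parameters are likewise hypothetical.

```python
def form_cq_set(
    cq_set_index: int, n_tracks: int, n_sub_data_sets: int
) -> list[tuple[int, int]]:
    """Return, per track, the two Sub Data Set numbers supplying that track's CQ.

    ASSUMPTION: Sub Data Sets are assigned in simple round-robin order.
    """
    pairs_per_cq_set = 2 * n_tracks  # each CQ holds two Codeword Pairs
    if n_sub_data_sets < pairs_per_cq_set:
        raise ValueError("too few Sub Data Sets for all Codeword Pairs to differ")
    first = cq_set_index * pairs_per_cq_set
    cq_set = []
    for track in range(n_tracks):
        a = (first + 2 * track) % n_sub_data_sets
        b = (first + 2 * track + 1) % n_sub_data_sets
        cq_set.append((a, b))  # one CQ: pairs from two different Sub Data Sets
    return cq_set


cq_set_0 = form_cq_set(0, n_tracks=16, n_sub_data_sets=64)
sources = [s for cq in cq_set_0 for s in cq]
# Every Codeword Pair in the CQ Set comes from a different Sub Data Set.
all_distinct = len(set(sources)) == len(sources)
```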
For example, in a 16-track recording format, there are 16 CQs in a CQ Set, comprising 32 Codeword Pairs. If there were 64 Sub Data Sets in a Data Set, two CQ Sets could be recorded before it became necessary to record a second Codeword Pair from a given Sub Data Set. FIG. 2 illustrates a portion of the Data Set as recorded on tape. The shaded cells indicate the row number for the eight Codeword Pairs taken from the same Sub Data Set. The arrow 200 indicates longitudinal separation of the Codeword Pairs along a track and the arrow 202 indicates transverse separation across tracks. As will be appreciated, a large defect would have to span multiple shaded cells in order to overwhelm the ECC in any one Sub Data Set.
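The arithmetic of the 16-track example can be checked with a short calculation. The function name `cq_sets_before_reuse` is hypothetical, and the sketch assumes only what is stated above: each CQ carries two Codeword Pairs, and every Codeword Pair in a CQ Set comes from a different Sub Data Set.

```python
def cq_sets_before_reuse(n_tracks: int, n_sub_data_sets: int) -> int:
    """Number of CQ Sets writable while using each Sub Data Set at most once."""
    pairs_per_cq_set = 2 * n_tracks  # one CQ per track, two Codeword Pairs per CQ
    return n_sub_data_sets // pairs_per_cq_set


pairs_per_cq_set = 2 * 16                # 32 Codeword Pairs in a 16-track CQ Set
print(cq_sets_before_reuse(16, 64))      # 2
```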