1. Field of the Invention
This invention relates generally to recording and retrieving data on storage media and, more particularly, to accurately recording and retrieving data over defects in storage media.
2. Description of the Related Art
Track-oriented magnetic digital data storage media, such as magnetic tape and magnetic disk, comprise a magnetic coating applied to a substrate. As the magnetic storage media moves relative to read/write heads of a storage device, data can be magnetically recorded and read over the media in tracks that are divided into blocks or cells. Such storage media typically have surface defects of varying sizes that can create errors in the recording and reading of data. Surface assessment techniques are used to identify the location and extent of the defect areas, and defect management systems ensure the correct recording and reading of data despite the defects. Conventionally, one of two defect management techniques is employed, depending on the size of the defect and the resulting error. Small defect areas produce small errors that are corrected using error correction code (ECC) processing. Larger defect areas are skipped over, either by labeling an entire recording area as defective and moving the storage media relative to the read/write heads or by writing a gap code over the defect area.
The ECC method comprises continuously processing a data field as it is written and appending to the data a redundancy field whose value is determined by the data content. When the data field is retrieved, it is processed in a similar manner, together with the redundancy field, to generate error syndromes. An error control code with a two byte redundancy field, for example, may be capable of reliably detecting up to two bytes in error and correcting one byte in error in a thirty-two byte data field. Codes of this type are commonly interleaved to provide protection against errors spanning several bytes. If the received data field is held in a small first-in-first-out buffer, then many errors in the data can be corrected with only a small delay to the first data field. These methods provide a convenient means for a storage subsystem to manage small media defects or other data transmission problems that result in short error bursts spanning only a few bytes. When this type of correction is possible, the data flow from the storage device is not interrupted.
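The benefit of interleaving can be illustrated with a minimal Python sketch. It is not taken from any particular storage device: the function names, the interleave depth, and the burst position are assumptions chosen for illustration, and the code only counts how many bytes of a contiguous burst fall in each codeword rather than performing any actual error correction.

```python
def interleave(codewords):
    """Lay out equal-length codewords on the media byte by byte:
    byte 0 of every codeword, then byte 1 of every codeword, and so on."""
    depth = len(codewords)
    length = len(codewords[0])
    return [codewords[i % depth][i // depth] for i in range(depth * length)]


def errors_per_codeword(burst_start, burst_len, depth):
    """Count how many bytes of a contiguous media burst land in each
    of `depth` interleaved codewords."""
    counts = [0] * depth
    for pos in range(burst_start, burst_start + burst_len):
        counts[pos % depth] += 1
    return counts


def burst_is_correctable(burst_start, burst_len, depth, correctable_per_codeword=1):
    """A burst is correctable if no codeword receives more errors than
    the code can correct (assumed here: one byte per codeword)."""
    counts = errors_per_codeword(burst_start, burst_len, depth)
    return max(counts) <= correctable_per_codeword
```

Under these assumptions, a four-byte burst starting at media position 5 places one error in each codeword at an interleave depth of 4, which a one-byte-correcting code can handle. The same burst places all four errors in a single codeword when no interleaving is used (depth 1), and a five-byte burst exceeds the protection even at depth 4.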
The larger defect areas cause recording and reading errors too large to be corrected by ECC processing and therefore are skipped by designating an entire recording block or cell as defective. A defective block is skipped by moving the read/write head relative to the storage media to a designated alternate, spare block area of the media or by writing a gap field to move the head relative to the defect area as described, for example, in U.S. Pat. No. 3,997,876 to Frush. In either case, the read/write head of a storage device is moved relative to the storage media and the reading and recording of data is suspended so that the defect area is skipped by the head without reading and recording of data and without ECC processing. To increase the speed with which the large defects can be skipped, all skip operations typically are of the same size. Thus, for every defect too large to be corrected by ECC processing, the extent of storage media skipped will be the same size regardless of the defect size.
In general, the size of the skips is dependent on the data format of the storage device. For example, in a fixed-block architecture storage device, blocks of a predetermined size, comprising hundreds or even thousands of bytes per block, are defined in tracks across the surface of the storage media. The blocks are filled in with data during recording operations. A defect-containing area of the storage media is skipped by skipping the entire data block in which the data otherwise would be recorded. Instead, the data is recorded in a spare block of the storage media that is reserved for just such purposes. In a count-key data storage device, data is recorded on the storage media in tracks comprised of cells somewhat smaller than a typical data block. Each track includes a header field that identifies the track number and an offset value that specifies the number of cells to any defect area in the track. At a defect area, writing is suspended and storage device clock synchronization is lost, a predetermined number of cells are skipped, clock synchronization is acquired again, and writing is resumed. Subsequent data blocks are shifted down by the number of cells skipped, using a reserved spare space at the end of a track. In either block skipping method for defects too large for ECC processing, the size of the storage media area skipped is predetermined using statistical information concerning the expected size and number of defects for the particular storage media. The skipped area must be no smaller than the number of bytes that could be handled by ECC processing and must be larger than the largest expected defect area.
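The cell-skipping scheme of the count-key data format can be sketched as a mapping from logical to physical cell positions. The following Python is a simplified illustration only: the function name, the two-cell skip size, and the representation of defect offsets as physical cell indices are all assumptions, and the sketch further assumes that skipped regions do not overlap.

```python
SKIP_CELLS = 2  # assumed fixed skip size in cells (hypothetical value)


def physical_cell(logical_cell, defect_offsets, skip_cells=SKIP_CELLS):
    """Map a logical cell index to a physical cell index on the track.

    `defect_offsets` holds the physical cell index at which each
    skipped region begins. Every region at or before the current
    position shifts the data down by `skip_cells` cells.
    """
    physical = logical_cell
    for offset in sorted(defect_offsets):
        if offset <= physical:
            physical += skip_cells
    return physical
```

For example, with a single defect at physical cell 3 and a two-cell skip, logical cells 0 through 2 map to themselves, while logical cell 3 maps to physical cell 5. Each skip pushes the tail of the data further toward the spare space reserved at the end of the track, regardless of how small the defect actually is.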
Both block skipping methods can be inefficient if the size of the defect area being skipped is small relative to the size of the storage media area being skipped. For example, a defect that is relatively small, although still too large for correction by ECC processing, consumes an entire recording block or cell. A block or cell can be, for example, 4K bytes of disk space, whereas a defect too large for correction by ECC processing can be less than 100 bytes long. Thus, over 3K bytes of storage media surface area is commonly wasted by block skipping methods.
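The arithmetic behind this example can be written out as a short Python sketch. It assumes that 4K bytes means 4096 bytes and uses 100 bytes as an upper bound on the defect size; both values and the function name are illustrative.

```python
def wasted_bytes(block_size, defect_size):
    """Bytes of usable media lost when a whole block is skipped
    to avoid a defect of `defect_size` bytes."""
    return block_size - defect_size


block = 4 * 1024   # assumed block size: 4K bytes = 4096 bytes
defect = 100       # assumed defect size: upper bound from the example

waste = wasted_bytes(block, defect)
fraction = waste / block  # share of the skipped block that was usable

print(f"{waste} bytes wasted ({fraction:.1%} of the block)")
```

Under these assumptions, 3996 of the 4096 bytes in the skipped block were usable, so well over 90 percent of the skipped area is lost to a defect that occupies only a small part of it.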
Because storage media defects can arise after an initial surface assessment and at any time during the useful life of a magnetic storage medium, block skipping methods must reserve space for future defects by allocating spare block areas even in the absence of detected defects. This creates relatively large amounts of unused storage media area. Such overhead increases the inefficiency of block skipping methods. Finally, many disk array subsystems incorporate multiple devices that transfer data in parallel. Block skipping interrupts such parallel data transfer, requiring complex control mechanisms to restore parallelism or minimize the interruption.
From the discussion above it is apparent that there is a need for a defect management system that can process storage media defects that are too large to be handled by ECC processing, without allocating large amounts of storage media space to skip these relatively small defects. The present invention satisfies this need.