The present invention is directed generally to the recording and retrieval of digital information on magnetic tape, and more particularly to methods and procedures for recovering from errors occurring during data transfer operations.
Conventional tape drive data storage apparatus employ various error correction and recovery methods to detect and correct data errors which, if left unresolved, would compromise the integrity of information read from or written to the magnetic tape media. Events which can lead to data errors include defects on the media, debris between the tape head and the media, and other conditions that interfere with head/media data transfer operations.
Error correction and recovery may be thought of as two distinct operations that are employed at different stages of error processing. Error correction is conventionally implemented using error correction coding (ECC) techniques in which random host data to be placed on a tape medium is encoded in a well-defined structure by introducing data-dependent redundancy information. The presence of data errors is detected when the encoded structure is disturbed. The errors are corrected by making minimal alterations to reestablish the structure. ECC error correction is usually implemented "on-the-fly" as data is processed by the tape drive apparatus. The well-known Reed-Solomon code is one cyclic encoding scheme which has been proposed for ECC error correction. Other encoding schemes are also known in the art.
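By way of illustration only, the encode, detect, and correct cycle described above may be sketched with a Hamming(7,4) code. This code is substituted here purely for brevity; it is not the Reed-Solomon code named above, and nothing in this sketch is taken from any particular tape drive. The function names are hypothetical. Three parity bits are the data-dependent redundancy, a nonzero syndrome indicates that the encoded structure has been disturbed, and flipping the single bit the syndrome points to is the minimal alteration that reestablishes the structure.

```python
# Illustrative sketch only: Hamming(7,4) stands in for a production ECC
# such as Reed-Solomon. All function names here are hypothetical.

def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_syndrome(codeword):
    """Return 0 if the structure is intact, else the 1-based position in error."""
    c = [0] + list(codeword)  # pad so that indices match positions 1..7
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    return s1 + 2 * s2 + 4 * s4


def hamming74_correct(codeword):
    """Correct at most one flipped bit; return (corrected codeword, position)."""
    fixed = list(codeword)
    position = hamming74_syndrome(fixed)
    if position:
        fixed[position - 1] ^= 1  # minimal alteration: flip a single bit
    return fixed, position


def hamming74_decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]
```

A codeword with one flipped bit is restored by hamming74_correct, after which hamming74_decode returns the original data bits. Two or more flipped bits exceed the correcting power of this small code, which corresponds to the situation in which ECC error correction fails and error recovery must take over.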
Error recovery occurs when ECC error correction is unable to correct data errors. The error recovery process usually requires stopping the tape and reprocessing a data block in which an error was detected. Error recovery can include a variety of hardware and microcode configuration options. Microcode-based error recovery utilizes software routines executed by the tape drive microprocessor. Hardware-based error recovery utilizes hardwired logic components and systems controlled by microcode that adjust the hardware parameters. Typical microcode-controlled error recovery options include tape refresh operations wherein a tape is wound to its end and brought back to the error recovery point, tape backhitch or "shoeshine" operations wherein a tape is drawn back and forth across the tape head, and backward tape read operations, to name a few. Typical hardware-controlled error recovery options include tape tension adjustments, tape servo adjustments, track voting threshold adjustments, track synchronization adjustments, ECC pointer realignment, and changes to block and inter-block gap detection thresholds.
Error recovery often involves implementation of a complex series of hardware and microcode configuration options that are designed to be used in an optimum sequence of retry attempts. As the number of options increases, the number of retry attempts grows with the newly created configuration permutations. When the total number of error recovery retries exceeds a reasonable maximum, a decision must often be made to eliminate or reduce recovery scenarios. In prior art error recovery systems, those decisions are made at the design level, where it is difficult to assess what benefit overlooked options might have provided. The advantages of the unimplemented options are thereby lost.
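The growth in retry attempts may be illustrated with a short sketch. It assumes, for illustration only, that each retry attempt applies one setting of every configuration option, so that the number of distinct attempts is the product of the number of settings per option. The option names and setting values shown are hypothetical and are not drawn from any actual drive.

```python
# Illustrative sketch only: counts combined retry configurations under the
# assumption that one retry applies one setting of every option.
from itertools import product
from math import prod


def count_retry_configurations(options):
    """Return the number of distinct combinations of option settings."""
    return prod(len(settings) for settings in options.values())


def enumerate_retry_configurations(options):
    """Return every combination as a dict mapping option name to setting."""
    names = list(options)
    return [
        dict(zip(names, combo))
        for combo in product(*(options[name] for name in names))
    ]


# Hypothetical recovery options and their settings.
example_options = {
    "tape_tension": ["nominal", "low", "high"],
    "read_direction": ["forward", "backward"],
    "voting_threshold": ["nominal", "relaxed"],
}
```

With the three hypothetical options above, having 3, 2, and 2 settings respectively, count_retry_configurations returns 12. Adding a fourth option with 3 settings raises the count to 36, which suggests how quickly a reasonable maximum number of retries may be exceeded.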
As to error recovery options that are implemented at the design level, one or more of such recovery options may be effective for some error recovery conditions but not others. For example, errors caused by localized tape defects, or by track fading due to debris adhering to either the tape media or the read/write heads, can affect tracks for long stretches of tape as the debris is dragged along. Errors of this type can often be resolved by reversing tape motion and dislodging the debris. In that case, other error recovery procedures may be unnecessary but will be performed in any event in accordance with the preprogrammed sequence, resulting in processing delays.
Accordingly, there is a need in the art for a system and method for recording and retrieving digital information on a tape wherein data error conditions are resolved in an efficient manner. Rather than perform a preprogrammed sequence of error recovery procedures, it would be desirable to tailor error recovery to the immediate cause of the data error while maintaining an ability to apply additional error recovery configuration options as required. This would provide flexibility in tactical decisions while pursuing an overall recovery strategy.