Data read from disk drives manifests both hard and soft errors. A hard error is one that often results from a mechanical fault in the system and is not correctable by error correction logic even after many rereads and error recovery procedures. A soft error is an intermittent, recoverable error that does not, by itself, affect the overall reliability of the disk drive system. When too many errors occur, however, they exceed the ability of the disk drive's error correction logic to correct them.
Disk drive products employ thin film inductive, MIG-ferrite, or magnetoresistive heads to read and write data. It is known that these transducers occasionally exhibit instabilities which affect data read from the disk. Further, such transducers may exhibit multiple response modes wherein a read back signal is significantly different in one mode than in another mode. Some of these modes result in signal distortion that is sufficient to cause hard or uncorrectable error states during the read operation.
Procedures have been developed over the years to allow recovery from such transducer instabilities. For instance, it is known that passing a write current through a write element of the transducer can often change the state of the transducer to a more desirable one, removing the cause of error generation. Most disk drives currently being shipped have a built-in error recovery procedure which includes "write correction" as one of its steps. The write correction operation generally is performed only after several attempts to reread and error correct the erroneous data have failed.
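The ordering described above, rereads with error correction first and write correction only afterward, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `recover_sector`, `SimulatedHead`, the retry count of three, and the convention that a failed read returns `None` are not taken from any particular drive's firmware.

```python
from typing import Callable, Optional


def recover_sector(
    read_with_ecc: Callable[[], Optional[bytes]],
    write_correction: Callable[[], None],
    max_rereads: int = 3,  # assumed retry budget, not from the source
) -> Optional[bytes]:
    """Return corrected data, or None if the error stays uncorrectable."""
    # Stage 1: reread and apply error correction a fixed number of times.
    for _ in range(max_rereads):
        data = read_with_ecc()
        if data is not None:
            return data
    # Stage 2: pass write current through the write element to change the
    # transducer state, then attempt one more read.
    write_correction()
    return read_with_ecc()


class SimulatedHead:
    """Hypothetical head that reads badly until write correction is applied."""

    def __init__(self) -> None:
        self.stable = False
        self.reads = 0

    def read_with_ecc(self) -> Optional[bytes]:
        self.reads += 1
        return b"data" if self.stable else None

    def write_correction(self) -> None:
        self.stable = True


head = SimulatedHead()
result = recover_sector(head.read_with_ecc, head.write_correction)
```

In this simulation the three rereads fail, write correction is applied, and the fourth read succeeds.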
Transducer-induced error states may also occur when a transducer is in a state which avoids hard errors, yet nevertheless produces a high soft error rate. Prior art disk drive systems monitor the data error rate and generate a system information message indicating that an uncorrectable error state has occurred if an error rate threshold is exceeded. In some applications, a file will be in a read only mode for long periods of time. If a transducer head is in a moderately unstable state that produces only a high rate of soft errors, and that rate does not lead to an unrecoverable error state, a head "toggle" procedure will not be implemented and the high soft error rate will persist.
A "toggle" procedure will often improve the operation of a transducer, be it an MIG-ferrite, thin film inductive, or magnetoresistive transducer. For MIG-ferrite and thin film inductive transducers, the most common form of toggle procedure is to cause the transducer to perform a write operation in an area not over usable data and then to reread the written data. This action will normally change the magnetic structure of the transducer adequately to render a poor reading state into a better reading state. Sometimes more than one toggle procedure is required to improve the performance of the transducer.
The prior art is replete with error recovery procedures. U.S. Pat. No. 4,866,712 to Chao describes a multi-stage system recovery strategy which performs a re-initialization or re-setting operation at a lowest level, before a threshold is reached which would necessitate replacement of a component. More specifically, an error table is provided with one entry for each possible error and contains a count increment for each corrective action that might be taken to correct the error. An error count threshold is provided for each possible corrective action. The Chao system operates to accumulate error count increments against possible actions and, when a corresponding threshold is exceeded, initiates a corrective action.
U.S. Pat. No. 4,993,029 to Galbraith et al. describes a system for randomizing data stored in a disk drive which avoids recording of data patterns that may stress the ability of the disk drive's error recovery circuitry to identify and correct read data errors. U.S. Pat. No. 4,922,491 to Coale describes a service alert function for disk drive subsystems when an error threshold has been reached.
U.S. Pat. No. 5,090,014 to Polich et al. describes a system which logs errors and identifies when components, such as disk drives, are likely to fail. More specifically, an expert system retrieves error entries and processes them to determine whether a failure is likely to occur. U.S. Pat. No. 3,704,363 to Salmassy et al. describes a system for logging data errors in a disk drive. The log information provides an accumulated count of a total number of various types of usage, while error information provides an accumulated count of the total number of various types of errors counted during the usage.
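The table-driven accumulation attributed to Chao above can be sketched as follows. This is only an illustration of the general idea as summarized here, not the patented implementation: the error names, action names, increments, thresholds, and the choice to reset a count once its action fires are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical error table: one entry per error, holding a count increment
# for each corrective action that might address that error.
ERROR_TABLE: Dict[str, Dict[str, int]] = {
    "seek_timeout": {"reset_controller": 2, "replace_drive": 1},
    "read_crc": {"reset_controller": 1},
}

# Hypothetical error count threshold for each corrective action.
THRESHOLDS: Dict[str, int] = {"reset_controller": 4, "replace_drive": 3}


class ErrorAccumulator:
    def __init__(self, table: Dict[str, Dict[str, int]],
                 thresholds: Dict[str, int]) -> None:
        self.table = table
        self.thresholds = thresholds
        self.counts: Dict[str, int] = defaultdict(int)

    def record(self, error: str) -> List[str]:
        """Accumulate increments for one error; return the actions triggered."""
        triggered = []
        for action, increment in self.table[error].items():
            self.counts[action] += increment
            # An action is initiated only when its threshold is exceeded.
            if self.counts[action] > self.thresholds[action]:
                triggered.append(action)
                self.counts[action] = 0  # assumption: start over after acting
        return triggered


acc = ErrorAccumulator(ERROR_TABLE, THRESHOLDS)
first = acc.record("seek_timeout")   # reset_controller=2, replace_drive=1
second = acc.record("seek_timeout")  # reset_controller=4, replace_drive=2
third = acc.record("read_crc")       # reset_controller=5, exceeds 4
```

With these assumed values, the low-level reset is triggered on the third error, while the count held against drive replacement remains below its threshold.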
U.S. Pat. No. 5,053,892 to Supino, Jr., et al. describes a disk drive transducer toggling procedure which, upon experiencing an uncorrectable error state, moves the transducer to a non-data containing area and subjects the transducer to a toggle operation. Such toggle operation is not performed if the number of errors in data read from a normal data track is within the capacity of the error correction circuitry to correct. Only when the error correction circuitry indicates an uncorrectable error state is the toggle operation commenced.
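The trigger condition just described can be written as a one-line predicate, which makes its limitation easy to see. The function name, the per-read error counts, and the correction capacity of four are hypothetical values chosen for illustration only.

```python
def should_toggle_prior_art(errors_in_read: int, ecc_capacity: int) -> bool:
    """Toggle only when the errors exceed what error correction can fix."""
    return errors_in_read > ecc_capacity


# Assumed sequence of reads from a moderately unstable head: every read has
# errors, but each stays within the assumed correction capacity of 4.
reads = [3, 4, 3, 4, 3]
toggles = sum(should_toggle_prior_art(n, ecc_capacity=4) for n in reads)
```

Under these assumptions no toggle is ever requested, even though every read contains errors, because each read remains correctable.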
For magnetoresistive transducers, there are several ways to change the state of the transducer. Magnetoresistive transducers have separate write and read elements. Application of write potentials to the write element of a magnetoresistive transducer can cause the state of the magnetoresistive transducer to change. Magnetoresistive transducers require a bias current to work properly. The state of the transducer can often be changed by simply turning the bias on or off or selecting a slightly different bias current. All of the above techniques constitute a "toggle".
Accordingly, it is an object of this invention to provide an improved method and system for reducing errors in data read from a disk drive.
It is a further object of this invention to provide a system and method for reducing soft errors experienced during operation of a disk drive.
It is yet another object of this invention to provide a soft error correction procedure for a disk drive which comes into effect prior to the occurrence of an unrecoverable error state.