A magnetic disk drive is a device for storing data. In a magnetic disk drive, data is written to and read from a magnetic disk surface. To read or write data, a head, i.e., a signal transducer, is positioned at a desired track. Two types of operations are conventionally provided in a disk drive to deal with errors occurring during reading and writing: an error recovery procedure, and a data reassign operation. An error recovery procedure (ERP) is conventionally known as an operation for recovering from an error which occurred while reading or writing data. The data reassign operation moves data from a part of the disk determined to be defective to a predetermined alternate area on the disk.
An error can occur in the disk drive due to a flaw in, or unevenness of, the magnetic material introduced in the manufacturing process. Errors can also occur due to changes in the magnetic material over time. If an error occurs in the data stored on the disk, various error recovery processes may be executed. The error recovery processes include error correction by an error-correcting code (ECC), changing the read gain, changing the offtrack value, and changing the bias value of the Magneto-Resistive (MR) element for reading. After each error recovery process is executed, the read operation is executed again. If the error persists even after a plurality of error recovery processes have been executed, the error is an unrecoverable data error or a "hard error". Data from a defective area of the disk may be reassigned to a new location as part of the error recovery process. Data is reassigned by rewriting the data recorded in the area of the disk in which the error occurred to another predetermined area on the disk, and this may be done even if the error recovery process was successful.
Recently, MR heads have become widely used in disk drives. An MR head has an MR element whose resistance changes with changes in the magnetic field. To read data, a predetermined current is supplied through the MR element so that the change in resistance is converted to a d.c. voltage.
However, it was found that a new type of error can occur in disk drives using MR heads. This error is due to thermal asperity. A thermal asperity is a protrusion which appears on the surface of the disk. The protrusion collides with the MR head, causes the resistance of the MR element to change due to a temperature change in the MR element, and thereby generates an abnormal signal.
As a countermeasure against the error from thermal asperity, one method reduces the rotational speed of the disk to decrease the flying height of the magnetic head. By reducing the flying height of the head, the protrusion on the disk can be cut and the cause of the thermal asperity can be removed. This countermeasure is called low spin burnish and is also incorporated as one of the steps in an error recovery procedure (ERP).
As described above, various recovery means can be used to recover from errors which occur while reading or writing data. The recovery means are usually stored as a series of continuous steps in the ERP and are executed sequentially by a command from the system.
If an error occurs while reading or writing data, the hard disk drive (HDD) executes the ERP. Error recovery is attempted by executing one or more error recovery steps in the ERP, e.g., changing some standard reading conditions. For example, the reading conditions may include: an offtrack amount, i.e., the amount of deviation between the center of the track and the center of the magnetic head; a value of the bias current provided to the MR element if the magnetic head has an MR element; and parameters of the PLL circuit to stabilize the sampling frequency.
The individual steps in the ERP are executed sequentially, and a retry, i.e., an attempt to reread the data, is carried out at the end of each step. If the retry succeeds, the ERP ends. If the retry fails, the next step is executed; the ERP terminates without recovering the data when a preset maximum number of retries is reached or when the last step of the ERP is completed.
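The sequential execution described above can be sketched as follows. This is a minimal illustration only, not the firmware of any actual drive: the names `ErpStep`, `run_erp`, `apply`, and `retry_read` are hypothetical, and each step is assumed to consume exactly one retry.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ErpStep:
    """One recovery step: change a reading condition before the retry."""
    name: str
    apply: Callable[[], None]  # hypothetical hook, e.g. change the offtrack amount


def run_erp(steps: List[ErpStep],
            retry_read: Callable[[], bool],
            max_retries: int) -> Optional[str]:
    """Run the steps in order.

    Returns the name of the step whose retry succeeded, or None if the
    ERP ended without recovering the data.
    """
    retries = 0
    for step in steps:
        if retries >= max_retries:
            break              # preset maximum number of retries reached
        step.apply()           # change one reading condition
        retries += 1
        if retry_read():       # reread the data at the end of the step
            return step.name   # retry succeeded: the ERP ends
    return None                # last step completed, or retry limit reached
```

A result of `None` corresponds to the case in which the error persists after all attempted steps, i.e., a hard error.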
Along with the series of error recovery steps, another countermeasure is the data reassign operation. The data reassign operation writes the data stored in the sector-in-error to another position on the disk, i.e., an alternate or "spare" area, which is reserved on the disk in advance with a predetermined capacity. The sector-in-error is no longer used afterward, because similar errors are likely to recur in that sector and eventually cause an unrecoverable hard error. If an error is recognized as an unrecoverable error, the data stored in the sector-in-error is rewritten to the alternate area of the disk.
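The bookkeeping behind a data reassign operation can be illustrated with a small sketch. It assumes a fixed pool of spare sectors and a simple mapping from each sector-in-error to its alternate sector; the class `ReassignTable`, the exception `SpareAreaFull`, and the `write_sector` callback are hypothetical names introduced only for this example.

```python
class SpareAreaFull(Exception):
    """Raised when no alternate sector is left in the reserved area."""


class ReassignTable:
    def __init__(self, spare_sectors):
        self._free = list(spare_sectors)  # alternate sectors reserved in advance
        self._map = {}                    # sector-in-error -> alternate sector

    def reassign(self, bad_sector, data, write_sector):
        """Write the data to the next free alternate sector and record the mapping."""
        if not self._free:
            raise SpareAreaFull("alternate area exhausted")
        alt = self._free[0]
        write_sector(alt, data)           # rewrite the data in the alternate area
        self._free.pop(0)                 # consume the spare only after the write
        self._map[bad_sector] = alt       # the sector-in-error is no longer used
        return alt

    def resolve(self, sector):
        """Return the sector that actually holds the data for `sector`."""
        return self._map.get(sector, sector)

    def spare_remaining(self):
        return len(self._free)
```

Because the pool is finite, every call to `reassign` permanently consumes one alternate sector, so the number of reassignments a drive can perform is bounded by the reserved capacity.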
In a conventional disk drive system, a data reassign operation is executed under the following conditions, for example:
a) when a hard error is identified while writing data;
b) when an error occurs while reading data, and the error recovery is followed by a particular predetermined error recovery step; and
c) when a hard error is identified while reading data, data is then written to the same sector where the error occurred, and a similar error reappears as a result of verifying the written data.
In a conventional disk drive system, if any of the above conditions are met, the data reassign operation is performed automatically to move and write the data stored in the sector-in-error to an alternate area of the disk.
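Conditions a) through c) can be expressed as a single predicate, sketched below. The `ErrorEvent` fields and the `trigger_steps` parameter are hypothetical. Condition b) is interpreted here, as an assumption, to mean that the recovery of a read error involved one of a designated set of ERP steps.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ErrorEvent:
    operation: str                        # "read" or "write"
    hard_error: bool                      # the ERP ended without recovery
    recovery_step: Optional[str] = None   # ERP step involved in the recovery, if any
    rewrite_verify_failed: bool = False   # rewrite to the same sector, then a
                                          # similar error on verification


def should_reassign(ev: ErrorEvent, trigger_steps: Set[str]) -> bool:
    """Return True if the event meets one of conditions a) to c)."""
    if ev.operation == "write":
        return ev.hard_error                          # condition a)
    if ev.operation == "read":
        if ev.hard_error:
            return ev.rewrite_verify_failed           # condition c)
        return ev.recovery_step in trigger_steps      # condition b), as assumed
    return False
```

For example, with `trigger_steps = {"low_spin_burnish"}`, a read error whose recovery involved that step would trigger a reassignment, while one recovered by an offtrack adjustment would not.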
However, because the capacity of the alternate area is limited, the amount of data which can be reassigned is also limited. If the data reassign operation is performed frequently, the alternate area quickly becomes full, and the data reassign operation can no longer be used to recover from subsequent errors. Conversely, if the conditions for executing the data reassign operation are restricted, errors occur more frequently and more time is spent executing the ERP to recover from them, so system performance degrades. Errors from thermal asperity are unique to MR heads and may occur after the disk drive has been used for a certain period of time. Therefore, there is a need to reserve capacity in the alternate area on the disk to handle future errors.