The present invention relates to magnetic disk drives of the type used for information storage in electronic computer systems, including personal computers and work stations, and especially to an automatic reassignment function that improves the reliability of storage data.
Generally, with the increase in the storage capacity of a data storage device that operates as peripheral equipment in a computer system, the reliability of stored data has become more and more important.
When a computer system reads data from or writes data to a unit on a data storage medium, a read-unable error or a write-unable error can occur for some unknown reason. As a result, data cannot be read out despite the receipt of a read instruction from an upper system, or data is found to have been written abnormally when it is read back after a write operation performed in response to a write instruction from the system.
For these cases, an automatic reassignment scheme has been adopted as one method of improving data reliability in a data storage device. Here, automatic reassignment is defined as a technique in which a copy of a recorded unit of data (for example, sectors on a track on magnetic disk media in a magnetic disk drive) is prepared beforehand in the data storage device, separately from the original recorded unit of data, and the copy is used instead of the original, if necessary.
The probability of data errors in magnetic disk drives of the type used as external memory devices is higher than in conventional drives, since their recording densities are much higher and their total storage capacities have become much larger than ever. Therefore, in magnetic disk drives for personal computers, an automatic reassignment function for sectors having defects (defective sectors) is generally adopted. If a write error occurs, the data is written to an alternative area. If a read error occurs, the number of retries is monitored and the data is written to the alternative area, which improves the reliability of the data.
It is important for any automatic reassignment technique to determine when to start execution of a read reassignment process with a suitable timing and how to secure the data integrity. Many different techniques have been proposed for these purposes. For example, as disclosed in a Japanese unexamined patent publication Hei 6-75717, entitled “Read error recovery system for a hard disk drive”, a subtle or fine displacement between a magnetic head and a magnetic disk media is avoided by executing re-reading and writing of data to the same destination area, when a read error occurs, if the data is read normally before a number of the re-reading operations reaches a predetermined number. If the number of re-reading operations is equal to the predetermined number or more, an attempt is made to improve the reliability of the read data by executing the reassignment process to another area (to store the data to an alternative area). Hei 6-75717 discloses a system in which a re-read is executed when a read error occurs; and, if the data is read correctly before the number of re-reads reaches a predetermined number, the system writes the data to the same area.
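The decision rule of the conventional scheme described above can be sketched as follows. This is only an illustrative sketch: the function name `prior_art_action`, its parameters, and the returned labels are hypothetical and do not appear in Hei 6-75717.

```python
def prior_art_action(rereads_used: int, max_rereads: int) -> str:
    """Illustrative decision rule of the conventional scheme.

    rereads_used: number of re-reads performed after the initial read error
                  (0 means the first read succeeded, so nothing is done).
    max_rereads:  the predetermined number of re-reads (hypothetical name).
    """
    if rereads_used == 0:
        # No read error occurred, so no recovery action is needed.
        return "none"
    if rereads_used < max_rereads:
        # Data was read normally before the predetermined number was reached:
        # the same data is written back to the same destination area.
        return "rewrite_same_area"
    # The predetermined number was reached or exceeded:
    # the data is reassigned to an alternative area.
    return "reassign_to_alternative_area"
```

For example, with a predetermined number of 5, a read that succeeds on the third re-read leads to a rewrite of the same area, while a read that needs five or more re-reads leads to reassignment.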
If a subtle or fine displacement between a magnetic head and a magnetic disk media is the cause of an error, an execution of repeated data write operations may be effective. But if an error is caused by the existence of an infinitesimal bad spot (an infinitesimal defective region) on a sector, that is, by media defects, there is a tendency to repeat the retry of a read operation even though the write operation has been performed normally.
The present invention applies an automatic reassignment technique, especially at the time of a read operation, to a magnetic disk drive and performs the following controls.
The present invention proposes to detect, in a unit that handles data, a symptom of a failure that results in the occurrence of a read error on the unit, and to store the data of the unit temporarily (back up process) in a temporary storing area (back up area). If the symptom is considered to represent a definite failure, the data of the unit is shifted to a safe storing area (alternation area). As a result, a fatal read error and a temporary read error are treated differently.
In other words, a priority level of candidates among data to be shifted to the alternation area is determined dynamically by referring to the number of retry operations related to the read errors. The unit of data is stored in the temporary storing area, which is a back up area, based on the state of the accumulated number of retries after the temporary storing, time stamps of data and so on.
Unlike the conventional technology, the present invention does not directly rewrite the data content of an area, unit, or sector that handles data when a normal read is achieved before the number of retries reaches a predetermined number. In accordance with the present invention, under such circumstances, a data write command is not executed to the area, the unit, or the sector that handles the data. Rather, the content of the data is held in a back up area prepared in advance and is controlled using a back up data table. For any unit that is judged to be definitely defective among the units of data held in the back up area, the content of that unit is transferred to the alternation area, so that the reliability of stored data is improved even if a read error has occurred.
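The distinction between a temporary read error, a symptom of failure, and a definite failure can be illustrated with a small sketch. It assumes two retry thresholds with the back up threshold lower than the defect threshold; the names `Action`, `classify_read`, `backup_threshold`, and `defect_threshold` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"          # temporary error: leave the sector untouched
    BACK_UP = "back_up"    # symptom of failure: hold a copy in the back up area
    REASSIGN = "reassign"  # definite failure: move the data to the alternation area


def classify_read(retries: int, backup_threshold: int, defect_threshold: int) -> Action:
    """Classify a read by the number of retries it needed.

    Assumption (not stated in the source): backup_threshold < defect_threshold.
    """
    if retries > defect_threshold:
        return Action.REASSIGN
    if retries > backup_threshold:
        return Action.BACK_UP
    # Note that no write to the original sector is issued in any branch.
    return Action.NONE
```

Under these assumed thresholds, a read that recovers after a moderate number of retries is only backed up, and the original sector is never rewritten as a side effect of read recovery.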
In the magnetic disk device that is provided with an automatic read reassignment function:
1) A back up area that holds the user data temporarily, an alternation area that serves as a substitute for the failed area or sectors, and a control area that stores control data are formed on a magnetic disk media in advance, in addition to the user data area that stores the data of the user (or the data of an electronic computer system) that uses the storage function of the magnetic disk device.
2) When a write command is accepted from a host (host processing unit or electronic computer system) for a sector where an error occurred during prior read processing and the read was recovered by retry processing, the data in the backed up sector (the sector that underwent read recovery processing) is revised. At this time, the data is written both to the original sector and to the back up area (more specifically, to the sector in the back up area in which the data is held), that is, the same data is written in duplicate.
3) For an error that occurs during read processing to a sector, a function to judge whether the sector has the symptom of a failure or not is provided and a back up process using a back up area is performed. That is, the back up process has a threshold value in the retry process in the read operation, and if the read operation is performed normally after a number of retries exceeding the threshold, the data of the sector is held in the back up area.
4) When the original sector cannot be read, the data in the back up sector is used.
5) The system has a function for judging that a sector is defective when an error has occurred in the process of a read operation for the sector and, in such case, for executing a reassignment process of the data to an alternation area. That is, in the retry processing of a read operation, a threshold value for the number of retries is provided, and when the number of the retries exceeds the threshold value, the sector is judged to be defective.
6) An alternation processing is controlled based on a reassignment data table including a reassigned address recording part that registers a defective sector address, and a reassigning address recording part that registers the address of the alternative storing sector which is used for the alternation of the defective sector.
7) The back up process is controlled by a back up data table having a backed up address recording part that registers addresses of sectors with the symptom of defects, a back up address recording part that registers the temporary addresses of the sectors in which the data of the sectors with the symptom is temporarily stored, an accumulated number of retries recording part that records an accumulated number of retries, and a priority order recording part that controls a priority order of recording. Each registration in the back up data table is assigned a priority; for example, the larger the accumulated number of retries, the higher the priority. A newly found sector that has the symptom of a defect is always registered (the registration of defective sectors and other parameters is arbitrary), and sectors that were accessed most recently and have a larger accumulated number of retries are assigned a higher priority.
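The back up data table of item 7) can be sketched as a small in-memory structure. This is a minimal sketch under stated assumptions, not the actual table layout: the class and field names are hypothetical, the time stamp is modeled as a logical counter, and evicting the lowest-priority entry when the back up area is full is an assumed way of guaranteeing that a newly found sector is always registered.

```python
from dataclasses import dataclass


@dataclass
class BackupEntry:
    sector: int          # backed up address (sector showing a symptom of defect)
    backup_sector: int   # back up address (temporary copy in the back up area)
    retries: int         # accumulated number of retries
    last_access: int     # logical time stamp (assumed representation)


class BackupTable:
    def __init__(self, backup_sectors):
        self.free = list(backup_sectors)  # unused sectors in the back up area
        self.entries = {}                 # sector address -> BackupEntry
        self.clock = 0                    # logical clock for time stamps

    def priority_order(self):
        """Entries from highest to lowest priority: more accumulated
        retries first, then more recently accessed."""
        return sorted(self.entries.values(),
                      key=lambda e: (e.retries, e.last_access),
                      reverse=True)

    def register(self, sector, retries):
        """Register a sector with a symptom, or accumulate retries if known."""
        self.clock += 1
        entry = self.entries.get(sector)
        if entry is not None:
            entry.retries += retries
            entry.last_access = self.clock
            return entry
        if not self.free:
            # Assumed policy: drop the lowest-priority entry so that the
            # newly found sector can always be registered.
            victim = self.priority_order()[-1]
            del self.entries[victim.sector]
            self.free.append(victim.backup_sector)
        entry = BackupEntry(sector, self.free.pop(0), retries, self.clock)
        self.entries[sector] = entry
        return entry
```

In this sketch, the entry at the head of `priority_order()` is the natural first candidate for transfer to the alternation area, since it has accumulated the most retries.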