The present invention is related to the subject matter of the following commonly assigned, copending United States patent and patent applications: U.S. Pat. No. 6,247,152 entitled “RELOCATING UNRELIABLE DISK SECTORS WHEN ENCOUNTERING DISK DRIVE READ ERRORS WITH NOTIFICATION TO USER WHEN DATA IS BAD” and issued Jun. 12, 2001; Ser. No. 09/283,366 entitled “ABILITY TO DISTINGUISH TRUE DISK WRITE ERRORS” and filed Mar. 31, 1999; and Ser. No. 09/282,873 entitled “RELOCATING SECTORS WHEN DISK DRIVE DOES NOT RETURN DISK WRITE ERRORS” and filed Mar. 31, 1999. The content of the above-referenced patent and applications is incorporated herein by reference.
1. Technical Field
The present invention relates in general to data storage on disk storage media and in particular to error handling and recovery for disk storage media. Still more particularly, the present invention relates to relocating unreliable disk sectors when read errors are received while reading data.
2. Description of the Related Art
Accurate and prompt reporting of write errors or faults by device drivers, adapters, and/or disk drives when an attempted write to the hard disk drive is unsuccessful represents the ideal situation for data protection. Under these conditions, the system or user application has an opportunity to preserve the data by writing it elsewhere. However, the error may not be detected when the data is written, the error may not be properly reported if detected, or the data may be corrupted after being written to the disk media. The first two circumstances depend on the presence, reliability, and/or thoroughness of error detection, reporting, and correction mechanisms for the disk drive, adapter, and device driver. The last circumstance results from failure of the disk media for any one of a number of reasons, such as head damage to the disk media, stray magnetic fields, or contaminants finding their way into the disk drive.
Virtually all contemporary disk drives can detect and report a bad data read from the disk media, typically through CRC errors. When CRC errors are returned from reading a sector, often the read may be retried successfully, and most file systems simply continue if the data was successfully recovered from the sector.
A sector for which reads must be retried multiple times is likely to be “failing,” or in the process of becoming unrecoverable. Once a sector becomes unrecoverable, disk drives will normally perform relocation of the bad sector to a reserved replacement sector on the drive. However, sectors are generally relocated only after they have become unrecoverable, and typically a sector which may be successfully read is deemed good regardless of the number of attempts required to read the data. This may result in loss of data since the sector was not relocated prior to the sector becoming unrecoverable; that is, prior to the data becoming unreadable and therefore “lost.”
It would be desirable, therefore, to provide a mechanism for detecting and relocating failing or unreliable disk sectors prior to complete loss of data within the sector.
It is therefore one object of the present invention to provide improved data storage on disk storage media.
It is another object of the present invention to provide improved error handling and recovery for disk storage media.
It is yet another object of the present invention to provide a mechanism for relocating unreliable disk sectors when read errors are received while reading data.
The foregoing objects are achieved as is now described. Where a number n of read attempts are required to successfully read a data sector, with the first n-1 attempts returning a disk drive read error, the number of attempts required is compared to a predefined threshold selected to indicate that the sector is unreliable and is in danger of imminently becoming completely unrecoverable. If the number of attempts is less than the threshold, the sector is presumed to still be good and no further action need be taken. If the number of attempts equals or exceeds the threshold, however, the unreliable or failing sector is relocated to a reserved replacement sector, with the recovered data written to the replacement sector. The failing data sector is remapped to the replacement sector, which becomes a fully functional substitute for the failing sector for future reads and writes while preserving the original user data. Data within a failing sector is thus preserved before the sector becomes completely unrecoverable.
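By way of illustration only, the threshold comparison and relocation summarized above may be sketched as follows. This is a minimal sketch under stated assumptions and not the disclosed implementation: the `SimulatedDisk` class, the `ReadError` exception, the `read_with_relocation` function, the threshold value of 3, and the retry limit of 10 are all hypothetical names and values chosen for the example, and an in-memory dictionary stands in for the disk media.

```python
class ReadError(Exception):
    """Hypothetical stand-in for a disk drive read error (e.g. a CRC error)."""


class SimulatedDisk:
    """Hypothetical in-memory model of a drive with reserved replacement sectors."""

    def __init__(self, data, failures, spare_sectors):
        self.data = dict(data)            # physical sector -> stored bytes
        self.failures = dict(failures)    # physical sector -> reads that fail before one succeeds
        self.spares = list(spare_sectors) # reserved replacement sectors
        self.remap = {}                   # original sector -> replacement sector

    def read_once(self, sector):
        """Attempt a single read, following any remapping already in place."""
        physical = self.remap.get(sector, sector)
        if self.failures.get(physical, 0) > 0:
            self.failures[physical] -= 1
            raise ReadError(f"read error on sector {physical}")
        return self.data[physical]

    def relocate(self, sector, payload):
        """Write recovered data to a replacement sector and remap the original."""
        if not self.spares:
            raise RuntimeError("no replacement sectors available")
        spare = self.spares.pop(0)
        self.data[spare] = payload
        self.remap[sector] = spare
        return spare


def read_with_relocation(disk, sector, threshold=3, max_attempts=10):
    """Retry a read; relocate the sector if the attempts needed reach the threshold.

    Returns (payload, attempts, relocated). Both default values are assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            payload = disk.read_once(sector)
        except ReadError:
            continue  # the first n-1 attempts return read errors
        # Successful read on attempt n: compare n against the threshold.
        relocated = attempt >= threshold
        if relocated:
            disk.relocate(sector, payload)
        return payload, attempt, relocated
    raise ReadError(f"sector {sector} unrecoverable after {max_attempts} attempts")
```

In this sketch, a sector that is read successfully on the first or second attempt is left in place, while a sector requiring three or more attempts has its recovered data written to a replacement sector, and later reads of the original sector number are served from the replacement.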
The above as well as additional objects, features, and advantages of the present invention will become apparent in the following detailed written description.