In general, this invention relates to hard disk drive technology. More particularly, it relates to a hard disk drive that employs an off-line scan to collect and store selection-control data for subsequently deciding whether to verify after write.
Customer requirements for hard disk drives include challenging specifications for performance (access time, data throughput, etc.) and capacity (contemporary capacity specifications are well in excess of 1 gigabyte per drive). These customer requirements also include exacting standards for data-reproduction reliability. Ideally, output data from a hard disk drive would always accurately and completely reproduce input data previously transferred for storage by the drive. As a practical matter, defects in either the construction or the operation of a drive inevitably occur, and such defects pose problems in meeting customer specifications, one purpose of which is to ensure that no such defect causes data loss.
Such defects include media defects, which may be localized and may be a repeatable source of error. Defects in operation can involve high-fly writes, thermal asperities, random noise, etc.
Disk drive technology presently provides for error detection and multiple levels of error-correction capability, making it possible for a drive to return the same data to the host even though, internally within drive operation, some type of defect caused corrupted data to be read. A robust process for error detection entails an encoding process and a decoding process. In the encoding process, data received from a host are combined with redundancy data to form codewords that are written to disk. In the decoding process, syndromes are generated, and the syndromes indicate whether valid data (i.e., codewords) or corrupted data have been read. One level of error-correction capability is referred to as "on-the-fly" error correction. There are limits to on-the-fly error-correction capability. For example, some contemporary drives provide a "T=6" limit, which means that up to six invalid bytes within a codeword can be corrected on the fly. If corrupted data have been read within the capability of the on-the-fly level, an error locator and error mask are quickly generated and the corrupted data are promptly corrected. The on-the-fly level of error correction does not require a retry and accordingly does not suffer the penalty of substantial rotational latency. Other levels of the multiple levels are generally microprocessor controlled and considerably more time consuming. Under microprocessor control, a sector that failed to provide valid data on the fly can be re-read on a subsequent revolution of the disk. Sometimes a sector that failed to provide valid data on the fly on one try will, on a retry, provide valid data on the fly. If a sector has repeatedly failed to provide valid data on the fly, certain "heroic" measures can be taken, such as repositioning the head stack assembly in advance of a retry.
Furthermore, assumptions can be made as to the location of errors, and microprocessor-controlled error-correction processes can be carried out.
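The encode/decode flow described above can be illustrated in miniature. The sketch below is a toy illustration only: real drives employ Reed-Solomon codes capable of locating and correcting multiple bytes per codeword (e.g., the "T=6" limit mentioned above), whereas here a single XOR-parity byte stands in for the redundancy data and can merely detect, not correct, corruption. All names are illustrative.

```python
# Toy illustration of codeword encoding and syndrome-based detection.
# A real drive uses Reed-Solomon redundancy; a single XOR-parity byte
# stands in for it here (assumption for illustration only).

def encode(data: bytes) -> bytes:
    """Form a 'codeword': host data combined with redundancy data."""
    parity = 0
    for b in data:
        parity ^= b
    return data + bytes([parity])

def syndrome(codeword: bytes) -> int:
    """Zero syndrome: a valid codeword was read. Nonzero: corrupted."""
    s = 0
    for b in codeword:
        s ^= b
    return s

cw = encode(b"user data sector")
assert syndrome(cw) == 0                       # clean read: valid codeword

corrupted = bytes([cw[0] ^ 0xFF]) + cw[1:]     # defect corrupts one byte
assert syndrome(corrupted) != 0                # flagged for correction/retry
```

In an actual drive, a nonzero syndrome within the on-the-fly limit triggers immediate generation of an error locator and error mask; beyond that limit, the microprocessor-controlled retry levels take over.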
Additional background information relevant here is set forth in a U.S. Patent Application referred to herein as the "Barr Application"; that is, Ser. No. 08/644,507, filed May 10, 1996, now U.S. Pat. No. 5,909,334, entitled "VERIFYING WRITE OPERATIONS IN A MAGNETIC DISK DRIVE," which names L. Barr and A. Sareen as joint inventors and which is assigned to the assignee of this application. The disclosure of the Barr Application is hereby incorporated herein. Briefly, the Barr Application discloses a method of operating a drive that is connected to a host computer and that includes a rotating magnetic disk and a buffer memory. The method of the Barr Application comprises the steps of receiving a block of data from the host computer and storing the block of data in the buffer memory as it is being received. A series of steps is performed while the block of data remains in the buffer memory. This series of steps includes a writing step by which a copy of the block of data and redundancy data is written to the disk. This series of steps also includes an autonomous read-verify step involving autonomously reading the block of data and the redundancy data from the disk on a succeeding rotation. While autonomously reading the block of data and the redundancy data, the read-verify step suppresses transfer to the buffer memory of the data so read; and the read-verify step tests whether the data so read are within correctable limits.
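The verify-after-write sequence disclosed in the Barr Application can be sketched as a simplified simulation. The `Disk` class, the checksum-based redundancy, and the `within_correctable_limits` test below are hypothetical stand-ins chosen for illustration, not the patented implementation; the essential points modeled are that the block remains in buffer memory during verification and that the autonomous read-back suppresses transfer of the read data to the buffer.

```python
# Hypothetical sketch of the Barr Application's verify-after-write flow.
# Disk, checksum redundancy, and the correctability test are illustrative
# assumptions, not the actual drive firmware.

def checksum(data: bytes) -> int:
    """Stand-in for the drive's redundancy data (one byte)."""
    return sum(data) & 0xFF

class Disk:
    def __init__(self):
        self.sectors = {}          # lba -> stored codeword

    def write(self, lba, data):
        # Writing step: copy of the block plus redundancy data to disk.
        self.sectors[lba] = data + bytes([checksum(data)])

    def read_raw(self, lba):
        # Autonomous read on a succeeding rotation.
        return self.sectors[lba]

def within_correctable_limits(codeword: bytes) -> bool:
    data, redundancy = codeword[:-1], codeword[-1]
    return checksum(data) == redundancy

def write_with_verify(disk, buffer_mem, lba, block):
    buffer_mem[lba] = block        # block remains in buffer memory
    disk.write(lba, block)         # writing step
    readback = disk.read_raw(lba)  # read-verify; transfer to buffer
                                   # of the data so read is suppressed
    return within_correctable_limits(readback)
```

A failing verification would leave the still-buffered block available for a rewrite or relocation attempt, which is what makes retaining the block in buffer memory during the series of steps essential.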
As for high-speed operation, an issue arises with respect to responding to a queue of requests. The time required to complete an autonomous read-verify-after-write operation can delay the response to a subsequent request.
Another approach directed to reducing the risk of loss of data involves various ways to predict a forthcoming drive failure and providing a notice of the impending failure. An industry-sponsored committee has developed a relevant standard. This committee is referred to as the "Small Form Factor Committee." This standard is identified as the "Specification for S.M.A.R.T. SFF-8035 ver. 2.0." (S.M.A.R.T. is an acronym for Self-Monitoring, Analysis, And Reporting Technology.)
As for drives that provide for predicting an impending failure, such prior art drives are subject to performance slowdowns with respect to reading data from sectors that fail to provide valid data on the fly.
The execution of firmware-controlled heroic-recovery techniques can consume seconds, substantially more than the milliseconds normally involved in completing a read command. It is highly undesirable for an operating system or application program to wait a few seconds each time heroic-recovery techniques are invoked.
As for drives that employ autonomous verification operations as part of the overall write process, such prior art drives involve performance slowdowns with respect to writing data to the sectors.
In summary of the foregoing, despite the efforts in the prior art, there remains a need for a comprehensive solution that meets the dual goals of high reliability and fast performance; that is, enhancing the reliability of accurate readout of data without causing a slowdown affecting an operating system or application program.