Computers use storage devices such as disk drives for permanently recording data. The computers are typically called “hosts” and the storage devices are called “drives.” A host can be connected to multiple drives, but a drive can also be connected to multiple hosts. Commands and data are transmitted by the host to the drive to initiate operations. The drive responds with formatted status, error codes and data as appropriate. Various standard command architectures have been adopted including, for example, Integrated Drive Electronics (IDE), Small Computer System Interface (SCSI) and Serial ATA (SATA).
The host computer can range in size from a small handheld device to a supercomputer cluster. The host can also be a special purpose device such as a digital camera. Similar data storage devices are used in a variety of applications including personal computers with less stringent demands, as well as large systems used by banks, insurance companies and government agencies with critical storage requirements.
A queue of commands for the storage system may be kept in the device's memory. A storage system can use the command queue to optimize the net execution time of commands by changing the order in which they are executed. Among other criteria, prior art algorithms use seek time and rotational latency to optimize execution time. U.S. patent application 2006/0106980 by Kobayashi, et al. (published May 18, 2006) describes a hard disk drive that includes a queue capable of storing a plurality of commands, and a queue manager for optimizing the execution order of the plurality of commands on the basis of whether or not the execution of each command requires access to the storage medium.
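As a rough illustration of reordering by seek time and rotational latency, the following Python sketch greedily selects the queued command with the lowest estimated access cost. The `Command` type, the `SEEK_PER_TRACK` constant and the cost model are assumptions made for this example and are not taken from any of the cited references.

```python
from dataclasses import dataclass

SEEK_PER_TRACK = 0.001  # assumed: revolutions that elapse per track crossed


@dataclass(frozen=True)
class Command:
    name: str
    track: int    # radial position of the target sector
    angle: float  # angular position of the target, as a fraction of a revolution


def access_cost(head_track, head_angle, cmd):
    """Estimated cost in revolutions: seek first, then wait for the sector."""
    seek = abs(cmd.track - head_track) * SEEK_PER_TRACK
    arrival_angle = (head_angle + seek) % 1.0
    wait = (cmd.angle - arrival_angle) % 1.0
    return seek + wait


def reorder(queue, head_track=0, head_angle=0.0):
    """Greedy ordering: always execute the cheapest pending command next."""
    pending = list(queue)
    order = []
    while pending:
        best = min(pending, key=lambda c: access_cost(head_track, head_angle, c))
        pending.remove(best)
        order.append(best)
        # Transfer time is ignored: the head ends up at the target position.
        head_track, head_angle = best.track, best.angle
    return order


queue = [Command("A", 500, 0.9), Command("B", 10, 0.1), Command("C", 20, 0.2)]
names = [c.name for c in reorder(queue)]  # ["B", "C", "A"]
```

A greedy choice is only one possible policy; it is used here because it keeps the sketch short.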
A disk drive typically includes a high speed read-cache memory where selected sectors of data can be stored for fast access. A read-cache contains copies of a subset of data stored on the disk. The cache typically contains recently read data but may also contain pre-fetched sectors that occur immediately after the last one requested. A read command can be satisfied by retrieving the data from the cache when the needed data happens to be in the cache. Operations performed using only the drive's read-cache are much faster than those requiring that the arm be moved to a certain radial position above the rotating disk and having to wait for the disk to rotate into proper position for a sector to be read.
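The read-cache behavior described above can be sketched in Python as follows. The `ReadCache` class, its `prefetch` parameter and the list standing in for the disk are hypothetical; the sketch also omits eviction, which a real cache of limited size would need.

```python
class ReadCache:
    """Toy read-cache that keeps recently read and pre-fetched sectors."""

    def __init__(self, disk, prefetch=2):
        self.disk = disk          # assumed backing store: list indexed by sector
        self.prefetch = prefetch  # sectors fetched beyond the one requested
        self.cache = {}
        self.media_reads = 0      # counts slow accesses to the rotating media

    def read(self, sector):
        if sector in self.cache:
            return self.cache[sector]  # hit: no seek or rotational wait
        # Miss: read the requested sector plus the ones immediately after it.
        last = min(sector + self.prefetch, len(self.disk) - 1)
        for s in range(sector, last + 1):
            self.cache[s] = self.disk[s]
        self.media_reads += 1
        return self.cache[sector]


disk = [f"d{i}" for i in range(8)]
cache = ReadCache(disk, prefetch=2)
first = cache.read(3)   # miss: sectors 3, 4 and 5 are loaded
second = cache.read(4)  # hit: served from the pre-fetched data
```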
A write-cache can also be used for data that is in the process of being written to the disk. There is a critical window of time in a write operation between placing the data in the cache and actually writing the data to the disk when a power failure, for example, can cause the data to be lost. However, having the host wait until the relatively slow write process has completed can be an unnecessary inefficiency in many cases. The waiting time is justified for some data but not for all data. A so-called fast write operation simply places the data in the write-cache, signals the host that the operation is complete and then writes the data to disk at a subsequent time, which can be chosen using optimization algorithms that take into account all of the pending write commands.
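The fast write operation and its window of vulnerability can be illustrated with a minimal Python sketch. The `WriteCache` class and its method names are invented for the example, and sorting by sector number merely stands in for whatever optimization algorithm a drive would actually use.

```python
class WriteCache:
    """Toy model of fast writes: completion is signaled before the commit."""

    def __init__(self):
        self.media = {}    # data committed to the disk
        self.pending = {}  # data held only in the volatile write-cache

    def fast_write(self, sector, data):
        self.pending[sector] = data
        return "complete"  # the host is told the write is done at this point

    def destage(self):
        # Later, write all pending data in an order chosen by the drive.
        for sector in sorted(self.pending):
            self.media[sector] = self.pending[sector]
        self.pending.clear()

    def power_failure(self):
        # Anything still pending is lost; return the affected sectors.
        lost = sorted(self.pending)
        self.pending.clear()
        return lost


drive = WriteCache()
drive.fast_write(9, "x")
drive.destage()               # sector 9 is now on the media
drive.fast_write(4, "y")
lost = drive.power_failure()  # sector 4 was acknowledged but never committed
```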
Prior art command architectures have provided ways for a host to send a particular command or parameter to the drive to ensure that the data is written to the disk media before the drive signals that the write operation is complete. Writing data on the media is also called committing the data or writing the data to permanent storage.
One type of prior art command (cache-flush) directs the drive to immediately write all of the pending data in the cache to the media, i.e., to flush the cache. Flushing the entire cache on the drive may take a significant amount of time and, if done too often, reduces the benefit of the cache. Also known in the prior art is a write command with a forced unit access (FUA) flag or bit set. A write with the FUA flag set causes the drive to completely commit the write to non-volatile storage before indicating to the host that the write is complete.
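To contrast the two mechanisms, the Python sketch below counts how many sectors each one forces onto the media. The `Drive` class and its `write` and `flush_cache` methods are hypothetical and do not correspond to the command set of any particular interface standard.

```python
class Drive:
    """Toy drive that supports a cache flush and per-write forced unit access."""

    def __init__(self):
        self.media = {}
        self.cache = {}
        self.sectors_committed = 0  # media writes performed so far

    def write(self, sector, data, fua=False):
        if fua:
            # Commit this one write before reporting completion.
            self.media[sector] = data
            self.cache.pop(sector, None)  # drop any stale cached copy
            self.sectors_committed += 1
        else:
            self.cache[sector] = data     # fast write, committed later
        return "complete"

    def flush_cache(self):
        # Commit every pending sector, not just the one the host cares about.
        self.sectors_committed += len(self.cache)
        self.media.update(self.cache)
        self.cache.clear()


a = Drive()
for s in range(100):
    a.write(s, "bulk")
a.write(500, "critical", fua=True)  # only sector 500 reaches the media

b = Drive()
for s in range(100):
    b.write(s, "bulk")
b.write(500, "critical")
b.flush_cache()                     # all 101 pending sectors are written
```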
Storage systems running in an adverse environment (e.g., extreme temperature or high vibration) need to verify each write in order to maintain or increase their reliability. Unfortunately, verifying every write can reduce the write throughput, because the device must wait until the disk completes a rotation before the sector can be read back. This one-revolution delay substantially reduces the performance of the device. If the write fails, yet another rotational delay is needed to rewrite the data sector. Methods for reducing the impact of write verification are needed.
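The size of this penalty follows from the rotation speed. The Python sketch below works through the arithmetic under stated assumptions: the spindle speeds are example values, and each failed verification is assumed to cost one revolution for the rewrite plus one more to verify again. At an assumed 7200 rpm, one revolution takes about 8.33 ms.

```python
def revolution_ms(rpm):
    """Time for one full disk rotation, in milliseconds."""
    return 60_000 / rpm


def verify_delay_ms(rpm, failed_verifies=0):
    """Added rotational delay for verifying a write immediately after it.

    Assumed model: one revolution to read the sector back, and for each
    failed verification one revolution to rewrite and one to re-verify.
    """
    revolutions = 1 + 2 * failed_verifies
    return revolutions * revolution_ms(rpm)


clean = verify_delay_ms(6000)     # 10.0 ms at an assumed 6000 rpm
retried = verify_delay_ms(6000, failed_verifies=1)  # 30.0 ms
```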
In U.S. Pat. No. 6,854,022 Gregory B. Thelin describes a disk drive using a rotational position optimization algorithm to facilitate write verify operations. The write data can be maintained in the cache until the write-verify operation is completed. If the write-verify operation fails then the data in the cache can be rewritten to the disk. Thelin teaches execution of a write verify command according to a rotational position optimization algorithm, rather than immediately after the write command, to better optimize drive performance relative to mechanical latencies. Thelin's disk drive includes an input/output (I/O) queue for storing read and write commands received from a host computer, and a disk controller for executing the commands stored in the I/O queue in an order determined from a rotational positioning optimization (RPO) algorithm. The disk controller selects a write command from the I/O queue according to the RPO algorithm, seeks the head to a target track, and writes data to a target data sector. After executing the write command, the disk controller inserts a write verify command into the I/O queue. The disk controller then selects the write verify command from the I/O queue according to the RPO algorithm and executes the write verify command to verify the recoverability of the data written to the target data sector.
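The queueing behavior summarized above can be approximated by the following Python sketch. It is a simplified illustration, not Thelin's implementation: the `Cmd` type, the two timing constants and the greedy `rpo_cost` model are assumptions. Because the head has just passed the written sector, an immediate verify costs almost a full revolution, so other queued commands tend to be selected first.

```python
from dataclasses import dataclass

SEEK_PER_TRACK = 0.001  # assumed seek time per track, in revolutions
SECTOR_TIME = 0.01      # assumed time to transfer one sector, in revolutions


@dataclass
class Cmd:
    kind: str     # "read", "write" or "verify"
    track: int
    angle: float  # sector start, as a fraction of a revolution


def rpo_cost(track, angle, cmd):
    """Seek time plus rotational wait, in revolutions."""
    seek = abs(cmd.track - track) * SEEK_PER_TRACK
    return seek + (cmd.angle - (angle + seek)) % 1.0


def run(queue, verify_ok=lambda cmd: True):
    queue, track, angle = list(queue), 0, 0.0
    retained = set()  # write data kept in the cache until it is verified
    log = []
    while queue:
        cmd = min(queue, key=lambda c: rpo_cost(track, angle, c))
        queue.remove(cmd)
        track, angle = cmd.track, (cmd.angle + SECTOR_TIME) % 1.0
        log.append(cmd.kind)
        key = (cmd.track, cmd.angle)
        if cmd.kind == "write":
            retained.add(key)
            queue.append(Cmd("verify", cmd.track, cmd.angle))
        elif cmd.kind == "verify":
            if verify_ok(cmd):
                retained.discard(key)  # data is recoverable, release the cache
            else:
                queue.append(Cmd("write", cmd.track, cmd.angle))  # rewrite
    return log


log = run([Cmd("write", 10, 0.10), Cmd("read", 12, 0.30)])
# The read is serviced between the write and its verification.
```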
In U.S. Pat. No. 7,120,737 Thelin describes a disk drive employing a disk command data structure for tracking a write verify status of a write command. A microprocessor executes a write command associated with a disk command data structure by inserting the disk command data structure into a “dirty queue”, and then executing the write command using the disk command data structure by writing data blocks to a plurality of target data sectors. The disk command data structure is then inserted into a write verify queue, and the disk command data structure is used to perform a write verify operation. The disk command data structure is inserted back into the dirty queue if at least one of the target data sectors fails the write verify operation.
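The movement of a command between the two queues can be sketched in Python as shown below. The `DiskCommand` fields, the `read_back_ok` callback and the policy of draining the dirty queue before verifying are all assumptions chosen to keep the example small.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DiskCommand:
    """Hypothetical stand-in for the disk command data structure."""
    name: str
    sectors: tuple
    attempts: int = 0


def process(commands, read_back_ok):
    dirty = deque(commands)  # commands whose data still has to be written
    write_verify = deque()   # commands written but not yet verified
    done = []
    while dirty or write_verify:
        if dirty:
            cmd = dirty.popleft()
            cmd.attempts += 1         # write data blocks to the target sectors
            write_verify.append(cmd)
        else:
            cmd = write_verify.popleft()
            if all(read_back_ok(cmd, s) for s in cmd.sectors):
                done.append(cmd.name)
            else:
                dirty.append(cmd)     # at least one sector failed: write again
    return done


a = DiskCommand("A", (1, 2))
b = DiskCommand("B", (5, 6))
# Assumed fault: sector 5 fails verification on the first attempt only.
done = process([a, b], lambda cmd, s: not (s == 5 and cmd.attempts == 1))
```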
U.S. Pat. No. 5,872,800 to Glover, et al. describes a write verify method for correcting unrecoverable sectors in a disk storage system using track level redundancy. Each track comprises a redundancy sector for reconstructing an unrecoverable data sector. The latency of the storage system is said to be minimized by generating track level redundancy data over the write range of data sectors and storing the “write” redundancy to the redundancy sector. During idle time of the storage system, the track level redundancy is regenerated for the entire track. If an unrecoverable data sector is encountered during the idle time redundancy regeneration, and the unrecoverable data sector is within the write range of the previous write operation, then it is reconstructed using the track level redundancy data stored in the redundancy sector.
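The idea of rebuilding one unrecoverable sector from a redundancy sector can be illustrated with simple XOR parity, as in the Python sketch below. XOR parity is an assumption made for the example; the summary above does not state which redundancy code Glover uses, and the function names are invented.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def redundancy(sectors):
    """Redundancy sector computed over the given data sectors."""
    return reduce(xor_bytes, sectors)


def reconstruct(sectors, parity, lost_index):
    """Rebuild the sector at lost_index from the survivors and the parity."""
    survivors = [s for i, s in enumerate(sectors) if i != lost_index]
    return reduce(xor_bytes, survivors, parity)


track = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = redundancy(track)
rebuilt = reconstruct(track, parity, lost_index=1)  # equals track[1]
```

A single parity sector of this kind can recover only one lost sector per protected range.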
U.S. Pat. No. 6,289,484 to Rothberg, et al. describes a disk drive employing an off-line scan to collect selection-control data for subsequently deciding whether to verify after write. A disk drive that includes a firmware-controlled state machine with an off-line in-progress state is used to implement a scan of the multiplicity of sectors. While performing the firmware-controlled scan, steps are performed to maintain a list of sector identifiers such that each sector identifier in the list points to a sector that has failed, preferably repeatedly, to provide valid data on the fly. While the state machine is not in the off-line in-progress state, the drive responds to a request to write data at a specified sector by determining whether the specified sector matches a sector identifier in the list, and if so, autonomously performing a read-verify-after-write operation.
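The selective verification described above can be sketched in Python as follows. The function names, the `(sector, ok)` observation format and the `min_failures` threshold of two are assumptions; the reference only states that listed sectors have failed, preferably repeatedly.

```python
from collections import Counter


def build_suspect_list(observations, min_failures=2):
    """Collect sectors that failed to read cleanly during the off-line scan.

    observations: iterable of (sector, ok) pairs recorded by the scan.
    """
    failures = Counter(sector for sector, ok in observations if not ok)
    return {sector for sector, n in failures.items() if n >= min_failures}


def handle_write(sector, suspect_list):
    """Operations the drive performs for a host write to the given sector."""
    if sector in suspect_list:
        return ["write", "read-verify"]  # verify only the listed sectors
    return ["write"]


scan = [(7, False), (7, False), (9, False), (3, True)]
suspects = build_suspect_list(scan)  # only sector 7 failed repeatedly
```

Verifying only the listed sectors confines the extra revolution to writes that are more likely to need it.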