1. Technical Field
This application relates to computer storage devices, and more particularly to the field of pre-fetching certain data from a disk of a storage device to cache memory thereof.
2. Description of Related Art
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units (host adapters), disk drives, and disk interface units (disk adapters). Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels of the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical volumes. The logical volumes may or may not correspond to the actual disk drives.
A user who requires data from a disk initiates a complex and time-consuming sequence of events for retrieving the data. For example, in the course of retrieving data, a disk controller positions a disk arm to align a read head with the cylinder on the disk that contains that data. The disk controller then waits until the desired data begins passing under the read head. Then, when rotation of the disk finally brings the desired data under the read head, the disk controller initiates the read operation. These events introduce considerable latency into the process of satisfying the user's request for data.
A global cache memory, which is a relatively fast memory that is separate from the disks, may be used to address some of the latency issues associated with disks. The global cache memory may contain recently fetched (requested) data. Upon receiving a request for data, the disk storage system first checks to see if the requested data is already in the global cache memory. If so, the disk storage system retrieves the data directly therefrom without having to access the disk. Such an event is referred to as a “read-hit.” A read-hit is a desirable outcome because a read from the cache slot avoids latencies associated with reading from a physical disk drive.
In some cases, the disk storage system discovers that the desired data is not in the global cache memory at all but, instead, is on a disk. In this case, the disk storage system instructs a disk controller to retrieve the desired data from an appropriate track on a disk. Such an event is referred to as a “read-miss.” A read-miss is an undesirable outcome because such an operation is afflicted with latencies associated with mechanical motion within the disk drive and possible latencies associated with data transmission between the global cache memory and the disk drive.
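The distinction between a read-hit and a read-miss may be illustrated with a minimal sketch. The class name, the dictionary standing in for a disk, and the (track, block) addressing are all hypothetical and chosen only for illustration; they do not describe any particular storage device.

```python
class GlobalCache:
    """Toy global cache keyed by (track, block). All names are hypothetical."""

    def __init__(self, disk):
        self.disk = disk      # dict standing in for the physical disk drive
        self.slots = {}       # data currently held in cache
        self.hits = 0
        self.misses = 0

    def read(self, track, block):
        key = (track, block)
        if key in self.slots:
            self.hits += 1    # read-hit: served from cache, no disk access
            return self.slots[key]
        self.misses += 1      # read-miss: data must come from the disk
        data = self.disk[key]
        self.slots[key] = data
        return data


disk = {(0, b): f"t0b{b}" for b in range(4)}
cache = GlobalCache(disk)
first = cache.read(0, 1)      # not yet cached, so this is a read-miss
second = cache.read(0, 1)     # now cached, so this is a read-hit
```

In this sketch only the first request incurs the simulated disk access; the repeated request is satisfied from the cache.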
Data on a disk may be organized in tracks, which represent an amount of contiguous memory that may be read in a single disk drive operation. The global cache memory may be organized into slots, where each slot corresponds to a track. Whether or not to read an entire track, rather than merely the data specifically requested from that track, is a decision that requires the disk storage system to anticipate whether additional data from that track is likely to be needed in the future. U.S. Pat. No. 6,003,114 to Eitan Bachmat, the contents of which are herein incorporated by reference, describes a system where, in response to a read-miss, the requested data is fetched and stored in the global cache memory. If there is a second request for data from the same track while data for the track is still in cache, the remainder of the track is fetched.
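The fetch-on-second-request behavior described above may be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the referenced patent: a track is assumed to hold a fixed number of blocks (`BLOCKS_PER_TRACK`), cache eviction is ignored, and the function and variable names are hypothetical.

```python
BLOCKS_PER_TRACK = 8  # assumed track size, for illustration only


def read(cache, track, block):
    """Return (hit, blocks_fetched_from_disk) for one request.

    cache maps a track number to the set of block numbers held in its slot.
    """
    cached = cache.get(track)
    if cached is None:
        # First request for this track: fetch only the requested data.
        cache[track] = {block}
        return False, [block]
    # A later request for the same track while it is still in cache:
    # fetch whatever part of the track is not yet cached.
    hit = block in cached
    missing = [b for b in range(BLOCKS_PER_TRACK) if b not in cached]
    cached.update(missing)
    return hit, missing


cache = {}
r1 = read(cache, 3, 2)   # first request: miss, one block fetched
r2 = read(cache, 3, 5)   # second request: still a miss, rest of track fetched
r3 = read(cache, 3, 0)   # third request: hit, nothing fetched
```

Note that in this sketch the second request to the track is still a miss; only requests after it benefit from the full track being cached.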
A disadvantage of the foregoing technique is that the second read-miss to the same slot becomes unavoidable. U.S. Pat. No. 6,529,998 to Yechiel Yochai and Robert Mason, which is incorporated by reference herein, addresses this issue by reading more than just the requested data based on metrics that do not depend directly on whether other data from a particular track has been accessed recently. That patent discloses reading to the end of a track in instances where it appears statistically advantageous to do so based on a determination of the number of times a read-miss would have been avoided had a previous read within the track been a read to the end of the track.
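One way to picture a count of this kind is the sketch below. It is a hypothetical illustration, not the metric actually used in the referenced patent: the input is assumed to be a list of read-misses in arrival order, each given as a (track, start block) pair, and a miss is counted as avoidable when it starts at or after the earliest previous read on the same track.

```python
def avoidable_misses(read_misses):
    """Count misses that an earlier read-to-end-of-track would have avoided.

    read_misses: list of (track, start_block) pairs in arrival order.
    The function name and the input format are hypothetical.
    """
    earliest_start = {}   # track -> lowest start block seen so far
    avoided = 0
    for track, start in read_misses:
        prev = earliest_start.get(track)
        if prev is not None and start >= prev:
            # Had the earlier read continued to the end of the track,
            # this block would already have been in cache.
            avoided += 1
        earliest_start[track] = start if prev is None else min(prev, start)
    return avoided


history = [(1, 2), (1, 5), (1, 0), (2, 4), (2, 6)]
count = avoidable_misses(history)
```

For the sample history, the misses at (1, 5) and (2, 6) are counted as avoidable, while the miss at (1, 0) is not, because it starts before any earlier read on that track.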
While the technique disclosed in U.S. Pat. No. 6,529,998 improves performance in many instances, it does not take into account the possibility of reads in a track being out of order. That is, an application may cause a first read to occur at one portion of a track and cause a second read to occur at a different portion of the track. If the start of the first portion is after the start of the second portion, then even if the first read is to the end of the track, the second read will still cause a read-miss. Furthermore, the first read to the end of the track will not be useful. Accordingly, it is desirable to be able to reduce the number of instances where out-of-order reads within a track result in read-misses.
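The effect of read order may be illustrated with a small simulation. This is a sketch under assumed conditions: a single track of `blocks_per_track` blocks, every read-miss reading from its start block to the end of the track, and no eviction. The function name is hypothetical.

```python
def misses_with_read_to_end(starts, blocks_per_track=8):
    """Count read-misses on one track when each miss reads to the track end.

    starts: start blocks of successive reads, in arrival order.
    """
    cached = set()
    misses = 0
    for start in starts:
        if start not in cached:
            misses += 1
            # Read from the requested block through the end of the track.
            cached.update(range(start, blocks_per_track))
    return misses


in_order = misses_with_read_to_end([2, 5])      # second read is a hit
out_of_order = misses_with_read_to_end([5, 2])  # second read still misses
```

With the reads in order, the first read-to-end covers the second request, giving one miss. With the reads reversed, the first read-to-end covers only blocks 5 through 7, so the request at block 2 misses as well.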