This invention relates to disk storage systems, and in particular, to the anticipation of a user's request for data from a disk.
A user who requires data from a disk initiates a complex and time-consuming sequence of events for retrieving that data. For example, in the course of retrieving that data, a disk controller must position a disk arm to align a read head with the cylinder on the disk that contains that data. The disk controller must then wait until the track containing the desired data begins passing under the read head. Then, when rotation of the disk finally brings the desired data under the read head, the disk controller must initiate the read operation. These events introduce considerable latency into the process of satisfying the user's request for data.
The latency associated with positioning the read head at the beginning of a track is analogous to a fixed cost in an economic transaction. Once the disk storage system has incurred the latency associated with placing the head at the beginning of a track, it costs only a little additional time to read the entire track rather than merely the desired data.
Although the marginal cost of reading an entire track is low compared to the fixed cost of positioning the disk arm at the beginning of the track, it is nevertheless preferable to avoid that marginal cost when possible. In particular, when a disk storage system services multiple users who access multiple disks, the unnecessary transmission of entire tracks consumes considerable bandwidth and thereby significantly interferes with disk access operations of other users.
Whether or not to read an entire track, rather than merely the data specifically requested from that track, is a decision that requires the disk storage system to anticipate whether additional data from that track is likely to be needed in the future. In a known method for doing so, the disk storage system maintains a global cache memory that is accessible both to a host computer and to a back-end processor in communication with a multiplicity of disks. The global cache memory is divided into logical volumes consisting of a large number of slots, each of which is sized to correspond to a physical track on a disk. Each track on a disk is assigned to a logical volume consisting of a large number of other tracks. Portions of some of these tracks may have already been copied into corresponding slots in the global cache memory. A disk storage system having the foregoing structure is described in Bachmat, U.S. Pat. No. 6,003,114, the contents of which are herein incorporated by reference.
Upon receiving a request for data, the disk storage system first checks to see if that data is already in a cache slot. If the data is already in a cache slot, the disk storage system retrieves the data directly from the cache slot. Such an event is referred to as a "read-hit." A read-hit is a desirable outcome because a read from the cache slot avoids latencies associated with reading from a physical disk drive.
In some cases, the disk storage system discovers that the desired data is not in the global cache memory at all. Instead, it resides on a disk. In this case, the disk storage system instructs a disk controller to retrieve the desired data from an appropriate track on a disk. Such an event is referred to as a "read-miss." A read-miss is an undesirable outcome because such an operation is afflicted with latencies associated with mechanical motion within the disk drive and possible latencies associated with data transmission between the global cache memory and the disk drive.
In response to a read-miss, a back-end processor fetches the desired data and transmits it to the global cache memory. If the back-end processor detects a second request for data from the same track within a selected interval, it responds by fetching the remainder of the track.
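The known method described above can be illustrated with a minimal sketch. The class name `BaselineCache`, the block-level granularity, the `interval` parameter, and the injectable `clock` are all hypothetical choices made for illustration; the source does not specify how the selected interval is measured or how slots are organized internally.

```python
import time


class BaselineCache:
    """Toy model of the known method (names and structure are assumptions).

    On a read-miss only the requested data is fetched. If a second miss on
    the same track arrives within `interval` seconds, the remainder of the
    track is fetched as well.
    """

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.slots = {}      # track_id -> set of cached block numbers
        self.last_miss = {}  # track_id -> time of the most recent read-miss

    def read(self, track_id, block, blocks_per_track):
        cached = self.slots.setdefault(track_id, set())
        if block in cached:
            return "read-hit"

        now = self.clock()
        previous = self.last_miss.get(track_id)
        self.last_miss[track_id] = now

        if previous is not None and now - previous <= self.interval:
            # Second request for this track within the interval:
            # bring the rest of the track into the slot.
            cached.update(range(blocks_per_track))
            return "read-miss: fetched remainder of track"

        cached.add(block)
        return "read-miss: fetched requested data only"
```

In this sketch the first miss on a track always pays the full positioning latency for a small transfer, which is the behavior the next paragraph identifies as a disadvantage.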
A disadvantage of the foregoing method is that each response to a read-miss assumes that no additional data from the track will be needed in the near future. It makes this assumption even though prior requests for data from the logical volume containing that track may have consistently resulted in additional requests for data from the same logical volume.
The method of the invention adaptively selects an optimal pre-fetch policy on the basis of the observed frequency of avoidable and unavoidable read-misses. As the relative frequencies of avoidable and unavoidable read-misses change over time, the method of the invention causes the pre-fetch policy to switch between a first pre-fetch policy, in which a request for desired data from a data-set is satisfied by reading the desired data, and a second pre-fetch policy, in which a request for desired data from a data-set is satisfied by reading the data-set.
Upon the basis of statistics collected on the number of avoidable read-misses, a first threshold value is defined. When an unavoidable read-miss is detected, a random number is generated and compared with the threshold value. On the basis of a sign of a difference between the threshold value and the random number, the optimal pre-fetch policy is selected from the first and second pre-fetch policies.
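The comparison between the threshold value and the random number might look like the following sketch. The function name `select_policy`, the policy labels, the assumption that the threshold lies in [0, 1], and the choice of which sign selects which policy are all illustrative assumptions; the text above states only that the sign of the difference determines the selection.

```python
import random


def select_policy(threshold, rng=random.random):
    """Choose a pre-fetch policy from the sign of (threshold - random number).

    Assumption: `threshold` lies in [0, 1] and a larger value makes the
    second policy (read the whole data-set) more likely. The source does not
    say which sign maps to which policy.
    """
    draw = rng()  # uniform random number in [0, 1)
    if threshold - draw > 0:
        return "read-data-set"      # second pre-fetch policy
    return "read-desired-data"      # first pre-fetch policy
```

Under these assumptions, a threshold of 0.8 selects the second policy on roughly 80% of the read-misses on which the selection is made, so the choice is probabilistic rather than a hard switch.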
The statistics for determining the frequency of avoidable read-misses are embodied in a random-walk variable whose value is updated in response to detection of an avoidable read-miss. The value of this random-walk variable is thus indicative of a likelihood of an avoidable read-miss. The value of the random-walk variable can also be updated in response to detection of an unavoidable read-miss.
The random-walk variable can be changed by determining a threshold read-miss probability at which the optimal pre-fetch policy transitions from the first pre-fetch policy to the second pre-fetch policy. The value of the random-walk variable is then changed by an amount that depends on the threshold read-miss probability.
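One way to picture such a random-walk variable is sketched below. The step sizes are an assumption chosen for illustration, not a rule stated in the text: the variable rises by (1 - p*) on an avoidable read-miss and falls by p* on an unavoidable one, where p* stands for the threshold read-miss probability. The class name `RandomWalk` and the clamping bounds `lo` and `hi` are likewise hypothetical.

```python
class RandomWalk:
    """Random-walk estimate of how likely avoidable read-misses are.

    Assumed update rule (not taken from the source): step up by
    (1 - p_threshold) on an avoidable read-miss, step down by p_threshold
    on an unavoidable one, and clamp the result to [lo, hi].
    """

    def __init__(self, p_threshold, lo=-10.0, hi=10.0):
        if not 0.0 < p_threshold < 1.0:
            raise ValueError("p_threshold must lie strictly between 0 and 1")
        self.p_threshold = p_threshold
        self.lo = lo
        self.hi = hi
        self.value = 0.0

    def on_read_miss(self, avoidable):
        """Update the variable for one classified read-miss."""
        if avoidable:
            step = 1.0 - self.p_threshold
        else:
            step = -self.p_threshold
        self.value = min(self.hi, max(self.lo, self.value + step))
        return self.value
```

With these step sizes, if avoidable read-misses occur with probability p, the expected step is p(1 - p*) - (1 - p)p* = p - p*. The variable therefore tends to drift upward when p exceeds the threshold probability and downward when it falls below it, which is one plausible reason for making the step depend on that probability.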
The method can also include the step of classifying a read-miss as an avoidable read-miss or an unavoidable read-miss. This step can be performed by maintaining, and then inspecting, a flag associated with the data set, the value of which indicates whether or not data from that data set has previously been requested. In the case of a distributed disk storage system made up of individual disk storage systems, the flag can also include information indicative of the identity of the system from which a request for data is made.
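The flag-based classification can be sketched as follows. This reading treats a read-miss as avoidable when the flag shows that data from the same data set was requested earlier, since reading the whole data set at that earlier time would have prevented the miss. The names `MissClassifier`, `classify`, and `evict`, and the use of a set of system identifiers as the flag, are hypothetical.

```python
class MissClassifier:
    """Classify read-misses using a per-data-set 'previously requested' flag.

    Assumption: the flag is modeled as the set of system identifiers that
    have requested data from the data set. An empty set means the flag is
    clear.
    """

    def __init__(self):
        self.requested_by = {}  # data_set_id -> set of requesting system ids

    def classify(self, data_set_id, system_id="local"):
        """Return 'avoidable' or 'unavoidable' for a read-miss, then set the flag."""
        seen = self.requested_by.setdefault(data_set_id, set())
        kind = "avoidable" if seen else "unavoidable"
        seen.add(system_id)
        return kind

    def evict(self, data_set_id):
        """Clear the flag, for example when the cache slot is reused."""
        self.requested_by.pop(data_set_id, None)
```

Whether a request from a different system should count toward an avoidable read-miss is a design choice the text leaves open; this sketch counts any earlier request and merely records which systems made them.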