The present invention relates generally to the data processing field, and more particularly, relates to a system and method for scheduling hard disk drive random commands with locational uncertainty.
In random access storage devices, such as hard disk drives, when there is more than one command to execute, the data to be accessed next is chosen from a list or a queue of outstanding commands. The hard disk drive includes firmware performing a scheduling algorithm to determine the optimal command execution order. In general, the goal of the scheduling algorithm is to minimize the average access time for its commands. Presently, hard disk drives use a Shortest-Access Time First (SATF) algorithm.
The conventional SATF algorithm works as follows: given a set of commands in a queue, the command that can be started or accessed first is chosen. The access time calculation has two parts: the time to perform the seek and settle operation from the current cylinder to the target cylinder, and the rotational latency between this point and when the starting sector for the command is reached. The SATF algorithm depends on accurate estimates of this access time. If the estimate is too low, the actuator may settle on track after the desired sector has already passed rotationally. This is called a miss. A miss adds one full revolution to the access time, degrading performance. If the access time estimate is too high, the optimal command candidate is not chosen for execution.
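The two-part access time calculation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the track geometry, the revolution time, the linear seek model in seek_time_ms, and the names Command, access_time_ms and pick_satf are hypothetical and are not taken from any particular drive firmware.

```python
from dataclasses import dataclass

SECTORS_PER_TRACK = 100  # assumed geometry, for illustration only
REV_TIME_MS = 10.0       # assumed time for one full revolution


@dataclass
class Command:
    name: str
    cylinder: int
    start_sector: int


def seek_time_ms(from_cyl: int, to_cyl: int) -> float:
    """Hypothetical seek-and-settle model: fixed cost plus a linear term."""
    distance = abs(to_cyl - from_cyl)
    return 0.0 if distance == 0 else 1.0 + 0.01 * distance


def access_time_ms(cur_cyl: int, cur_sector: int, cmd: Command) -> float:
    """Seek-and-settle time plus rotational latency to the starting sector."""
    seek = seek_time_ms(cur_cyl, cmd.cylinder)
    sector_time = REV_TIME_MS / SECTORS_PER_TRACK
    # Rotational position (in sectors) under the head when the seek completes.
    arrival = (cur_sector + seek / sector_time) % SECTORS_PER_TRACK
    # Sectors that must still pass before the starting sector arrives. If the
    # starting sector has already passed, the modulo wraps around, which is
    # the wait for the next revolution after a miss.
    wait_sectors = (cmd.start_sector - arrival) % SECTORS_PER_TRACK
    return seek + wait_sectors * sector_time


def pick_satf(cur_cyl: int, cur_sector: int, queue: list) -> Command:
    """Choose the queued command with the shortest access time."""
    return min(queue, key=lambda c: access_time_ms(cur_cyl, cur_sector, c))


queue = [Command("A", 100, 30), Command("B", 500, 50), Command("C", 200, 90)]
best = pick_satf(0, 0, queue)
```

In this sketch, command B is a case where the starting sector has already passed when the seek completes, so its access time includes most of an additional revolution.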
To implement a SATF algorithm, the latency between the current position and the target position must be analyzed. The command having the shortest access time and an acceptable probability of success is chosen by the algorithm. The probability of success is the probability that the command will be executed in the expected amount of time without one or more missed revolutions.
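The selection rule just described, shortest access time subject to an acceptable probability of success, might be sketched as follows. The threshold value, the tuple layout and the name pick_with_threshold are hypothetical, and returning None when no candidate qualifies is an assumption, since the text does not say what happens in that case.

```python
def pick_with_threshold(candidates, min_success=0.9):
    """Pick the shortest-access-time candidate whose success probability
    is acceptable.

    candidates: list of (name, access_time_ms, p_success) tuples.
    min_success: assumed acceptance threshold (illustrative value).
    """
    acceptable = [c for c in candidates if c[2] >= min_success]
    if not acceptable:
        return None  # assumption: no fallback is defined in the text
    return min(acceptable, key=lambda c: c[1])


candidates = [("A", 3.0, 0.60), ("B", 4.0, 0.95), ("C", 5.0, 0.99)]
chosen = pick_with_threshold(candidates)
```

Here candidate A has the shortest access time but is rejected because its probability of success falls below the assumed threshold, so B is chosen.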
One problem with the typical SATF algorithm is that each command in the queue is classified in a binary manner. Desirable candidates are determined to be either reachable or unreachable in a given number of revolutions. Unfortunately, due to the large number of external factors affecting actual seek performance, binary decisions are inadequate. Using the probability of success to calculate an expected value is preferred.
One underlying assumption made with all SATF algorithms is that both the last sector of the current command and the first sector of the candidate command can be accurately determined. If the last sector of the current command and the first sector of the candidate command are not known, the latency between commands cannot be accurately determined and thus the access time cannot be accurately calculated. In the normal case of command scheduling, the next command to execute must be chosen before the completion of the current command. Otherwise, the average command time for a given set of commands would increase by the time required to sort the commands. Thus, given the timing of the sort and locational uncertainty for target commands, both the ending location of the current command and the starting location of each candidate command are unknown.
There are a number of potential applications in which locational uncertainty is introduced, but for which a substantial competitive advantage could be obtained. This competitive advantage can only be realized, however, if the performance impact of not knowing the exact starting location of a command can be minimized.
Examples of features that provide significant competitive advantage but at the cost of locational uncertainty include compression and a file system on a disk. Nearly any variable-bit rate compression scheme introduces locational uncertainty. An approximate location is determined, and the actual location of the data is identified when the data is actually read. Since the exact location is determined after the seek has been completed, only the estimated location is available at the time command scheduling is done. In the file system on a disk, information regarding the location of a section of data could be stored on the disk, rather than being stored in random access memory (RAM). Then an approximate location is determined, the file system data is read, and the actual location of the data is identified. Given the potential gain from these and other features, as well as the potential performance impacts, a way to minimize the performance impact of locational uncertainty is needed.
A need exists for an improved system and method for hard disk drive random command queue ordering to minimize the performance impact of locational uncertainty of commands.
A principal object of the present invention is to provide an improved system and method for hard disk drive random command queue ordering. Other important objects of the present invention are to provide such system and method for hard disk drive command queue ordering that efficiently and effectively facilitates hard disk drive command queue ordering while minimizing the performance impact of locational uncertainty of commands and enabling expected access time accuracy; to provide such system and method for hard disk drive command queue ordering substantially without negative effect and that overcome many of the disadvantages of prior art arrangements.
In brief, a system and method are provided for hard disk drive command queue ordering with locational uncertainty of commands. For each candidate command in the hard disk drive command queue, an expected access time is calculated utilizing a probability distribution for a candidate command. A command in the hard disk drive command queue having a minimum calculated expected access time is identified. Then the identified command having a minimum calculated expected access time is executed.
In accordance with features of the invention, a probability distribution for a currently executing command represents an ending location distribution for the currently executing command. The probability distribution for a candidate command represents a starting location distribution for the candidate command. For an estimated seek time of less than a time for one full revolution, a probability of a miss multiplied by a time of one extra revolution and multiplied by a candidate arrival probability is calculated and the result is added to an estimated seek time to provide the expected access time. For an estimated seek time of greater than a time for one full revolution, a probability of a make multiplied by a time of one extra revolution and multiplied by a candidate arrival probability is calculated and the result is subtracted from an estimated seek time to provide the expected access time.
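The two cases above can be illustrated with a minimal sketch that applies the stated adjustment to each candidate and then selects the minimum. It rests on several assumptions: the miss probability and the candidate arrival probability are taken as given inputs rather than derived from the location distributions, the probability of a make is treated as the complement of the probability of a miss, and an estimate exactly equal to one revolution (a case the text does not address) is handled by the second branch. The names expected_access_time and pick_min_expected are hypothetical.

```python
def expected_access_time(est_seek_time, rev_time, p_miss, p_arrival):
    """Adjust an estimated seek time by the expected cost or gain of one
    revolution, weighted by the candidate arrival probability."""
    if est_seek_time < rev_time:
        # Estimate under one revolution: a miss costs one extra revolution.
        return est_seek_time + p_miss * rev_time * p_arrival
    # Estimate of one revolution or more (equality is an assumption here):
    # a make saves one revolution.
    p_make = 1.0 - p_miss  # assumption: make is the complement of miss
    return est_seek_time - p_make * rev_time * p_arrival


def pick_min_expected(candidates, rev_time):
    """candidates: list of (name, est_seek_time, p_miss, p_arrival) tuples."""
    return min(
        candidates,
        key=lambda c: expected_access_time(c[1], rev_time, c[2], c[3]),
    )


REV_TIME = 10.0  # assumed revolution time, in milliseconds
candidates = [
    ("A", 4.0, 0.50, 1.0),   # 4.0 + 0.50 * 10.0 = 9.0
    ("B", 6.0, 0.10, 1.0),   # 6.0 + 0.10 * 10.0 = 7.0
    ("C", 14.0, 0.25, 1.0),  # 14.0 - 0.75 * 10.0 = 6.5
]
chosen = pick_min_expected(candidates, REV_TIME)
```

With these illustrative numbers, candidate C has the largest nominal estimate but the smallest expected access time, because it is likely to be reached one revolution earlier than estimated.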