FIG. 1 is a block diagram of a prior art computer system 100. Its main components are a host computer 110, a disk controller 115 and a disk drive 125. Disk drive 125 may be a commercial off-the-shelf component, sometimes referred to as a commodity disk drive, that conforms to the small computer system interface (SCSI) protocol.
Host computer 110 operates on data stored on disk drive 125. When host computer 110 wishes to read data, it issues a read-command identifying the data to disk controller 115, which, in turn, issues the read-command to disk drive 125. When disk drive 125 executes the read-command, it sends the data to disk controller 115, which passes the data to host computer 110. For the case where host computer 110 wishes to write data, it issues a write-command, and sends associated data, to disk controller 115. Thereafter, disk controller 115 issues the write-command to disk drive 125. When disk drive 125 is prepared to execute the write-command, it notifies disk controller 115, which sends the associated data to disk drive 125.
Disk controller 115 includes a processor 117 and related memory 118 for executing procedures related to the exchanges of information with host computer 110 and disk drive 125. It also includes a controller command queue 120 that contains commands that have yet to be issued to disk drive 125.
Disk drive 125 also includes some local intelligence in the form of a processor (not shown) and related memory 128, which includes a disk drive command queue 130. Disk drive command queue 130 contains commands that have been issued to, but not yet executed by, disk drive 125.
The performance of a system such as computer system 100 is often evaluated in terms of response time and throughput. Response time, also referred to as latency, is the interval of time between issuance of a command and when the command is executed. A short response time is preferable to a long response time. Throughput is the total number of commands processed by a system during a specified period of time. A greater throughput is preferable to a lesser throughput.
Disk controller 115 and disk drive 125 each include features that are intended to reduce response time and increase throughput. The features include prioritizing commands and organizing the commands on controller command queue 120 and disk drive command queue 130.
Disk controller 115 assigns a priority level to each command that it will issue to disk drive 125. Commands of greatest importance are assigned the highest priority level. Three priority levels are relevant to this discussion, i.e., demand stage, prestage and destage.
The demand stage priority level is highest, and is associated with commands where host computer 110 wishes to read data. Generally, when host computer 110 issues a read-command to disk controller 115, host computer 110 has an immediate need for the data. Accordingly, when disk controller 115 receives a read-command from host computer 110, it assigns the demand stage priority level to the read-command.
The prestage priority level is the next highest priority and is assigned to read-commands that are initiated by disk controller 115, as opposed to read-commands initiated by host computer 110. Disk controller 115 evaluates a recent history of commands received from host computer 110, and attempts to predict a next command that host computer 110 will issue. For example, in a case where host computer 110 has issued read-commands for two adjacent data blocks, disk controller 115 may predict that the next command from host computer 110 will be a read-command for a third adjacent data block. In anticipation of this command, disk controller 115 issues a read-command to disk drive 125 for the third adjacent data block. Such a read-command from disk controller 115, made in anticipation of a read-command from host computer 110, is assigned the prestage priority level.
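The prediction described above can be illustrated with a minimal sketch. It assumes that data blocks are identified by integer block numbers and that only the two most recent read-commands are examined; the function name `predict_prestage_block` is hypothetical, and an actual disk controller may evaluate a longer or more elaborate history.

```python
from typing import Optional, Sequence


def predict_prestage_block(recent_reads: Sequence[int]) -> Optional[int]:
    """Return the block number to read ahead, or None if no pattern is seen.

    recent_reads holds the block numbers of host read-commands, oldest first.
    """
    if len(recent_reads) < 2:
        return None  # not enough history to make a prediction
    previous, last = recent_reads[-2], recent_reads[-1]
    # Two adjacent blocks read in ascending order suggest a sequential scan,
    # so the third adjacent block is the candidate for a prestage read.
    if last == previous + 1:
        return last + 1
    return None


# The host has read blocks 7 and 8, so block 9 is predicted.
print(predict_prestage_block([7, 8]))   # 9
# Non-adjacent reads give no prediction.
print(predict_prestage_block([7, 10]))  # None
```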
The destage priority level is lower than the prestage priority level. Disk controller 115 assigns the destage priority level to write-commands for data to be written to disk drive 125. When host computer 110 writes data to a storage device, it generally has finished processing the data, at least for the short term. Host computer 110 passes the data to disk controller 115 and then moves on to other business. Host computer 110 does not wait for disk controller 115 to actually write the data to disk drive 125, so there is no immediate urgency for disk drive 125 to execute a write-command.
Disk controller 115 organizes commands on controller command queue 120 in order of priority. That is, commands with the highest priority level are placed at the head of controller command queue 120, while commands with the lowest priority level are placed at the tail. Commands with the same priority level are ordered according to the amount of time they have been on the queue. That is, commands are ordered according to priority and age. The command at the head of controller command queue 120 is the next command that disk controller 115 will issue to disk drive 125. This organization of commands on controller command queue 120 is intended to minimize response time for the highest priority commands.
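The ordering by priority and age can be sketched as follows. This is an illustrative model only: the class name `ControllerQueue`, the numeric priority values, and the use of an arrival counter to represent age are assumptions made for the example, not details taken from disk controller 115.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Lower value means higher priority (assumed encoding).
    DEMAND_STAGE = 0
    PRESTAGE = 1
    DESTAGE = 2


class ControllerQueue:
    """Commands ordered by priority level, then by time on the queue."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # smaller number = older command

    def enqueue(self, name, priority):
        heapq.heappush(self._heap, (int(priority), next(self._arrival), name))

    def issue_next(self):
        """Remove and return the command at the head of the queue."""
        _, _, name = heapq.heappop(self._heap)
        return name

    def __len__(self):
        return len(self._heap)


queue = ControllerQueue()
queue.enqueue("write A", Priority.DESTAGE)
queue.enqueue("prefetch B", Priority.PRESTAGE)
queue.enqueue("read C", Priority.DEMAND_STAGE)
queue.enqueue("read D", Priority.DEMAND_STAGE)
# Demand stage commands are issued first, the older one ahead of the newer.
print([queue.issue_next() for _ in range(len(queue))])
# ['read C', 'read D', 'prefetch B', 'write A']
```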
Disk controller 115 can promote commands from the prestage priority level to the demand stage priority level. When such a command is promoted to demand stage, the command is placed after other commands of the demand stage priority. Note that commands of the destage priority level are not eligible for promotion.
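Promotion can be modeled with a short sketch. It assumes the queue is a head-first list of (priority, name) pairs that is already ordered by priority, with 0 denoting demand stage; the function name `promote` and the error handling are hypothetical.

```python
DEMAND_STAGE, PRESTAGE, DESTAGE = 0, 1, 2  # assumed encoding, 0 is highest


def promote(queue, name):
    """Return a new queue with the named prestage command promoted.

    queue is a head-first list of (priority, name) pairs ordered by priority.
    """
    for index, (priority, command) in enumerate(queue):
        if command == name:
            if priority != PRESTAGE:
                # Destage commands are not eligible for promotion.
                raise ValueError("only prestage commands can be promoted")
            break
    else:
        raise KeyError(name)
    rest = queue[:index] + queue[index + 1:]
    # The promoted command goes after the existing demand stage commands.
    insert_at = sum(1 for priority, _ in rest if priority == DEMAND_STAGE)
    return rest[:insert_at] + [(DEMAND_STAGE, name)] + rest[insert_at:]


queue = [
    (DEMAND_STAGE, "read A"),
    (DEMAND_STAGE, "read B"),
    (PRESTAGE, "prefetch C"),
    (PRESTAGE, "prefetch D"),
    (DESTAGE, "write E"),
]
# "prefetch D" moves behind "read B" and ahead of "prefetch C".
print(promote(queue, "prefetch D"))
```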
Disk controller 115 also employs an aging algorithm that advances the priority level of all commands on controller command queue 120 after a predetermined period of time. More specifically, after the predetermined time has elapsed, all commands on controller command queue 120 are advanced to a higher priority level. Note that the aging algorithm applies to all commands regardless of priority level. The aging algorithm is intended to prevent a low priority command from starving, i.e., not being serviced, in the case where newly received commands are of a higher priority level.
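A minimal sketch of one aging pass follows. The description above does not state how far each command advances or what happens to a command that is already at the highest level, so the sketch assumes a one-level advance that is capped at the highest level; the function name `apply_aging` and the numeric encoding are likewise assumptions.

```python
HIGHEST = 0  # assumed encoding: 0 = demand stage, 1 = prestage, 2 = destage


def apply_aging(queue):
    """Advance every command one priority level, capped at the highest.

    queue is a head-first list of (priority, name) pairs. The sort is
    stable, so commands that end up at the same level keep their relative
    order and older commands stay ahead of newer ones.
    """
    aged = [(max(priority - 1, HIGHEST), name) for priority, name in queue]
    return sorted(aged, key=lambda command: command[0])


queue = [(0, "read A"), (1, "prefetch B"), (2, "write C")]
queue = apply_aging(queue)
print(queue)  # [(0, 'read A'), (0, 'prefetch B'), (1, 'write C')]
# After a second pass the write has reached the highest level, so newly
# received demand stage commands queue up behind it instead of starving it.
print(apply_aging(queue))  # [(0, 'read A'), (0, 'prefetch B'), (0, 'write C')]
```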
Disk drive 125 holds commands, which it has yet to execute, on disk drive command queue 130. Disk drive 125 can operate in either of two modes, i.e., In Order Mode or Reorder Mode. During In Order Mode, disk drive 125 places commands onto disk drive command queue 130, and executes the commands, in the order they have been received from disk controller 115. In Reorder Mode, disk drive 125 changes the order of commands on disk drive command queue 130 to minimize seek time and rotational latency between execution of consecutive commands, and thus improve throughput.
In Reorder Mode, as the number of commands on disk drive command queue 130 increases, disk drive 125 becomes more efficient because it has more commands from which to choose when selecting a next command to execute. Therefore, throughput increases. However, as the number of commands on disk drive command queue 130 increases, the potential maximum latency for a given command also increases. Also, note that disk drive 125 is not aware of the priority level used by disk controller 115, and consequently, in Reorder Mode it may execute one or more low priority commands before a high priority command.
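The tradeoff between the two modes can be illustrated with a simplified sketch. It assumes each command targets a single integer track number and uses a shortest-seek-first policy as a stand-in for the drive's reordering; an actual disk drive also accounts for rotational latency, and its selection logic is not described here. The function names are hypothetical.

```python
def reorder_shortest_seek(commands, head_position):
    """Return commands in the order a shortest-seek-first policy picks them.

    commands is a list of (name, track) pairs in the order received.
    """
    pending = list(commands)
    order = []
    position = head_position
    while pending:
        # Choose the pending command whose track is nearest the head.
        nearest = min(pending, key=lambda c: abs(c[1] - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest[1]
    return order


def total_seek(order, head_position):
    """Sum of track-to-track distances traveled when executing order."""
    position, total = head_position, 0
    for _, track in order:
        total += abs(track - position)
        position = track
    return total


received = [("C1", 90), ("C2", 10), ("C3", 85), ("C4", 15)]
print(total_seek(received, 0))            # In Order Mode: 315 tracks
reordered = reorder_shortest_seek(received, 0)
print([name for name, _ in reordered])    # ['C2', 'C4', 'C3', 'C1']
print(total_seek(reordered, 0))           # Reorder Mode: 90 tracks
```

In this example the reordered schedule travels far fewer tracks, but C1, the first command received, is executed last, which shows how reordering can raise the maximum latency of an individual command.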
Another feature, referred to as the Head of Queue Option, allows disk controller 115 to assert that a particular command is to be placed at the head of disk drive command queue 130. When disk controller 115 issues a command and designates the Head of Queue Option, the designation prevents disk drive 125 from reordering disk drive command queue 130, and the designated command is the next to be executed by disk drive 125. This feature minimizes latency for the designated command, but sacrifices disk drive efficiency and increases the latency of other commands that are on disk drive command queue 130.
A problem occurs when several successive commands are issued with the Head of Queue Option faster than the rate at which disk drive 125 can execute the commands. Under such a circumstance, three commands issued with the Head of Queue Option in the sequence of C1, C2 and C3 will be executed in the order of C3, C2 and C1. The oldest command, C1, will be the last executed regardless of its priority level. Also, because of the Head of Queue Option, the disk drive cannot reorder the commands to improve throughput. Consequently, the Head of Queue Option adversely impacts the response time of command C1 and the overall throughput of computer system 100.
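The reversal of C1, C2 and C3 can be reproduced with a short sketch. It assumes that all of the commands reach the disk drive before any of them is executed, and it models disk drive command queue 130 as a double-ended queue; the function name is hypothetical.

```python
from collections import deque


def execution_order_with_head_of_queue(issued):
    """Return the execution order when every command uses Head of Queue.

    issued lists the commands in the order the controller sends them, all
    arriving before the drive has executed any of them.
    """
    drive_queue = deque()
    for command in issued:
        # Each head-of-queue command is placed ahead of the earlier ones.
        drive_queue.appendleft(command)
    # The drive always executes from the head of the queue.
    return [drive_queue.popleft() for _ in range(len(drive_queue))]


print(execution_order_with_head_of_queue(["C1", "C2", "C3"]))
# ['C3', 'C2', 'C1']
```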
An ideal system would minimize response time while maximizing throughput. The following patents are representative of some prior art techniques employed to address this challenge.
U.S. Pat. No. 4,425,615 to Swenson et al., entitled Hierarchical Memory System Having Cache/Disk Subsystem With Command Queues For Plural Disks, describes a disk subsystem including a plurality of disk drives wherein a command queue is provided for each disk drive. A priority value and a sequence number are assigned to each command queue so that the highest priority queued command number is executed when the disk drive corresponding to the queue becomes idle.
U.S. Pat. No. 5,548,795 to Au, entitled Method For Determining Command Execution Dependencies Within Command Queue Reordering Process, describes a method for calculating least-latency, maintaining the dependency information in a disk drive command queue, and using this information to constrain command reordering in a time and computationally efficient manner.
U.S. Pat. No. 5,469,560 to Beglin, entitled Prioritizing Pending Read Requests In An Automated Storage Library, describes an information processing system having a prioritized method of reading objects from disks in an automated storage library to minimize latency.
U.S. Pat. No. 5,729,718 to Au, entitled System For Determining Lead Time Latency As Function Of Head Switch, Seek, And Rotational Latencies And Utilizing Embedded Disk Drive Controller For Command Queue Reordering, describes a system for reordering commands received by a disk drive. Lead time latencies are determined for commands in a queue with respect to an active command. The command having the least lead time latency is selected and promoted to the head of the queue where it will be executed after the active command.
U.S. Pat. No. 5,848,226 to Chen et al., entitled Prioritized Data Transfer Through Buffer Memory In A Digital Printing System, describes a control means within a disk drive. The control means assigns priority values to commands output by software entities and executes the command having the highest priority.
The prior art techniques for reducing response time or maximizing throughput are generally directed to methods that actively reprioritize commands or reorder a queue to improve system efficiency. Typically, an improvement in terms of response time is accompanied by an impairment in throughput, or vice versa. Also, these techniques are often complex and are not necessarily compatible with commodity disk drives.
Accordingly, it is an object of the present invention to provide a disk controller and method for determining whether to issue a command to a disk drive while minimizing response time for high priority commands and maximizing throughput for all commands.
It is another object of the present invention to provide such a disk controller and method that is compatible with the operation of a commodity disk drive.