Many high-performance I/O interface devices have buffers in which to establish a queue of read and write commands. Using the buffer to establish the queue of I/O commands allows a processor or processors of a computer system to which the I/O interface device is attached to continue other computational functions, while the read and write I/O commands are processed separately. The I/O commands may be processed by a state machine or by a separate processor which is functionally part of the I/O interface device. As a result, the main computational processing functions are not delayed while awaiting the completion of the I/O commands, and the processing functionality of the computer system is enhanced.
One typical use of a queue command buffer is in a bus interface device, such as a conventional PCI bus interface device which is described in the PCI 2.2 protocol specification. The queue command buffer is typically a first-in first-out (FIFO) buffer which contains the read and write I/O commands that are to be completed. The commands in the command buffer are completed in a FIFO fashion, assuring that each command will ultimately be completed. New I/O commands are written into the top of the queue of the FIFO command buffer, and when a previous command has been fully completed the command is unloaded from the bottom of the queue of the FIFO buffer.
When an attempt is made to complete a read command from the command buffer and no data is received in response, the queue pointer remains at the position of that read command, which has incurred a delayed response. Because the delayed read command is not unloaded from the queue in the FIFO buffer, it is retried until a response is received. This continued attempted completion of the delayed read command is known as a spin on single retried request. A spin on single retried request permits only one read command to be issued until that read command has been completed. It is achieved by maintaining the position of the queue pointer at the delayed read command until that read command is completed, at which time the then-completed read command is unloaded from the queue of the FIFO buffer.
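The spin on single retried request behavior described above can be illustrated with a minimal sketch. The function name, the command names, and the `delays` mapping (the number of attempts that fail before a target responds) are hypothetical and serve only as an illustration; they are not taken from the PCI 2.2 protocol specification or from any particular device.

```python
from collections import deque


def spin_on_single_retry(commands, delays):
    """Simulate a spin on single retried request (illustrative model only).

    commands: command names in queue order, oldest first.
    delays:   hypothetical mapping of a command name to the number of
              attempts that fail before the target responds.
    Returns a list of (attempt_number, command, outcome) tuples.
    """
    queue = deque(commands)
    remaining = dict(delays)
    log = []
    attempt = 0
    while queue:
        attempt += 1
        cmd = queue[0]  # the queue pointer stays at the delayed command
        if remaining.get(cmd, 0) > 0:
            remaining[cmd] -= 1
            log.append((attempt, cmd, "retry"))
        else:
            queue.popleft()  # unload only once the command has completed
            log.append((attempt, cmd, "done"))
    return log


log = spin_on_single_retry(["R1", "R2"], {"R1": 2})
```

In this sketch, no attempt is made on R2 until R1 has completed, which reflects the restriction that only one read command is issued at a time.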
Another type of technique for handling delayed read commands in the queue of the FIFO buffer is known as head-of-list alternation. Head-of-list alternation involves an added capability to alternate or swap another read command within the FIFO buffer in place of the delayed read command at the head of the list in the queue. Thus, upon encountering a first delayed read command, and if the next command in the FIFO buffer is also a read command, the relative position of the first delayed command and the next read command is alternated, so that an attempt is made to complete the next read command while the first read command is delayed. After the swap or alternation, completion of the second command is attempted. If the second command is successfully completed, it is unloaded from the queue and completion of the first delayed read command is again attempted. If the first read command is again delayed, the head-of-list alternation will again seek to substitute another read command following the first delayed read command, if another such read command is available in the queue. However, if the next command in the FIFO buffer is not a read command, the first delayed read command is again retried until it is completed. This head-of-list alternation therefore works only if two read commands are available in sequential positions in the queue of the FIFO buffer. If a command other than a read command follows a delayed read command, head-of-list alternation is not possible.
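The swapping rule of head-of-list alternation can be sketched as follows. This is a simplified model under stated assumptions, not a definitive implementation: each attempt is assumed to take one tick, a target is assumed to have its data ready a fixed number of ticks after the first attempt on a command, and the names `head_of_list_alternation` and `latency` are hypothetical.

```python
from collections import deque


def head_of_list_alternation(commands, latency):
    """Simulate head-of-list alternation (illustrative model only).

    commands: list of (kind, name) tuples, oldest first; kind is "R" for a
              read command or "W" for any other command.
    latency:  hypothetical mapping of a command name to the number of ticks
              after its first attempt before the target responds
              (a missing entry means an immediate response).
    Returns a list of (tick, name, outcome) tuples.
    """
    queue = deque(commands)
    first_issue = {}
    log = []
    tick = 0
    while queue:
        kind, name = queue[0]
        first_issue.setdefault(name, tick)  # remember the first attempt
        if tick - first_issue[name] >= latency.get(name, 0):
            queue.popleft()  # completed: unload from the queue
            log.append((tick, name, "done"))
        else:
            log.append((tick, name, "retry"))
            # Swap only when a delayed read is directly followed by a read.
            if kind == "R" and len(queue) > 1 and queue[1][0] == "R":
                queue[0], queue[1] = queue[1], queue[0]
        tick += 1
    return log


two_reads = head_of_list_alternation(
    [("R", "A"), ("R", "B"), ("W", "C")], {"A": 3, "B": 1})
read_then_write = head_of_list_alternation(
    [("R", "A"), ("W", "C")], {"A": 2})
```

In the first call, the two sequential read commands A and B are alternated at the head of the list until one of them completes. In the second call, the command following the delayed read command is not a read command, so no swap occurs and the model falls back to retrying the single delayed read command.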
Head-of-list alternation between delayed read commands is more efficient than a spin on single retried request of the first delayed read command. Alternating between two read commands offers the opportunity to enqueue two read commands to target devices, such as memories or disk drives, for response. It also offers the possibility of receiving a response from one of the two enqueued read commands during the waiting time that would normally be encountered while waiting for a response to only a single enqueued read command. The latency in the response of a target device to a read command is spread over two target devices, and the latency is thereby diminished in relation to the number of read commands which are completed. As a consequence, the data throughput is enhanced compared to the data throughput achieved when a single delayed read command must be retried continually before any other commands in the queue command buffer can be completed.
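The overlap of target latencies can be made concrete with a small, idealized comparison. The sketch below assumes that each attempt takes one tick and that a target needs a fixed number of ticks after the first attempt before it can respond; the function name `ticks_to_drain` and the latency figures are hypothetical and chosen only for illustration.

```python
from collections import deque


def ticks_to_drain(reads, latency, alternate):
    """Count the ticks needed to complete every read command in the queue.

    reads:     read command names, oldest first.
    latency:   hypothetical mapping of a name to the number of ticks after
               the first attempt before the target responds.
    alternate: True for head-of-list alternation, False for a spin on
               single retried request.
    """
    queue = deque(reads)
    first_issue = {}
    tick = 0
    while queue:
        name = queue[0]
        first_issue.setdefault(name, tick)
        if tick - first_issue[name] >= latency[name]:
            queue.popleft()  # response received: unload the command
        elif alternate and len(queue) > 1:
            # Delayed read: try the next read while this one is pending.
            queue[0], queue[1] = queue[1], queue[0]
        tick += 1
    return tick


latency = {"A": 4, "B": 4}
spin_ticks = ticks_to_drain(["A", "B"], latency, alternate=False)
alternation_ticks = ticks_to_drain(["A", "B"], latency, alternate=True)
```

Under these assumptions, the spin on single retried request needs 10 ticks because the second read command is not issued until the first has completed, whereas alternation needs 6 ticks because both targets work on their responses during the same waiting time.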
Head-of-list alternation works for two sequential read commands in the FIFO buffer because there is no gap between them in the queue. If a gap between read commands exists, that is, if a command other than a read command separates them, head-of-list alternation is not performed and, instead, spin on single retried request is performed until the delayed read command is completed. Head-of-list alternation is accomplished only because of the ability to swap the two sequential read commands until one of them is completed at the head of the list and is unloaded from the FIFO buffer.
Although the PCI 2.2 protocol specification theoretically supports the concept of extending the number of delayed read commands beyond two, no specific technique has been described for doing so. Substantial complexities are encountered when attempting to expand the number of delayed read commands beyond two, particularly in regard to handling those delayed read commands that may have been completed between the first and the last ones of a greater number of delayed read commands. The PCI 2.2 protocol specification does not specifically address a capability for adjusting the depth or content of the number of delayed read commands between the first and last delayed read commands. Consequently, head-of-list alternation offers the possibility of completing two sequential delayed read commands, but does not extend in a straightforward manner to the possibility of attempting completion of three or more delayed read commands. In some computer systems, head-of-list alternation offers only slightly increased performance (reduced latency) compared to spin on single retried request performance, because of the extent of the delays encountered in response to read commands in complex modern computer systems.
These and other considerations have given rise to the present invention.