1. Field of the Invention
This invention relates to computer systems and, more particularly, to network interconnection hardware and software.
2. Description of the Related Art
The use of modern computer systems is placing an increased demand on computer system network bandwidth. Higher performance system servers, mass storage devices and input/output devices with higher bandwidth and lower latency have been outpacing the existing interconnection technologies. Thus a system interconnection solution was needed that would overcome some of the bandwidth and latency problems associated with existing interconnection technologies.
One such interconnection solution is the Infiniband™ Switched Fabric. The Infiniband architecture is a point-to-point interconnection fabric where nodes are interconnected via switching devices. In particular, the Infiniband architecture describes a system area network in which independent processing nodes may be interconnected with I/O devices. The Infiniband architecture is described in detail in the Infiniband™ Architecture Specification available from the Infiniband™ trade association.
The Infiniband specification defines an interface between a processing node operating system and the node's hardware interface to the fabric. The hardware interface on a processing node is referred to as a hardware channel adapter. One fundamental idea behind Infiniband is the ability of a client process to place instructions in a queue for the hardware to execute. The queue is referred to as a work queue. Each work queue is created in a pair called a queue pair. Each queue pair has a send queue and a receive queue. The queue pair creates a virtual communication port for a client process to communicate with other processes and end nodes. A queue pair is an abstract construct of memory locations and may have a predetermined number of entries that hold a predetermined number of work queue elements. An instruction is stored in the queue pair in the form of a work queue element. The hardware channel adapter services work queue elements in the order they are received into the queue pair.
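The queue pair arrangement described above can be illustrated with a minimal Python sketch. It is a software model only, not an implementation of the Infiniband specification or of any channel adapter. The names WorkQueueElement, WorkQueue, QueuePair, post and service_next, and the opcode and payload fields, are hypothetical and chosen purely for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkQueueElement:
    """One instruction for the channel adapter (fields are illustrative)."""
    opcode: str
    payload: bytes = b""


class WorkQueue:
    """Fixed-capacity queue whose elements are serviced in arrival order."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._entries: deque = deque()

    def post(self, wqe: WorkQueueElement) -> bool:
        """Add a work queue element; return False when every entry is in use."""
        if len(self._entries) >= self.capacity:
            return False
        self._entries.append(wqe)
        return True

    def service_next(self):
        """Remove and return the oldest element, or None when the queue is empty."""
        return self._entries.popleft() if self._entries else None


@dataclass
class QueuePair:
    """A send queue and a receive queue that are created together."""
    capacity: int
    send: WorkQueue = field(init=False)
    receive: WorkQueue = field(init=False)

    def __post_init__(self) -> None:
        self.send = WorkQueue(self.capacity)
        self.receive = WorkQueue(self.capacity)
```

In this model, a client process would create one QueuePair, post elements to its send queue, and the adapter side would call service_next to obtain them in the order in which they were posted.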
Although the Infiniband specification describes a channel interface between the operating system and the fabric, the Infiniband specification does not specify how the hardware channel adapter or the software driving the hardware channel adapter must be implemented. Therefore, a solution is needed to bridge a host processing node to the Infiniband fabric.
Various embodiments of a method and apparatus for manipulating work queue elements are disclosed. In one embodiment, a hardware channel adapter may service work queue elements that are stored in a queue pair. The hardware channel adapter may include hardware registers that may track which work queue elements are currently being serviced and which work queue element will be serviced next. A software driver may cause the work queue elements to be stored in the queue pair. The software driver notifies the hardware channel adapter when there are new work queue elements to service. Additionally, each work queue element may include an indication of whether the work queue element has completed. The software driver may cause a new work queue element to be stored in the location previously occupied by the completed work queue element upon detecting the completion indication. Thus, the combination of the hardware channel adapter and the software driver may allow work queue elements to be serviced in a first-come, first-served manner even if the work queue elements become non-contiguous in memory. Such non-contiguity may arise when out-of-order completions free storage locations and the software driver subsequently reuses those locations as it adds new work queue elements to the queue.
Broadly speaking, in one embodiment, an apparatus including a software driver and a hardware adapter is contemplated. The software driver is configured to cause a plurality of work queue elements to be stored in a queue pair including a plurality of storage locations. Each of the plurality of storage locations includes an indicator indicating whether a corresponding work queue element has been completed. The hardware adapter is configured to select one of the plurality of storage locations and to service a corresponding one of the plurality of work queue elements, and in response to completion of a task associated with the corresponding work queue element, to cause the indicator to indicate that the corresponding work queue element has been completed. Additionally, the software driver is configured to cause a new work queue element to be stored in the selected storage location in response to detecting that the indicator indicates that the corresponding work queue element has been completed.
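The interplay between the completion indicator and the reuse of storage locations may be easier to follow in a small Python model. The sketch below rests on assumptions that are not taken from the text: the class QueueSlots and its method names are hypothetical, an empty location is treated as already completed, the driver reuses the lowest-numbered free location, and arrival order is kept in a plain list rather than in hardware registers.

```python
class QueueSlots:
    """Storage locations, each with a 'completed' indicator (illustrative)."""

    def __init__(self, num_slots):
        self.wqe = [None] * num_slots
        # Assumption: an empty location is marked completed, i.e. free.
        self.completed = [True] * num_slots
        # Assumption: arrival order is kept as a list of location indices.
        self.pending = []

    def driver_post(self, wqe):
        """Driver side: store a new element in the first completed location."""
        for slot, done in enumerate(self.completed):
            if done:
                self.wqe[slot] = wqe
                self.completed[slot] = False
                self.pending.append(slot)
                return slot
        return None  # no location is free yet

    def adapter_select(self):
        """Adapter side: select the location of the oldest unserviced element."""
        return self.pending.pop(0) if self.pending else None

    def adapter_complete(self, slot):
        """Adapter side: set the indicator once the associated task is done."""
        self.completed[slot] = True
```

For example, with three locations holding A, B and C, if the adapter selects A and B and B completes first, a new element D is stored in location 1 and a later element E in location 0. The elements are then non-contiguous in memory, yet adapter_select still returns C, D and E in arrival order.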
In one particular implementation, the hardware adapter includes a first register for storing a virtual address of the selected storage location and the corresponding work queue element. The hardware adapter further includes a second register for indicating a number of pending work queue elements remaining to be serviced. The software driver is further configured to notify the hardware adapter when the new work queue element is stored by causing the virtual address of the new work queue element to be written to the first register of the hardware adapter. The hardware adapter is further configured to increment the second register in response to receiving the notification from the software driver.
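The notification mechanism of this implementation can be sketched as follows. This is a simplified software model under stated assumptions: the class HardwareAdapterModel, its attribute names and the example addresses are hypothetical, and the sketch models neither how the pending count decreases as elements are serviced nor how the adapter retains the addresses of earlier notifications.

```python
class HardwareAdapterModel:
    """Software stand-in for the two adapter registers (illustrative only)."""

    def __init__(self):
        # "First register": virtual address of a work queue element.
        self.wqe_address_reg = 0
        # "Second register": number of pending work queue elements.
        self.pending_count_reg = 0

    def write_wqe_address(self, virtual_address):
        """Driver notification: write the new element's address to the first register.

        The adapter reacts to the write by incrementing the second register.
        """
        self.wqe_address_reg = virtual_address
        self.pending_count_reg += 1


# Example: the driver stores two elements and notifies the adapter of each.
adapter = HardwareAdapterModel()
adapter.write_wqe_address(0x1000)  # hypothetical virtual address
adapter.write_wqe_address(0x1040)  # hypothetical virtual address
```

After the two notifications in the example, the second register in the model holds a count of two and the first register holds the most recently written address.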