The computer industry is moving toward fast, packetized, serial input/output (I/O) bus architectures, in which computing hosts and peripherals are linked by a switch network, commonly referred to as a switch fabric. A number of architectures of this type have been proposed, culminating in the “InfiniBand™” (IB) architecture, which has been advanced by a consortium led by a group of industry leaders (including Intel, Sun Microsystems, Hewlett Packard, IBM, Dell and Microsoft). The IB architecture is described in detail in the InfiniBand Architecture Specification, Release 1.0 (October, 2000), which is incorporated herein by reference. This document is available from the InfiniBand Trade Association at www.infinibandta.org.
Computing devices (hosts or peripherals) connect to the IB fabric via a network interface adapter, which is referred to in IB parlance as a channel adapter. The IB specification defines both a host channel adapter (HCA) for connecting a host processor to the fabric, and a target channel adapter (TCA), intended mainly for connecting peripheral devices to the fabric. Typically, the channel adapter is implemented as a single chip, with connections to the computing device and to the network. Client processes running on a computing device communicate with the transport layer of the IB fabric by manipulating a transport service instance, known as a “queue pair” (QP), made up of a send work queue and a receive work queue. The IB specification permits the HCA to allocate as many as 16 million (224) QPs, each with a distinct queue pair number (QPN). A given client process (referred to simply as a client) may open and use multiple QPs simultaneously.
To send and receive communications over the network, the client initiates work requests (WRs), which cause work items, called work queue elements (WQEs), to be placed in the appropriate queues. The channel adapter then executes the work items, so as to communicate with the corresponding QP of the channel adapter at the other end of the link. In both generating outgoing messages and servicing incoming messages, the channel adapter uses context information pertaining to the QP carrying the message. The QP context is created in a memory accessible to the channel adapter when the QP is set up, and is initially configured with fixed information such as the destination address, negotiated operating limits, service level and keys for access control. Typically, a variable part of the context, such as the current packet sequence number (PSN) and information regarding the WQE being serviced by the QP, is subsequently updated by the channel adapter as it sends and receives messages. After it has finished servicing a WQE, the channel adapter may write a completion queue element (CQE) to a completion queue, to be read by the client.
The QP that initiates a particular operation, i.e. injects a message into the fabric, is referred to as the requester, while the QP that receives the message is referred to as the responder. An IB operation is defined to include a request message generated by the requester and, as appropriate, its corresponding response generated by the responder. (Not all request messages have responses.) Each message consists of one or more IB packets. A given channel adapter is typically configured to serve simultaneously both as a requester, transmitting requests and receiving responses on behalf of local clients, and as a responder, receiving requests from other channel adapters and returning responses accordingly.
IB request messages include, inter alia, remote direct memory access (RDMA) write and send requests, RDMA read requests, and atomic read-modify-write requests. Both RDMA write and send requests cause the responder to write data to a memory address at its own end of the link. Whereas RDMA write requests specify the address in the remote responder's memory to which the data are to be written, send requests rely on the responder to determine the memory location at the request destination. Therefore, to process incoming send requests, the destination computing device must generate receive WQEs and place them in the proper receive queues in its memory. Each receive WQE includes a scatter list indicating a location or locations available in the memory of the destination computing device. Whenever a valid send request is received, the destination channel adapter takes the next WQE from the receive queue and places the received data in the memory location(s) specified in the scatter list of that WQE. Typically, the channel adapter then places a CQE on the completion queue, indicating to the computing device that the receive operation was completed. Thus, every valid incoming send request engenders a receive queue operation by the remote responder.