InfiniBand™ (IB) is a switched-fabric communications architecture that is widely used in high-performance computing. It has been standardized by the InfiniBand Trade Association in the InfiniBand™ Architecture Specification Volume 1 (Release 1.3, Mar. 3, 2015). Computing devices (host processors and peripherals) connect to the IB fabric via a network interface controller (NIC), which is referred to in IB parlance as a channel adapter. Host processors (or hosts) use a host channel adapter (HCA), while peripheral devices use a target channel adapter (TCA).
Client processes running on a host processor, such as software application processes, communicate with the transport layer of the IB fabric by manipulating a transport service instance, known as a “queue pair” (QP), made up of a send work queue and a receive work queue. To send and receive messages over the network using a HCA, the client initiates work requests (WRs), which cause work items, called work queue elements (WQEs), to be placed onto the appropriate work queues. Normally, each WR has a data buffer associated with it, to be used for holding the data that is to be sent or received in executing the WQE. The HCA executes the WQEs and thus communicates with the corresponding QP of the channel adapter at the other end of the link. After it has finished servicing a WQE, the HCA typically writes a completion report, in the form of a completion queue element (CQE), to a completion queue, to be read by the client as an indication that the work request has been executed.
IB channel adapters implement various service types and transport operations, including remote direct memory access (RDMA) read and write operations, as well as send operations. Both RDMA write and send requests carry data sent by a channel adapter (known as the requester) and cause another channel adapter (the responder) to write the data to a memory address at its own end of the link. Whereas RDMA write requests specify the address in the remote responder's memory to which the data are to be written, send requests rely on the responder to determine the memory location at the request destination.
Upon receiving a send request addressed to a certain QP, the channel adapter at the destination node places the data sent by the requester into the next available receive buffer for that QP and generates a CQE to notify the responder as to the location of the data. To specify the receive buffers to be used for such incoming send requests, a client process on the host computing device generates receive WQEs and places them in the receive queues of the appropriate QPs. Each time a valid send request is received, the destination channel adapter takes the next WQE from the receive queue of the destination QP and places the received data in the memory location specified in that WQE. Thus, every valid incoming send request engenders a receive queue operation by the responder.
U.S. Patent Application Publication 2015/0269116, whose disclosure is incorporated herein by reference, describes remote transactions using transactional memory, which are carried out over a data network between an initiator host and a remote target. The transaction comprises a plurality of input-output (IO) operations between an initiator network interface controller and a target network interface controller. The IO operations are controlled by the initiator network interface controller and the target network interface controller to cause the first process to perform accesses to the memory location atomically.