1. Technical Field of the Invention
This invention pertains to computer system communication. In particular, this invention provides low latency message reception and replication in an InfiniBand® multicast implementation.
2. Description of the Related Art
I/O adapters define queue pairs (QPs), each comprising a receive queue (RQ) and a send queue (SQ), for conveying messaging information from a software consumer to the adapter prior to transmission over a network fabric, and for the consumer to receive messages from an adapter coupled to the network fabric. Industry standards, such as the InfiniBand® (IB) Architecture Specification available from the InfiniBand® Trade Association and iWARP from the RDMA Consortium, specify that the message information carried on QPs is in the form of a work queue element (WQE) that carries control information pertaining to the message. In addition, one or more data descriptors point to the message data to be transmitted or to the location at which received messages are to be placed. The above-identified documents are incorporated herein by reference in their entirety.
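By way of illustration only, the relationship between a QP, its send and receive queues, a WQE, and its data descriptors can be sketched as follows. This is a minimal model, not the layout defined by any of the above-identified specifications; the class names, the field names, the buffer addresses, and the "RECV" control value are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class DataDescriptor:
    """Points at message data to send, or at the place to put received data."""
    address: int  # hypothetical buffer address
    length: int   # buffer length in bytes


@dataclass
class WorkQueueElement:
    """A WQE: control information plus one or more data descriptors."""
    control: Dict[str, str]
    descriptors: List[DataDescriptor]

    def total_length(self) -> int:
        # Total number of bytes covered by all descriptors of this WQE.
        return sum(d.length for d in self.descriptors)


@dataclass
class QueuePair:
    """A QP: one send queue (SQ) and one receive queue (RQ) of WQEs."""
    qp_number: int
    send_queue: Deque[WorkQueueElement] = field(default_factory=deque)
    receive_queue: Deque[WorkQueueElement] = field(default_factory=deque)

    def post_send(self, wqe: WorkQueueElement) -> None:
        self.send_queue.append(wqe)

    def post_receive(self, wqe: WorkQueueElement) -> None:
        self.receive_queue.append(wqe)


# The consumer posts a receive WQE whose two descriptors name the
# locations at which an incoming message is to be placed.
qp = QueuePair(qp_number=713)
qp.post_receive(
    WorkQueueElement(
        control={"opcode": "RECV"},
        descriptors=[DataDescriptor(0x1000, 256), DataDescriptor(0x2000, 128)],
    )
)
print(qp.receive_queue[0].total_length())
```

In this sketch the consumer and the adapter would share the two queues; the adapter would consume the WQE at the head of the RQ when a message arrives.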
Low latency message passing is a critical function in high performance computing applications. The data exchanges between system memory and an InfiniBand® adapter that are typically required to receive a message consume sizeable amounts of time.
Some applications that use RQs need to reduce the latency incurred during data transfer operations. There is therefore a need for a mechanism that enhances the standard RQ operations so that the lower latencies required by these applications can be achieved.
Multicasting refers to sending a message or messages from a single source to many destinations. With reference to FIG. 7, there are illustrated a number of InfiniBand® nodes 701, 702, 703, coupled to switch/router 710, wherein each destination node or group of nodes (recipients) is identified by a unique Multicast Global ID (GID) in the header of a multicast packet. Switches forward packets to one or more output ports based on the LID (Local ID), and routers forward packets based on the GID (Global ID). Each node whose ports (P) are part of a multicast group identifies itself via a Multicast GID. Network management functions keep track of the nodes, and their ports, that will receive targeted multicast messages. This information is distributed to IB network routers and switches, such as 710, for storage in routing tables. Each switch is thereby configured with routing information for the multicast traffic, which specifies all of the ports 711 to which the packet needs to be forwarded.
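By way of illustration only, the distribution of multicast group membership into a switch routing table can be sketched as follows. The GroupManager class is a hypothetical stand-in for the network management functions, and the GID value, the LID value, and the assignment of nodes 702 and 703 to switch ports 2 and 3 are assumptions made for the example; none of them is taken from FIG. 7 or from the IB specification.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class Switch:
    """Holds a multicast routing table: multicast LID -> output ports."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.multicast_table: Dict[int, Set[int]] = defaultdict(set)


class GroupManager:
    """Hypothetical stand-in for the network management functions."""

    def __init__(self) -> None:
        # Multicast GID -> multicast LID assigned to the group.
        self.group_lid: Dict[int, int] = {}
        # Multicast GID -> set of (node id, switch port) members.
        self.members: Dict[int, Set[Tuple[int, int]]] = defaultdict(set)

    def create_group(self, mgid: int, mlid: int) -> None:
        self.group_lid[mgid] = mlid

    def join(self, mgid: int, node: int, switch_port: int) -> None:
        if mgid not in self.group_lid:
            raise KeyError("unknown multicast group")
        self.members[mgid].add((node, switch_port))

    def program(self, switch: Switch) -> None:
        """Distribute the current membership into the switch routing table."""
        for mgid, mlid in self.group_lid.items():
            switch.multicast_table[mlid] = {
                port for _, port in self.members[mgid]
            }


MGID = 0xFF0E0000000000000000000000000001  # illustrative value only
MLID = 0xC001                              # illustrative value only

manager = GroupManager()
manager.create_group(MGID, MLID)
manager.join(MGID, node=702, switch_port=2)  # assumed port assignment
manager.join(MGID, node=703, switch_port=3)  # assumed port assignment

switch = Switch("710")
manager.program(switch)
print(sorted(switch.multicast_table[MLID]))  # ports that receive copies
```

Keeping the membership in the manager and pushing only the resulting port sets to each switch mirrors the division of labor described above: the switch needs only the LID-to-ports mapping in order to forward.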
The sender, e.g., 712, uses a multicast LID and GID in all packets it sends to a targeted multicast group. In the example illustrated in FIG. 7, the sender is a processor 712 in a host system or node 701, which owns and manages its own QP 713. The illustration of FIG. 7 is not intended to limit the number of processors or host channel adapters (HCAs) that can be implemented in the present invention. For example, the host system can preferably include thirty-two processors, with any number of such processors sharing one or more host channel adapters. When a switch 710 receives such a multicast packet, with a multicast LID in the packet's destination LID (DLID) field, it replicates the packet and sends a copy to each of the designated ports 711. A router uses the destination GID (DGID) to determine the ports to which the packet is forwarded. The GID is used to identify the multicast group and the QPs that are associated with it (i.e., 704, 705, 706). IB multicast spreads the load of replicating packets across switches, routers, and HCAs in the network fabric. As the network scales, so does the replication.
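By way of illustration only, the two replication stages described above can be sketched as follows: a switch replicates a packet to the ports listed for the packet's DLID, and a receiving adapter then hands the packet to each QP associated with the packet's DGID. The function names, the table layouts, and the GID, LID, and port values are hypothetical. Skipping the port on which the packet arrived is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Packet:
    dlid: int       # destination LID; a multicast LID for multicast traffic
    dgid: int       # destination GID; identifies the multicast group
    payload: bytes


def replicate(
    table: Dict[int, Set[int]], packet: Packet, ingress_port: int
) -> List[Tuple[int, Packet]]:
    """Switch stage: one copy per designated port for the packet's DLID."""
    ports = table.get(packet.dlid, set())
    # Assumption of this sketch: no copy is sent back out the arrival port.
    return [(port, packet) for port in sorted(ports) if port != ingress_port]


def deliver(attached: Dict[int, List[int]], packet: Packet) -> List[int]:
    """Adapter stage: QP numbers associated with the packet's DGID."""
    return list(attached.get(packet.dgid, []))


MGID = 0xFF0E0000000000000000000000000001  # illustrative value only
MLID = 0xC001                              # illustrative value only

switch_table = {MLID: {1, 2, 3}}           # assumed port numbering
packet = Packet(dlid=MLID, dgid=MGID, payload=b"hello")

copies = replicate(switch_table, packet, ingress_port=1)
print([port for port, _ in copies])

# One receiving adapter with a single QP attached to the group.
print(deliver({MGID: [705]}, packet))
```

Because each stage consults only its own table, adding a receiver changes the table of the switch or adapter nearest to it, which is one way to picture how the replication load is spread across the fabric.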