1. Technical Field
The present invention relates generally to communications between computer systems and, more particularly, to a method and apparatus for a single InfiniBand chip which can support end node partitioning by enabling virtualization of an InfiniBand Host Channel Adapter (HCA) and router components.
2. Description of Related Art
In a System Area Network (SAN), the hardware provides a message passing mechanism that can be used for Input/Output (I/O) devices and for interprocessor communications (IPC) between general computing nodes. Consumers access SAN message passing hardware by posting send/receive messages to send/receive work queues (WQ) on a SAN channel adapter (CA). The send/receive work queues are assigned to a consumer as a queue pair (QP). The messages can be sent over five different defined transport types: Reliable Connected (RC), Reliable Datagram (RD), Unreliable Connected (UC), Unreliable Datagram (UD), and Raw Datagram (RawD). In addition, there is a set of manufacturer definable operation codes that allow different companies to define custom packets that still have the same routing header layouts. Whether the manufacturer definable operations use the same queuing structure as the defined packet types is not defined. Consumers retrieve the results of the defined messages from a completion queue (CQ) through SAN send and receive work completions (WC). Regardless of the packet type, the source channel adapter takes care of segmenting outbound messages and sending them to the destination. The destination channel adapter takes care of reassembling inbound messages and placing them in the memory space designated by the destination's consumer. There are two channel adapter types: a host channel adapter (HCA) and a target channel adapter (TCA). The host channel adapter is used by general purpose computing nodes to access the SAN fabric. Consumers use SAN verbs to access host channel adapter functions. The software that interprets verbs and directly accesses the channel adapter is known as the channel interface (CI).
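The relationship among work queues, queue pairs, and the completion queue can be illustrated with a minimal sketch. This is a toy model only, not an actual verbs interface: the class names, the method names (post_send, post_recv, segment, reassemble), and the 4-byte MTU are hypothetical and chosen solely to make segmentation and reassembly visible.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List

MTU = 4  # hypothetical, deliberately tiny packet payload size in bytes


@dataclass
class QueuePair:
    """A send work queue and a receive work queue assigned to one consumer."""
    send_wq: deque = field(default_factory=deque)
    recv_wq: deque = field(default_factory=deque)


@dataclass
class ChannelAdapter:
    """Toy channel adapter with one queue pair and one completion queue."""
    qp: QueuePair = field(default_factory=QueuePair)
    cq: deque = field(default_factory=deque)  # holds work completions

    def post_send(self, message: bytes) -> None:
        # The consumer posts an outbound message to the send work queue.
        self.qp.send_wq.append(message)

    def post_recv(self, buffer: bytearray) -> None:
        # The consumer designates memory for one inbound message.
        self.qp.recv_wq.append(buffer)

    def segment(self) -> List[bytes]:
        # Source side: split the oldest posted message into MTU-sized packets.
        message = self.qp.send_wq.popleft()
        packets = [message[i:i + MTU] for i in range(0, len(message), MTU)]
        self.cq.append(("send", len(message)))  # send work completion
        return packets

    def reassemble(self, packets: List[bytes]) -> None:
        # Destination side: rebuild the message in the consumer's buffer.
        buffer = self.qp.recv_wq.popleft()
        for packet in packets:
            buffer.extend(packet)
        self.cq.append(("recv", len(buffer)))  # receive work completion


source, destination = ChannelAdapter(), ChannelAdapter()
inbound = bytearray()
destination.post_recv(inbound)
source.post_send(b"hello, SAN")
packets = source.segment()
destination.reassemble(packets)
```

In this sketch each consumer learns the outcome of its request only by reading its own completion queue, which mirrors the separation between posting work and retrieving work completions described above.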
The InfiniBand (IB) network is divided into separate autonomous management units called subnets, each containing multiple IB nodes. InfiniBand components are assigned a Global Identifier (GID) during initialization. The GID is used to uniquely identify the target component both within and across IB subnets. Communications among components that reside in different IB subnets are provided by including an additional header, called a Global Routing Header (GRH), in every IB packet. The GRH defines both the source and the destination addresses/nodes. These additional addresses allow routers that span subnets to determine the path that the packet is to take to reach its ultimate destination (i.e., the target GID). Unlike communications within a subnet, where a direct path to the target can be obtained (i.e., via the target's Local Identifier, or LID), cross-subnet communications typically require one or more hops through intermediate routers.
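The difference between within-subnet and cross-subnet delivery can be sketched as follows. This is an illustrative model under stated assumptions, not the InfiniBand packet format: the Gid, Grh, and Packet classes, the integer subnet and node fields, the router table layout, and the hop_limit parameter are all hypothetical, and the sketch assumes the GRH is omitted for traffic that stays within one subnet.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Gid:
    subnet: int  # hypothetical: the subnet the component resides in
    node: int    # hypothetical: the component's identity within that subnet


@dataclass(frozen=True)
class Grh:
    source: Gid
    destination: Gid


@dataclass(frozen=True)
class Packet:
    payload: bytes
    grh: Optional[Grh]  # assumption: None when traffic stays in one subnet


def build_packet(source: Gid, destination: Gid, payload: bytes) -> Packet:
    # Cross-subnet packets carry source and destination GIDs in a GRH.
    if source.subnet == destination.subnet:
        return Packet(payload, None)
    return Packet(payload, Grh(source, destination))


def route(packet: Packet, start_subnet: int,
          next_subnet: Dict[Tuple[int, int], int],
          hop_limit: int = 8) -> List[int]:
    """Return the subnets traversed, beginning with start_subnet.

    next_subnet maps (current subnet, destination subnet) to the subnet
    reached through the next intermediate router.
    """
    path = [start_subnet]
    if packet.grh is None:
        return path  # direct delivery inside the subnet, no router hop
    target = packet.grh.destination.subnet
    while path[-1] != target:
        if len(path) > hop_limit:
            raise RuntimeError("hop limit exceeded")
        path.append(next_subnet[(path[-1], target)])
    return path


a, b, c = Gid(1, 10), Gid(1, 11), Gid(3, 30)
routers = {(1, 3): 2, (2, 3): 3}
local = build_packet(a, b, b"within subnet")
remote = build_packet(a, c, b"across subnets")
local_path = route(local, a.subnet, routers)
remote_path = route(remote, a.subnet, routers)
```

Here the packet from a to b never consults the router table, while the packet from a to c is forwarded through subnet 2 before reaching subnet 3, which corresponds to the one or more intermediate hops noted above.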
InfiniBand does not define a mechanism that allows a single physical IB node (e.g., a host channel adapter) to transparently implement one or more entire IB subnets. Such a capability is nevertheless highly desirable, especially for environments where a large number of servers (e.g., Linux) are implemented within a single physical machine (e.g., IBM's z/VM).
Therefore, a mechanism is needed in environments containing a large number of servers that achieves cost reductions through economies of scale by sharing a single physical IB node across potentially many server images. This mechanism must not incur significant mainline processing overhead and must allow the resulting overall solution to be competitive in the marketplace.