1. Field
This invention relates to Infiniband architecture subnets, and more specifically to accessing service agents on non-subnet manager hosts in an Infiniband subnet.
2. Background
The Infiniband architecture defines a system area network (SAN) for connecting multiple independent processor platforms (i.e., host processor nodes), I/O platforms, and I/O devices in a cluster. The SAN is a switched communications fabric that allows many devices to communicate concurrently, and allows for higher performance and better reliability, availability and serviceability (RAS) characteristics. A cluster consists of one or more subnets interconnected by routers.
FIG. 1 shows an example Infiniband cluster of a single subnet. This subnet consists of four hosts, 10, 12, 14 and 16. These hosts are interconnected via switches 20, 22 and 24. The hosts may also be connected to I/O enclosures 26 and 28 via switches 20, 22 and 24. Hosts and I/O devices are connected to the switches via one or more channel adapters 18.
A subnet is a collection of systems, I/O enclosures, and switches that are managed by a single management entity called a subnet manager. An Infiniband compliant subnet requires at least one subnet manager. The subnet manager may reside at a host, switch, or I/O enclosure. The subnet manager discovers fabric topology, assigns unique addresses to all channel adapter ports that are connected to the fabric, programs switch forwarding tables, and prepares all fabric connected agents so that they can communicate with other fabric agents. Apart from basic initialization services, the subnet requires other services to be present for it to be functional, e.g., a path service that provides information about how to reach fabric attached agents; a device management service that enumerates I/O controllers; a device configuration service that assigns I/O controllers to hosts; and a baseboard management service that allows management of devices beyond a channel adapter. In addition, particular implementations may need vendor specific services implemented in the subnet.
Services are implemented by logically independent entities called service agents (SA). Service agents are invoked as needed by clients running on subnet hosts. To request a service from a service agent, a client needs to find out two things: the subnet address to which it should direct the service request, and the queue pair (QP) at that address to which the request should be sent. The Infiniband architecture does not define subnet addresses that provide specific services. The only address known to all clients is the address at which the subnet manager resides. When the subnet manager initializes a channel adapter, it registers the subnet manager address with the channel adapter. In order to distribute the load, it is desirable that different service agents be allowed to run on different hosts using different QPs.
The fundamental problem with installing service agents on a host other than the subnet manager host is that the only address universally known to clients on the subnet is the address of the subnet manager. Therefore, all service requests need to be directed to a QP on the subnet manager. If a service agent is installed on a host other than the subnet manager host, there is no way for the service agent to notify potential clients of its presence. The Infiniband architecture specification has no defined mechanism by which a service agent running on any arbitrary host can register with the General Services Agent (GSA) at the universally known subnet manager address. In the absence of this mechanism, there are two possibilities for deploying service agents in a subnet.
First, install all service agents on the subnet manager host. Since this address is universally known, all clients can issue requests to it. However, this is not efficient since the subnet manager may become a bottleneck if a large number of clients are trying to access the services of a large number of service agents, all of which are implemented on the same host.
Second, install a real service agent on any arbitrary host and install a separate stub service agent on the subnet manager host. The stub service agent on the subnet manager host discovers the real service agent using a proprietary mechanism. When a client issues a service request to the QP on the subnet manager, the stub service agent redirects the client to the real service agent on the other host. This is also inefficient since every service needs both a stub service agent and a real service agent. In addition, each stub service agent may have to implement proprietary mechanisms to locate and communicate with the real service agent.
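By way of illustration only, the stub-and-redirect arrangement described above may be modeled as a toy simulation. The following Python sketch is not drawn from the Infiniband architecture specification and does not use any Infiniband programming interface; the names `Fabric`, `Redirect`, `Reply`, `make_stub_agent`, `make_real_agent`, and `client_request`, and all address and QP values, are hypothetical. The stub is assumed to have already located the real service agent by some unspecified proprietary means.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical values for illustration; not taken from any specification.
SM_ADDRESS = 1                 # the one subnet address every client knows
Endpoint = Tuple[int, int]     # (subnet address, queue pair number)


@dataclass
class Redirect:
    target: Endpoint           # where the client should re-issue its request


@dataclass
class Reply:
    payload: str


class Fabric:
    """Toy router: delivers a request to whatever handler listens at an endpoint."""

    def __init__(self) -> None:
        self.handlers: Dict[Endpoint, Callable[[str], object]] = {}
        self.messages_sent = 0

    def listen(self, endpoint: Endpoint, handler: Callable[[str], object]) -> None:
        self.handlers[endpoint] = handler

    def send(self, endpoint: Endpoint, request: str) -> object:
        self.messages_sent += 1
        return self.handlers[endpoint](request)


def make_real_agent(name: str) -> Callable[[str], object]:
    # The real service agent does the actual work on an arbitrary host.
    return lambda request: Reply(f"{name} handled {request}")


def make_stub_agent(real_endpoint: Endpoint) -> Callable[[str], object]:
    # The stub on the subnet manager host only tells clients where to go.
    return lambda request: Redirect(real_endpoint)


def client_request(fabric: Fabric, service_qp: int, request: str) -> str:
    # The client can only start at the subnet manager address.
    response = fabric.send((SM_ADDRESS, service_qp), request)
    if isinstance(response, Redirect):
        # Second message: re-issue the request to the real service agent.
        response = fabric.send(response.target, request)
    return response.payload


fabric = Fabric()
real_endpoint = (7, 42)                       # hypothetical host address and QP
fabric.listen(real_endpoint, make_real_agent("path-service"))
fabric.listen((SM_ADDRESS, 5), make_stub_agent(real_endpoint))
result = client_request(fabric, 5, "query")
print(result, fabric.messages_sent)
```

In this model each service requires two registered handlers (a stub and a real agent), and each redirected request costs two messages, the first of which always lands on the subnet manager host.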