Existing networking and interconnect technologies have failed to keep pace with the development of computer systems, resulting in increased burdens being imposed upon data servers, application processing and enterprise computing. This problem has been exacerbated by the popular success of the Internet. A number of computing technologies implemented to meet computing demands (e.g., clustering, fail-safe and 24×7 availability) require increased capacity to move data between processing nodes (e.g., servers), as well as within a processing node between, for example, a Central Processing Unit (CPU) and Input/Output (I/O) devices.
With a view to meeting the above described challenges, a new interconnect technology, called InfiniBand™, has been proposed for interconnecting processing nodes and I/O nodes to form a System Area Network (SAN). This architecture has been designed to be independent of a host Operating System (OS) and processor platform. The InfiniBand™ Architecture (IBA) is centered around a point-to-point switch fabric whereby end node devices (e.g., inexpensive I/O devices such as a single chip SCSI or Ethernet adapter, or a complex computer system) may be interconnected utilizing a cascade of switch devices. The InfiniBand™ Architecture is defined in the InfiniBand™ Architecture Specification Volume 1, Release 1.0, released Oct. 24, 2000 by the InfiniBand™ Trade Association. The IBA supports applications ranging from backplane interconnects of a single host to complex system area networks, as illustrated in FIG. 1A (prior art). In a single host environment, each IBA switch fabric may serve as a private I/O interconnect for the host, providing connectivity between a CPU and a number of I/O modules. When deployed to support a complex system area network, multiple IBA switch fabrics may be utilized to interconnect numerous hosts and various I/O units.
Within a switch fabric supporting a System Area Network, such as that shown in FIG. 1A, there may be a number of devices having multiple input and output ports through which data (e.g., packets) is directed from a source to a destination or target. Such devices include, for example, switches, routers, repeaters and adapters (exemplary interconnect devices). Where data is processed through a device, it will be appreciated that multiple data transmission requests may compete for resources of the device. For example, where a switching device has multiple input ports and output ports coupled by a crossbar, packets received at multiple input ports of the switching device, and requiring direction to specific output ports of the switching device, compete for at least input, output and crossbar resources.
In order to accommodate multiple demands on device resources, an arbitration scheme is typically employed to arbitrate between competing requests for device resources. Such an arbitration scheme may typically be either (1) a distributed arbitration scheme, whereby the arbitration process is distributed among multiple nodes, associated with respective resources, throughout the device or (2) a centralized arbitration scheme, whereby arbitration requests for all resources are handled at a central arbiter. An arbitration scheme may further employ one of a number of arbitration policies, including a round robin policy, a first-come-first-serve policy, a shortest message first policy or a priority based policy, to name but a few.
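By way of illustration only, a round robin policy applied by a central arbiter may be sketched as follows. The class name `RoundRobinArbiter`, its `grant` method, and the choice of four input ports are hypothetical and are not taken from the InfiniBand™ Architecture Specification; the sketch assumes a single contested resource and a known set of requesting input ports in each arbitration cycle.

```python
from typing import Optional, Set


class RoundRobinArbiter:
    """Hypothetical central arbiter for one contested resource."""

    def __init__(self, num_inputs: int) -> None:
        self.num_inputs = num_inputs
        # Index of the input port that received the most recent grant.
        # Starting at the last port makes port 0 first in line.
        self.last_granted = num_inputs - 1

    def grant(self, requesting: Set[int]) -> Optional[int]:
        """Grant the first requesting port found after the last granted one."""
        for offset in range(1, self.num_inputs + 1):
            candidate = (self.last_granted + offset) % self.num_inputs
            if candidate in requesting:
                self.last_granted = candidate
                return candidate
        return None  # no port is requesting in this cycle


# Ports 0, 2 and 3 keep requesting; port 1 stays idle.
arbiter = RoundRobinArbiter(num_inputs=4)
grants = [arbiter.grant({0, 2, 3}) for _ in range(4)]
```

In this sketch the grant rotates among the requesting ports and skips the idle port, so that no requesting port is starved by another.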
The physical properties of the IBA interconnect technology have been designed to support both module-to-module (board) interconnects (e.g., computer systems that support I/O module add-in slots) and chassis-to-chassis interconnects, so as to interconnect computer systems, external storage systems, and external LAN/WAN access devices. For example, an IBA switch may be employed as interconnect technology within the chassis of a computer system to facilitate communications between devices that constitute the computer system. Similarly, an IBA switched fabric may be employed within a switch, or router, to facilitate network communications between network systems (e.g., processor nodes, storage subsystems, etc.). To this end, FIG. 1A illustrates an exemplary System Area Network (SAN), as provided in the InfiniBand™ Architecture Specification, showing the interconnection of processor nodes and I/O nodes utilizing the IBA switch fabric.
In a network communication scheme (e.g., an IBA switch fabric) there are typically multiple targets or destinations that request information, data, or instructions, which are transferred in some format (e.g., packets) from sources to these targets or destinations. A destination could be an ordinary client machine that is an I/O controller connected to the communication network. The destination may have several addresses, for example, local identifiers (LIDs), that are dedicated to receiving packets within the communication network. In this scheme, a source may be a server, also connected to the communication network.
It is well understood that there are many destinations requesting information from servers, and many servers responding to different destinations, at any one time in the communication network. The movement of information from one place to another almost always occurs simultaneously within the switch fabric. The requests for information occur in any direction, from any destination to any server and vice versa. For instance, when the server receives a request for particular information from a destination, it presents a request to an arbiter in the communication network, practicing the arbitration scheme mentioned above, to request access to certain ports so that it can route the information to the requesting destination. The requesting destination may have more than one LID through which packets can be directed (as can be seen in FIG. 1B (prior art)). For instance, a destination 2 in the communication network has been assigned four (4) LIDs, LID8, LID9, LID10, and LID11 (i.e., four paths, P8, P9, P10, and P11, through which data can be routed to the destination 2). Similarly, a destination 1 has been assigned four LIDs, LID1, LID2, LID3, and LID4 (i.e., four paths, P1, P2, P3, and P4, through which data can be routed to the destination 1).
Returning to FIG. 1A, as can be imagined, traffic and congestion in the ports are inevitable, especially as the Internet grows more crowded every day. For instance, multiple servers and destinations may attempt to use the same ports to deliver information to one or more destinations. Hot spots, or congested ports, thus become even more common as the Internet expands. Alternative routing of the packets has been practiced in the field as one way to alleviate the congested port problem. Conventional alternative routing may be thought of as routing the data through different ports, namely the ports that are available. The IBA itself provides one way to do so.
For example, a destination can have two LIDs assigned to it as shown in FIG. 1A. Each LID is recorded in a routing table in a switch in the IBA's switch fabric, which enables the determination of the port through which the data can be routed for these LIDs. Alternative routing, where present, is purely a concept between the source and the destination. In this example, the network is configured with multiple routing paths for any particular destination. There is thus more than one path to the destination (e.g., paths P1, P2, P3 and P4 for the destination 1). When a link in the switch fabric fails or becomes unusable (e.g., P3 is congested or disabled), the server can route the packet through one of the other alternative paths (e.g., paths P1, P2, or P4) to get the data to the destination 1.
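The source-side selection of an alternative path described above may be sketched, purely for illustration, as follows. The names `DESTINATION_LIDS`, `ROUTING_TABLE`, `choose_lid` and `output_port`, as well as the port numbers, are hypothetical; the sketch assumes that LID1 through LID4 correspond to paths P1 through P4 and that the source knows which paths are currently unusable.

```python
from typing import Dict, List, Optional, Set

# Hypothetical assignment mirroring the example: destination 1 owns LIDs 1-4,
# each LID standing for one of the paths P1-P4.
DESTINATION_LIDS: Dict[str, List[int]] = {"destination 1": [1, 2, 3, 4]}

# Hypothetical routing table of one switch: LID -> output port number.
ROUTING_TABLE: Dict[int, int] = {1: 1, 2: 2, 3: 3, 4: 4}


def choose_lid(destination: str, preferred: int, failed: Set[int]) -> Optional[int]:
    """Source-side choice: use the preferred LID, else any other usable LID."""
    candidates = DESTINATION_LIDS.get(destination, [])
    ordered = [preferred] + [lid for lid in candidates if lid != preferred]
    for lid in ordered:
        if lid in candidates and lid not in failed:
            return lid
    return None  # every path to the destination is unusable


def output_port(lid: int) -> Optional[int]:
    """Switch-side lookup: a packet is forwarded only if its LID is in the table."""
    return ROUTING_TABLE.get(lid)


# P3 (LID3) is congested or disabled, so the source falls back to another LID.
fallback_lid = choose_lid("destination 1", preferred=3, failed={3})
fallback_port = output_port(fallback_lid)
```

The switch itself plays no part in the choice; it merely looks up whichever LID the source placed in the packet.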
In another example, a Sub-Network manager or the manager for the switch fabric can reload the routing table in the affected area of the network. For instance, when the switch 4 is disabled for some reason, the Sub-Network manager can reload the routing table F-5 in the switch 5 to include LID 13. The packet can then be routed to the destination 1 via the switch 5.
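The reloading of a routing table by the Sub-Network manager may likewise be sketched, for illustration only, as follows. The `Switch` class, the `reload_routing_table` function, the pre-existing LID 14 entry and all port numbers are hypothetical stand-ins; the sketch assumes a routing table is a simple mapping from LID to output port.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Switch:
    """Hypothetical model of a switch holding a LID -> output port table."""
    name: str
    routing_table: Dict[int, int] = field(default_factory=dict)
    enabled: bool = True

    def forward(self, lid: int) -> Optional[int]:
        """Return the output port for a LID, or None if it cannot be routed."""
        if not self.enabled:
            return None
        return self.routing_table.get(lid)


def reload_routing_table(switch: Switch, new_entries: Dict[int, int]) -> None:
    """Stand-in for the Sub-Network manager rewriting a switch's table."""
    switch.routing_table = {**switch.routing_table, **new_entries}


switch_4 = Switch("switch 4", {13: 2})  # originally routes LID 13
switch_5 = Switch("switch 5", {14: 1})  # initially has no entry for LID 13

switch_4.enabled = False                 # the switch 4 is disabled
before = switch_5.forward(13)            # LID 13 not yet routable via switch 5
reload_routing_table(switch_5, {13: 3})  # manager adds LID 13 to table F-5
after = switch_5.forward(13)             # LID 13 now routable via switch 5
```

Until the reload occurs, a packet carrying LID 13 cannot be forwarded by either switch in this sketch.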
However, as seen, in order for packets to be transferred at all, the LIDs in the packets must match at least one of the LIDs in the routing table of a switch. The alternative routing of the current art is simply a mechanism by which the routing path for data transfers in the communication network is chosen from among the paths that are available and less congested. However, only those switches that have the LIDs for the destination can participate in this alternative routing mechanism. When all of those switches are unavailable, there will be delay in the routing. If the Sub-Network manager does not reestablish a new passage to the destination node by reprogramming some of the routing tables in the switch fabric, there could even be permanent blockage of the access to the destination node in every possible routing path.