This invention relates generally to digital networks for interconnecting multiple users, and more particularly the invention relates to a multi-fabric topology for interconnecting multi-port user nodes.
Fault tolerant computer systems typically run in a multiprocessor environment in which computers can operate in parallel with one or more levels of redundancy. Such a system is described in U.S. Pat. No. 5,751,932 for “Fail-Fast, Fail-Functional, Fault-Tolerant Multiprocessor System”, assigned to Tandem Computers Incorporated, now Compaq Computer Corporation. Central Processing Units (CPUs) in this system operate in pairs with each user node having an X CPU with an X port and a Y CPU with a Y port. The X ports of all X processors are interconnected by an X fabric comprising a first topology of multi-port switches, and the Y ports of all Y processors are interconnected by a Y fabric comprising a second topology of multi-port switches. The patent describes the use of “TNet” Links comprising two uni-directional 10-bit sub-link busses connecting the X port and the Y port to the multi-port switches.
Tandem has also introduced a ServerNet-II cluster computing system consisting of network interface cards (NICs), 12-port crossbar routers (switches), and interconnecting links.
The switches and a switch topology for cluster computing are described in “ServerNet-II: a reliable interconnect for scalable high performance cluster computing”, Heirich, Garcia, Knowles, and Horst, Compaq Computer Corporation, Tandem Division, Sep. 21, 1998. As there described, the ServerNet System Area Network (SAN) is a scalable interconnect technology designed as the primary interconnect for high availability information processing systems. These systems are characterized by round-the-clock availability in high profile locations where they support online transaction processing, telecommunications, internet service providers, and other applications. The ServerNet-II SAN achieves its high level of availability by incorporating fault tolerant mechanisms at every architectural level. Failures in routing nodes that could impact the interconnect fabric are detected and isolated through self-checking logic. Transfers affected by failures in network links or routing elements are retried or re-routed along an alternate path through the fabric.
Heretofore, the multiple fabrics (X, Y) interconnecting multiple processors have generally had the same topologies, with an equal number of switches interconnected in the same way and with users connected identically in each network. The performance of each fabric or interconnect network can be defined in terms of inter-node distances, number of switch components, and data transfer capacity or bisection bandwidth. The bisection width is equal to the number of links crossing the weakest bisection of the fabric, that is, the division of the fabric into two halves that cuts the fewest links. While the number of router hops does not significantly alter message latency in wormhole-routed networks, a smaller number of hops does lower link occupancy and average link contention. The present invention is directed to enhancing network performance by employing asymmetric fabric topologies in a multi-fabric environment.
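By way of illustration only, the bisection width described above can be computed for a small fabric by exhaustive search. The following Python sketch is not taken from the referenced systems; it assumes that a bisection divides the switches (rather than the user nodes) into two equal halves, and the function name `bisection_width` and the two four-switch example fabrics are hypothetical.

```python
from itertools import combinations

def bisection_width(switches, links):
    """Return the fewest links cut by any split of the switches into two halves.

    Assumption: a bisection partitions the switches, not the user nodes.
    Exhaustive search, so suitable only for small illustrative fabrics.
    """
    switches = sorted(switches)
    half = len(switches) // 2
    best = None
    for group in combinations(switches, half):
        side = set(group)
        # A link is cut when its two ends fall in different halves.
        cut = sum(1 for a, b in links if (a in side) != (b in side))
        if best is None or cut < best:
            best = cut
    return best

# Hypothetical example fabrics with four switches each.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]       # switches joined in a ring
full = ring + [(0, 2), (1, 3)]                # every switch joined to every other

print(bisection_width(range(4), ring))   # 2
print(bisection_width(range(4), full))   # 4
```

In this sketch the fully connected fabric has twice the bisection width of the ring for the same number of switches, at the cost of more links per switch.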
The invention is directed to a method of structuring a switch network having at least two groups of multi-port switches, and to the resulting network for interconnecting multi-port user nodes, so as to enhance network performance through, for example, increased bisection bandwidth, reduced inter-node distances, or a reduced number of switches.
A first switch group is provided having a first plurality of multi-port switches interconnected with a first plurality of user ports whereby the first plurality of user ports are interconnected through one or more of the first plurality of switches. A second switch group is then provided which has a second plurality of multi-port switches interconnected with a second plurality of user ports whereby the second plurality of user ports are interconnected through one or more of the second plurality of switches. The first switch group and the second switch group are asymmetrical with at least one path between two nodes being of different length in the two groups of switches. By careful introduction of asymmetry between the switch groups, network performance is enhanced.
In the environment with two-port (X, Y) user nodes, two switch groups are provided with the X ports of all user nodes being connected to the X switch group and the Y ports of all user nodes being connected to the Y switch group, each switch group comprising a plurality of multi-port crossbar switches. All of the plurality of X ports are interconnected through the X switch group and all of the plurality of Y ports are interconnected through the Y switch group. In structuring the asymmetry between switch groups, at least one distance-n (i.e., n-hop) path between nodes in the X switch group is not a distance-n path in the Y switch group. Alternatively, or in addition thereto, the X switch group can have a different number of switches than has the Y switch group. In the embodiments where the switch groups are not interconnected, each switch group constitutes an independent fabric.
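The asymmetry condition stated above can be checked mechanically for a candidate pair of fabrics. The following Python sketch is offered as one possible illustration under stated assumptions: each fabric is modeled as an undirected graph of links, distance is counted in links on the shortest path, every node pair is reachable in both fabrics, and the names `hop_distances`, `asymmetric_pairs`, and the four-node example wiring are hypothetical rather than drawn from any embodiment.

```python
from collections import deque
from itertools import combinations

def build(links):
    """Build an undirected adjacency map from a list of (a, b) links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def hop_distances(adjacency, source):
    """Breadth-first search: shortest link count from source to each vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist

def asymmetric_pairs(x_links, y_links, nodes):
    """List node pairs whose shortest-path length differs between fabrics.

    Assumption: every node pair is connected in both fabrics.
    """
    x_adj, y_adj = build(x_links), build(y_links)
    pairs = []
    for a, b in combinations(sorted(nodes), 2):
        dx = hop_distances(x_adj, a)[b]
        dy = hop_distances(y_adj, a)[b]
        if dx != dy:
            pairs.append((a, b, dx, dy))
    return pairs

# Hypothetical wiring: four two-port nodes, two switches per fabric.
nodes = ["n0", "n1", "n2", "n3"]
x_links = [("n0", "sx0"), ("n1", "sx0"), ("n2", "sx1"), ("n3", "sx1"), ("sx0", "sx1")]
y_links = [("n0", "sy0"), ("n2", "sy0"), ("n1", "sy1"), ("n3", "sy1"), ("sy0", "sy1")]

for a, b, dx, dy in asymmetric_pairs(x_links, y_links, nodes):
    print(a, b, "X:", dx, "Y:", dy)
```

In this example both fabrics use the same number of switches, yet nodes that share a switch in the X fabric are separated by an extra link in the Y fabric and vice versa, so the two fabrics are asymmetrical in the sense described. A non-empty result from `asymmetric_pairs` indicates that at least one path differs in length between the two switch groups.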