1. Field of the Invention
The present invention is directed toward a hypercube topology, and more specifically, to a hierarchical fat hypercube topology for connection of processing nodes and input/output (I/O) nodes in a multi-processor environment. The topology according to the present invention provides expandability, scalable bisection bandwidth, redundant paths between node pairs, and a low network diameter.
2. Related Art
Recent advances in VLSI technology have led to the evolution of a processing environment in which large numbers of processors can be implemented in massively parallel computing systems. A cardinal element that dictates the performance of such a massively parallel computing system is the infrastructure that supports communications among the various processors. In fact, given contemporary design criteria such as high computational bandwidth and increased parallelism, the level of communications required among processors is increasing dramatically.
Several different topologies have been proposed to interconnect the various processors in such environments. Among these topologies are rings, stars, meshes and hypercubes. Regardless of the topology chosen, design goals include a high communications bandwidth, a low inter-node distance, a high network bisection bandwidth, and a high degree of fault tolerance.
Inter-node distance is defined as the number of communications links required to connect one node to another node in the network. Topologies are typically specified in terms of the maximum inter-node distance: the shortest distance between two nodes that are farthest apart on the network. The maximum inter-node distance is also referred to as network diameter.
Bisection bandwidth is defined as the number of links that would be severed if the network were to be bisected by a plane at a place where the number of links between the two halves is a minimum. In other words, bisection bandwidth is the number of links connecting two halves of the network where the halves are chosen as the two halves connected by the fewest number of links. It is this worst-case bandwidth which can potentially limit system throughput and cause bottlenecks. Therefore, it is a goal of network topologies to maximize bisection bandwidth.
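The definition above can be checked directly on small networks. The following is a minimal sketch, assuming links of equal bandwidth and an even number of nodes; the function name bisection_bandwidth and the two example networks (an 8-node ring and an 8-node, three-dimensional hypercube) are illustrative choices and do not come from the cited art. The search is exhaustive and is practical only for small networks.

```python
from itertools import combinations

def bisection_bandwidth(num_nodes, links):
    """Minimum number of links crossing any split of the nodes into two equal halves.

    Brute force over every balanced split; intended for small networks only.
    """
    best = None
    for half in combinations(range(num_nodes), num_nodes // 2):
        half_set = set(half)
        # A link crosses the cut when exactly one endpoint lies in this half.
        crossing = sum(1 for a, b in links if (a in half_set) != (b in half_set))
        if best is None or crossing < best:
            best = crossing
    return best

# Illustrative 8-node ring: node i is linked to node (i + 1) mod 8.
ring_links = [(i, (i + 1) % 8) for i in range(8)]

# Illustrative 3-dimensional hypercube: nodes are linked when their
# addresses differ in exactly one bit; each link is listed once.
cube_links = [(a, a ^ (1 << bit)) for a in range(8) for bit in range(3)
              if a < (a ^ (1 << bit))]

print("ring of 8 nodes:", bisection_bandwidth(8, ring_links))
print("hypercube of 8 nodes:", bisection_bandwidth(8, cube_links))
```

For the same number of nodes, the ring is severed by cutting two links while the hypercube requires four, which is the k/2 figure discussed below for k=8.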
In this document, bisection bandwidth is often described in terms of the number of nodes in the network. This enables comparison of the relative bisection bandwidths of networks of various sizes. For a network having k nodes, bisection bandwidth is expressed as x*k, where x is the ratio of the number of links in the bisection to the number of nodes. For example, as described below, a conventional hypercube has a bisection bandwidth of (1/2)k. This bisection bandwidth remains constant (relative to the number of nodes) at k/2 regardless of the dimension of the conventional hypercube.
Note that it may be more appropriate to define bisection bandwidth as the number of communications links times the bandwidth of each link. However, assuming a constant bandwidth/link regardless of the topology, relative bisection bandwidth comparisons among the topologies can simply be addressed in terms of the number of links. Therefore, as a matter of convention, bisection bandwidth is defined in this document in terms of the number of communications links.
One multi-processor architecture that meets these design criteria and is well suited to applications requiring a large number of processors is the hypercube. A conventional hypercube topology is now described. In a hypercube network, a plurality of microprocessors are arranged in an n-dimensional cube where the number of nodes k in the network is equal to 2.sup.n. In this network, each node is connected to each other node via a plurality of communications paths. The network diameter, the longest communications path from any one node on the network to any other node, is n links.
FIGS. 1A, 1B, 1C, and 1D illustrate 1, 2, 3, and 4 dimensional hypercubes, respectively. Referring now to FIGS. 1A-1D, the hypercube comprises a plurality of nodes 102 connected to one another via edges 104 (i.e., links). As stated above, each n-dimensional hypercube has a plurality of nodes, where the number of nodes k is equal to 2.sup.n. For example, the 4-dimensional hypercube, illustrated in FIG. 1D as a tesseract, has 2.sup.4, or 16, nodes 102. Each node is connected to n=4 other nodes 102 (i.e., each node 102 is connected to n edges 104), and the longest path between any two nodes 102 is n=4 links (edges 104).
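The node count, node degree, and diameter stated above can be reproduced with a short sketch. It assumes the common convention, not stated in the text, that nodes carry n-bit binary addresses and that two nodes share an edge when their addresses differ in exactly one bit. The function names are illustrative.

```python
from collections import deque

def hypercube_neighbors(node, n):
    """Neighbors of `node` in an n-dimensional hypercube: flip one address bit."""
    return [node ^ (1 << bit) for bit in range(n)]

def hypercube_diameter(n):
    """Longest shortest path, measured by breadth-first search from node 0.

    Every node of a hypercube sees the same structure, so a single
    starting node is sufficient.
    """
    distance = {0: 0}
    queue = deque([0])
    while queue:
        current = queue.popleft()
        for neighbor in hypercube_neighbors(current, n):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return max(distance.values())

# Dimensions 1 through 4 correspond to FIGS. 1A-1D.
for n in range(1, 5):
    k = 2 ** n
    print(f"n={n}: nodes={k}, edges per node={len(hypercube_neighbors(0, n))}, "
          f"diameter={hypercube_diameter(n)}")
```

For n=4 the sketch reports 16 nodes, 4 edges per node, and a diameter of 4 links, matching the tesseract described above.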
One feature of the conventional hypercube is that the maximum distance between any two nodes (i.e., the diameter) in a hypercube having k nodes is given by log.sub.2 (k). Thus, even as the number of nodes increases, the maximum distance between any two nodes only increases as log.sub.2 (k). As a result, the number of nodes, and hence the number of processors or I/O ports, can be doubled while requiring only a unit increase in the network diameter. However, for each increase in the dimension of the topology (i.e., to expand from an n-dimensional to an (n+1)-dimensional hypercube), an additional edge 104 must be connected to each node 102. Consequently, to increase the dimension of a hypercube 100, each node must have an additional port to support the additional connection. As a result, as the dimension of the hypercube increases, the number of ports in each node increases as well.
Another advantage of the conventional hypercube is a high bisection bandwidth. As stated above, bisection bandwidth is defined as the number of edges that connect two halves of the network when the network is bisected at the weakest point. With a conventional hypercube, the bisection bandwidth always remains constant at k/2, for a hypercube having k nodes.
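The k/2 figure can be illustrated by counting the links that cross one particular cut. The sketch below assumes binary node addresses with one link per differing bit, and splits the network on the most significant address bit. It does not prove that this cut is the weakest one; that is taken from the discussion above. The function name is illustrative.

```python
def links_across_top_bit(n):
    """Count links joining the two halves of an n-dimensional hypercube
    when the nodes are split on the most significant address bit."""
    top = 1 << (n - 1)
    crossing = 0
    for node in range(2 ** n):
        if node & top:
            continue  # count each crossing link once, from its lower-half end
        for bit in range(n):
            neighbor = node ^ (1 << bit)
            if neighbor & top:
                crossing += 1
    return crossing

for n in range(1, 7):
    k = 2 ** n
    print(f"k={k}: links across the cut={links_across_top_bit(n)}, k/2={k // 2}")
```

Each node in one half has exactly one neighbor in the other half, so the count equals k/2 at every dimension.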
Several extensions or variations of the conventional hypercube have been proposed and/or implemented in multi-processor systems. One such variation, presented in Scalable optical hypercube-based interconnection network for massively parallel computing, by Louri, et al., Applied Optics, Vol. 33, No. 32, 10 Nov. 1994, pp 7588-7598, is the multi-mesh hypercube. One disadvantage of the multi-mesh hypercube over the conventional hypercube is a large network diameter. For a multi-mesh hypercube made up of an l.times.m array of n-dimensional hypercubes, the maximum distance is ((l-1)+(m-1)+n). A second disadvantage of the multi-mesh hypercube topology is a low bisection bandwidth relative to the number of nodes in the network. For a symmetrical mesh (where l=m), the bisection bandwidth is k/4, where k is the number of nodes.
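The two disadvantages can be made concrete with a worked comparison. The sketch below simply evaluates the expressions given above, namely a maximum distance of (l-1)+(m-1)+n and a bisection bandwidth of k/4 for a symmetrical mesh, and assumes the node count is k = l*m*2.sup.n (one n-dimensional hypercube per mesh position). The function names and the example sizes are illustrative and are not taken from the cited paper.

```python
def multi_mesh_metrics(l, m, n):
    """Evaluate the multi-mesh hypercube figures quoted in the text."""
    k = l * m * 2 ** n  # assumed: l*m hypercubes of 2**n nodes each
    metrics = {"nodes": k, "diameter": (l - 1) + (m - 1) + n}
    if l == m:
        metrics["bisection"] = k // 4  # quoted for the symmetrical mesh only
    return metrics

def conventional_metrics(k):
    """Conventional hypercube with k nodes, k a power of two."""
    n = k.bit_length() - 1  # exact log2 for powers of two
    return {"nodes": k, "diameter": n, "bisection": k // 2}

# Illustrative case: a 4 x 4 array of 2-dimensional hypercubes (64 nodes).
print("multi-mesh  :", multi_mesh_metrics(4, 4, 2))
print("conventional:", conventional_metrics(64))
```

Under these assumptions, a 64-node multi-mesh hypercube has a maximum distance of 8 links and a bisection bandwidth of 16 links, against 6 links and 32 links for a conventional hypercube of the same size.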
A second variation on the conventional hypercube is presented in The Hierarchical Hypercube: A New Interconnection Topology for Massively Parallel Systems, by Malluhi, et al., IEEE Transactions on Parallel and Distributed Systems, vol. 5, No. 1, January 1994, pp. 17-30. According to Malluhi, each of the nodes of an n-dimensional hypercube is itself an n-dimensional hypercube. This topology has two disadvantages over the conventional hypercube: a lower bisection bandwidth and a greater maximum distance for the same number of nodes.
Consider for example a three-dimensional hierarchical hypercube according to Malluhi, where each node is itself a three-dimensional hypercube (i.e., n=3, n'=3). In such a network, there are 64 nodes, and the bisection bandwidth is 4 edges, or k/16. Contrast this to the conventional hypercube having 64 nodes with a bisection bandwidth of 32 edges, or k/2. Also, Malluhi's hierarchical hypercube has a maximum diameter of (n+2n'), which, for the n=3 and n'=3 network of the current example yields a maximum diameter of nine edges. For a conventional 64-node hypercube, the maximum diameter is log.sub.2 (64)=six edges.
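The arithmetic of this example can be restated as a short sketch. The diameter expression (n+2n') and the bisection figure of 4 edges are taken from the paragraph above; the bisection value is passed in as a quoted number rather than computed, because no general expression for it is given here. The node count 2.sup.(n+n') is an assumption that is consistent with the 64 nodes stated for n=3, n'=3. The function names are illustrative.

```python
def hierarchical_metrics(n, n_prime, quoted_bisection):
    """Figures for the hierarchical hypercube as quoted in the text."""
    return {
        "nodes": 2 ** (n + n_prime),      # assumed: 2**n clusters of 2**n_prime nodes
        "diameter": n + 2 * n_prime,      # expression quoted in the text
        "bisection": quoted_bisection,    # value quoted in the text, not derived
    }

def conventional_metrics(k):
    """Conventional hypercube with k nodes, k a power of two."""
    n = k.bit_length() - 1  # exact log2 for powers of two
    return {"nodes": k, "diameter": n, "bisection": k // 2}

hierarchical = hierarchical_metrics(3, 3, quoted_bisection=4)
conventional = conventional_metrics(hierarchical["nodes"])

print("hierarchical:", hierarchical)
print("conventional:", conventional)
print("bisection as a fraction of k:",
      hierarchical["bisection"] / hierarchical["nodes"],
      "versus", conventional["bisection"] / conventional["nodes"])
```

The sketch reproduces the figures above: nine edges against six for diameter, and k/16 against k/2 for bisection bandwidth.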
A third variation of the conventional hypercube is presented in Extended Hypercube: A Hierarchical Interconnection Network of Hypercubes, by Kumar et al., IEEE Transactions on Parallel and Distributed Systems, vol. 3, no. 1, 1 Jan. 1992, pp. 45-57. According to this extended hypercube topology, network commanders (NC's) are used to connect a plurality of n-dimensional hypercubes. There are 2.sup.n nodes at each n-dimensional hypercube at the first level, one n-dimensional cube of NC's at the second level, and one NC at the third level.
Thus, according to the extended hypercube topology, each node of an n-dimensional hypercube at one level of hierarchy is connected to a single communication processor at the next level of hierarchy. A plurality of n-dimensional hypercubes, each having an NC, are connected via the NC's. One disadvantage of this is that each NC provides a single point of failure for its respective hypercube. If the NC fails, the entire hypercube is severed from the network. Another disadvantage is that additional processing capabilities are apparently needed at the NC nodes. It does not appear that conventional routers can be used to implement the NC. Furthermore, each NC requires a large number of edge connections.
The conventional hypercube topology is a very powerful topology that meets many of the system design criteria. However, when used in large systems, the conventional hypercube has some practical limitations. One such limitation is the degree of fanout required for large numbers of processors. As the dimension of the hypercube increases, the fanout required for each node increases. As a result, each node becomes costly and requires larger amounts of silicon to implement.
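The growth in fanout can be tabulated with a brief sketch. It relies only on the properties described above: an n-dimensional hypercube has 2.sup.n nodes and n ports per node. Since each edge is shared by two nodes, the total number of edges is n*2.sup.n /2. The function name and the range of dimensions shown are illustrative.

```python
def fanout_table(max_dimension):
    """Rows of (dimension, nodes, ports per node, total edges)."""
    rows = []
    for n in range(1, max_dimension + 1):
        k = 2 ** n
        total_edges = n * k // 2  # n ports per node, each edge shared by two nodes
        rows.append((n, k, n, total_edges))
    return rows

print(f"{'dim':>3} {'nodes':>6} {'ports/node':>10} {'edges':>7}")
for dimension, nodes, ports, edges in fanout_table(10):
    print(f"{dimension:>3} {nodes:>6} {ports:>10} {edges:>7}")
```

Each doubling of the node count adds one port to every node in the network, which is the cost that limits the conventional hypercube in large systems.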
The variations on the basic hypercube topology, as noted above, each have their own drawbacks, depending on the size of the network. Some of these topologies suffer from a large network diameter, while others suffer from a low bisection bandwidth. What is needed is a topology that is well suited to applications requiring a large number of processors; is scalable; and provides a high bisection bandwidth, a wide communications bandwidth, and a low network diameter.