Connecting computers in a cluster is an important way of increasing the computational power available to solve large problems. An important part of a cluster is the underlying connectivity network. Ideally, the cluster would use an all-connect network, in which any pair of computers can talk to each other directly. That is, if N computers are labeled {0, 1, . . . , N−1}, the connectivity network can support any permutation. An all-connect network is prohibitively expensive for large N because it requires N*(N−1)/2 connections. Hence the quest for alternative network connections.
When there are k<N−1 connections per node in an N node cluster, it may take several hops to communicate data between nodes that are not directly connected. This invention describes a method, based on the concept of difference sets, for connecting nodes such that the distance between any two nodes is at most 2 edges, thereby creating optimum 2-hop computer networks.
Similarly, creating a parallel shared memory computer is an important way of increasing the computational power. As the number of processors accessing a large shared memory increases, the performance improvement is often limited by the bandwidth at which the memory can supply data to the multiple processors.
A method for creating a network where the maximum distance between any two nodes is 2 is now described. Consider a cluster of N=13 nodes, where the nodes are numbered {0, 1, . . . , 12}. Connect the nodes such that node i is connected to node (i+1) mod N. This creates a cycle of N nodes. From node 0, nodes 1 and 12 are 1 hop away and nodes 2 and 11 are 2 hops away. The remaining nodes are more than 2 hops away. Next, add another cycle with a stride of 4. Since N=13 is relatively prime to 4, a full cycle of N nodes is obtained. The stride 4 is selected because node 4 and its neighbors 3 and 5 were farther than 2 hops from node 0. This connection adds the maximum number of nodes that will now be within 2 hops of node 0. TABLE 1 below lists the two cycles, and TABLE 2 lists the nodes that can be reached from node 0 in 1 and 2 hops. The connectivity pattern for all nodes is identical; therefore the various properties that are observed for node 0 are true for all other nodes.
TABLE 1: Cluster connections with 2 cycles

CYCLE 0:   0   1   2   3   4   5   6   7   8   9  10  11  12
CYCLE 1:   0   4   8  12   3   7  11   2   6  10   1   5   9
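As an illustration only, the rows of TABLE 1 can be generated with a short Python sketch. The function name stride_cycle is hypothetical and is not part of the described method; the sketch assumes that nodes are visited in the order 0, s, 2s, . . . modulo N, and it rejects any stride that is not relatively prime to N because such a stride would not produce a full cycle.

```python
from math import gcd


def stride_cycle(n_nodes, stride):
    """List the nodes visited from node 0 when stepping by `stride` modulo n_nodes.

    Hypothetical helper for illustration; a full cycle over all nodes is
    obtained only when the stride is relatively prime to the node count.
    """
    if gcd(n_nodes, stride) != 1:
        raise ValueError("stride must be relatively prime to the node count")
    return [(step * stride) % n_nodes for step in range(n_nodes)]


# The two cycles of the 13 node example (strides 1 and 4).
cycle0 = stride_cycle(13, 1)
cycle1 = stride_cycle(13, 4)
print("CYCLE 0:", cycle0)
print("CYCLE 1:", cycle1)
```

The same helper can be called with any other stride that is relatively prime to the node count.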
TABLE 2: 1 and 2 Hop neighbors of Node 0

1-HOP NEIGHBORS OF NODE 0:    1       12       4       9
2-HOP NEIGHBORS OF NODE 0:    2/1     11/12    8/4     8/9
                              5/1     8/12     3/4     10/9
                              10/1    3/12     5/4     5/9
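The entries of TABLE 2 can be reproduced with the following Python sketch, offered as an illustration under stated assumptions rather than as part of the described method. The helper names neighbors and two_hop_table are hypothetical. Each node is assumed to be linked to the nodes one stride ahead and one stride behind on every cycle, with all arithmetic taken modulo N.

```python
def neighbors(node, strides, n_nodes):
    """Nodes one step away from `node` in either direction on each cycle."""
    result = set()
    for stride in strides:
        result.add((node + stride) % n_nodes)
        result.add((node - stride) % n_nodes)
    return result


def two_hop_table(source, strides, n_nodes):
    """Return the 1-hop neighbors and a map {2-hop neighbor: [intermediate nodes]}."""
    one_hop = neighbors(source, strides, n_nodes)
    via = {}
    for mid in sorted(one_hop):
        for target in sorted(neighbors(mid, strides, n_nodes)):
            # Skip the source itself and nodes already reachable in 1 hop.
            if target != source and target not in one_hop:
                via.setdefault(target, []).append(mid)
    return one_hop, via


one_hop, via = two_hop_table(0, [1, 4], 13)
print("1-hop neighbors:", sorted(one_hop))
for target in sorted(via):
    print("2-hop neighbor", target, "via", via[target])
```

Several 2-hop neighbors are reachable through more than one intermediate node, which is why some nodes appear in more than one column of TABLE 2.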
TABLE 2 above lists the four 1-hop and six 2-hop neighbors of node 0. The intermediate nodes of the 2-hop neighbors are also listed. Thus, to reach node 2 from node 0, a move of distance 2 on Cycle 0 is performed. The intermediate node in this case is 1. The 1-hop neighbors are reached by moving a distance of 1 on each of the cycles (Cycle 0 and Cycle 1) in the two directions (clockwise and anticlockwise). The 2-hop neighbors are reached by moving on Cycle 0, on Cycle 1, or on both, in either direction, for a total distance of 2. However, with these 2 cycles, nodes 6 and 7 are still more than 2 hops away from node 0. To achieve 2-hop connectivity to these nodes, another cycle of stride 6 is added. TABLE 3 below lists the nodes on a cycle of stride 6. On this cycle, both node 6 and node 7 are 1 hop away from node 0.
TABLE 3: 3rd Cycle to ensure 2-Hop connectivity for 13 Node cluster

CYCLE 2:   0   6  12   5  11   4  10   3   9   2   8   1   7
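The effect of adding the third cycle can be checked with a breadth-first search, sketched below in Python as an illustration only. The function name hop_distances is hypothetical, and the sketch assumes bidirectional links along every cycle with arithmetic modulo N. It reports which nodes lie more than 2 hops from node 0 when only strides 1 and 4 are used, and the largest hop count once stride 6 is added.

```python
from collections import deque


def hop_distances(source, strides, n_nodes):
    """Breadth-first search over the cycles; returns {node: hops from source}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for stride in strides:
            for nxt in ((node + stride) % n_nodes, (node - stride) % n_nodes):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
    return dist


two_cycles = hop_distances(0, [1, 4], 13)
three_cycles = hop_distances(0, [1, 4, 6], 13)

far_nodes = sorted(n for n, d in two_cycles.items() if d > 2)
print("more than 2 hops away with strides 1 and 4:", far_nodes)
print("largest hop count with strides 1, 4 and 6:", max(three_cycles.values()))
```

Because the connectivity pattern is identical for every node, checking the distances from node 0 is sufficient for this example.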
The above method describes an outline of how to create 2-hop cluster networks. It is possible to create 2-hop networks using Singer Difference Sets. In particular, the paper presented by Parhami, B. et al. [published in Parallel and Distributed Systems, IEEE Transactions on, Volume 16, Issue 8, August 2005, Page(s): 714-724] and the PCT Patent Application 97/34239 use this method to create computer networks. Singer Difference Sets were defined in the previous section. It is also possible to create 2-hop networks using Projective Geometry. Singer described in a paper [published May 1938 in Transactions of the American Mathematical Society, Vol 43, No. 3, pp. 377-385, titled A theorem in Finite Projective Geometry and Some Applications to Number Theory, by James Singer] that there exists a 1-1 isomorphism between Difference Sets and Projective Geometry points. In 1992, Narendra Karmarkar described in a paper [paper appears in Application Specific Array Processors, 1992. Proceedings of the International Conference on, Publication Date: 4-7 Aug. 1992, On page(s): 64-80] that projective geometry points and lines (also planes, hyperplanes, etc.) can be used to create efficient computer networks.
There are known procedures to compute difference sets for small values of n. TABLE 4 below reproduces a sample collection of difference sets for n=2, 3, 4, 5, 7, 8, 9, 11 from the paper by James Singer (1938) referred to above. The method in this invention is described for the Singer Difference Set; however, it will work for any difference set as defined in the earlier section.
TABLE 4: Sample Difference set D for n = 2, 3, 4, 5, 7, 8, 9, 11

n          N     Difference Set Elements
2          7     0  1  3
3          13    0  1  3  9
4 (2^2)    21    0  1  4  14  16
5          31    0  1  3  8  12  18
7          57    0  1  3  13  32  36  43  52
8 (2^3)    73    0  1  3  7  15  31  36  54  63
9 (3^2)    91    0  1  3  9  27  49  56  61  77  81
11         133   0  1  3  12  20  34  38  81  88  94  104  109
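The defining property of these sets can be checked mechanically. The Python sketch below is an illustration only, and the function name is_perfect_difference_set is hypothetical. It assumes the usual definition of a perfect difference set of order n: the set has n+1 elements, and every nonzero residue modulo N = n^2 + n + 1 arises exactly once as a difference of two distinct elements. Only the first four rows of TABLE 4 are included to keep the sketch short.

```python
def is_perfect_difference_set(elements, modulus):
    """True if each nonzero residue mod `modulus` occurs exactly once as a - b."""
    seen = set()
    for a in elements:
        for b in elements:
            if a == b:
                continue
            diff = (a - b) % modulus
            if diff in seen:
                return False  # a repeated difference breaks the property
            seen.add(diff)
    return len(seen) == modulus - 1


# First four rows of TABLE 4, keyed by the order n.
SAMPLE_SETS = {
    2: [0, 1, 3],
    3: [0, 1, 3, 9],
    4: [0, 1, 4, 14, 16],
    5: [0, 1, 3, 8, 12, 18],
}

for n, elements in SAMPLE_SETS.items():
    modulus = n * n + n + 1
    print(n, modulus, is_perfect_difference_set(elements, modulus))
```

The remaining rows of TABLE 4 can be added to the dictionary and checked in the same way.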
A standard method to create a cluster based on a Singer difference set is described below. Let N = n^2 + n + 1 be the size of G and n the order of the difference set D. Then create a cycle of stride d_i, where d_i ∈ D and d_i≠0, over the nodes in the group G. Thus, for 13 nodes, three cycles of stride 1, 3, and 9 are created. For 31 nodes, five cycles of stride 1, 3, 8, 12, and 18 are created. TABLE 5 below shows the Singer Cycles over 13 nodes.
TABLE 5: Cluster of 13 nodes connected using 3 Singer Cycles

SINGER CYCLE 0:   0   1   2   3   4   5   6   7   8   9  10  11  12
SINGER CYCLE 1:   0   3   6   9  12   2   5   8  11   1   4   7  10
SINGER CYCLE 2:   0   9   5   1  10   6   2  11   7   3  12   8   4
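The reason the Singer Cycles give 2-hop connectivity can be illustrated with a short Python sketch, which is not part of the cited schemes. Because every nonzero residue modulo N equals d_i − d_j for some pair of elements of D, any node can be reached from node 0 by at most one forward step of stride d_i and one backward step of stride d_j. The function names singer_cycles and route_from_zero are hypothetical, and the cycle listing assumes every nonzero stride is relatively prime to N, which holds for the 13 node case.

```python
def singer_cycles(diff_set, n_nodes):
    """Map each nonzero stride to the order in which its cycle visits the nodes."""
    return {d: [(step * d) % n_nodes for step in range(n_nodes)]
            for d in diff_set if d != 0}


def route_from_zero(target, diff_set, n_nodes):
    """Return at most two signed strides whose sum is `target` modulo n_nodes."""
    for forward in diff_set:
        for backward in diff_set:
            if forward != backward and (forward - backward) % n_nodes == target:
                # A stride of 0 means no step is taken, so it is dropped.
                return [s for s in (forward, -backward) if s != 0]
    return None


D = [0, 1, 3, 9]   # Singer difference set for n = 3
N = 13

for stride, order in singer_cycles(D, N).items():
    print("stride", stride, order)
for target in range(1, N):
    print("node", target, "reached by strides", route_from_zero(target, D, N))
```

A route with one entry corresponds to a 1-hop neighbor of node 0, and a route with two entries corresponds to a 2-hop neighbor.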
In one scheme (Parhami 95), each node represents a computer with a switch. The 13 nodes are connected with 3 cycles as shown in TABLE 5. This results in each node having 6 connections (2 per cycle). A node k is connected to nodes k+d_i and k+(N−d_i), where d_i ∈ D and d_i≠0. Thus, the degree of connections is 2*n. Another scheme (Parhami 95) uses a bipartite graph to model the connections. The computer is represented by one node and the switch by another node. There are N computer nodes and N switch nodes. Computer k is connected to switch k. Switch k connects to computers k+d_i, where d_i ∈ D and d_i≠0. This results in switch k having n connections (corresponding to k+d_i, where d_i ∈ D and d_i≠0) plus a connection to its computer, and the computer having n connections (corresponding to k+N−d_i, where d_i ∈ D and d_i≠0). Thus, a single connection to 2*n nodes is split into 2 connections to n nodes of the bipartite graph, and the total number of connections of the computer and switch module is still 2*n. This scheme is equivalent to the following scheme. Let there be a computer at each node k, 0<=k<=N−1. Connect two switches (s0 and s1), each with n external connections, to each computer node k. Now connect switch s0 at node k to switch s1 at node k+d_i, where d_i ∈ D and d_i≠0. Thus, the degree of the switch can be reduced from 2*n to n, but two switches (s0 and s1) of degree n will be needed. However, the fundamental problem is the lack of symmetry when using difference sets directly to create computer networks.
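The connection counts discussed above can be tabulated with the Python sketch below, given as an illustration only and not as the implementation of the cited schemes. The function names direct_links, switch_targets, and switches_reaching are hypothetical, and all node arithmetic is assumed to be modulo N.

```python
def direct_links(k, diff_set, n_nodes):
    """First scheme: node k links to k + d and k + (N - d) for each nonzero d in D."""
    strides = [d for d in diff_set if d != 0]
    forward = {(k + d) % n_nodes for d in strides}
    backward = {(k + n_nodes - d) % n_nodes for d in strides}
    return sorted(forward | backward)


def switch_targets(k, diff_set, n_nodes):
    """Bipartite scheme: computers that switch k connects to (k + d)."""
    return sorted((k + d) % n_nodes for d in diff_set if d != 0)


def switches_reaching(k, diff_set, n_nodes):
    """Bipartite scheme: switches that connect to computer k (k + N - d)."""
    return sorted((k + n_nodes - d) % n_nodes for d in diff_set if d != 0)


D = [0, 1, 3, 9]   # Singer difference set for n = 3
N = 13

print("direct links of node 0:", direct_links(0, D, N))
print("switch 0 connects to computers:", switch_targets(0, D, N))
print("computer 0 is reached from switches:", switches_reaching(0, D, N))
```

For n=3, the first list has 2*n = 6 entries, while each of the other two lists has n = 3 entries, so the computer and switch module together still carries 2*n connections.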
A different scheme is proposed by Narendra Karmarkar, in which there are N processors and N memories. Processor i connects to memory i+d_i, where d_i is an element in the difference set. This means processor i connects to the memories in column i+1. Referring to TABLE 5, processor 0 connects to memories 1, 3, and 9. In addition, processors 12, 10, and 4 connect to memory 0. In an architecture where processors and memories are different units, a cluster can be built such that p connections are needed from the memory or processor units. However, in practice the processor and memory are present in the same unit. Thus, if processor 0 and memory 0 represent the same node, then node 0 is connected to nodes 1, 3, and 9 because computer 0 needs to connect to memories 1, 3, and 9. Similarly, node 0 is connected to nodes 12, 10, and 4 because computers 12, 10, and 4 need to connect to memory 0. Thus, this scheme also results in 2*3=6 connections per node.
Thus, it is seen that, because the methods and structures disclosed in the prior art are not symmetric, the degree of connections per node becomes 2*n. Therefore, there is a need for a cluster network that is symmetric and requires fewer connections. In addition, there is a need for a scalable clustered network that is cost-effective and ensures conflict-free communication between the nodes of the cluster.