A well-known topology for the data networks of massively parallel computer systems is the 3D torus. A 3D torus is generally a cubic grid of compute nodes in which the nodes along each dimension are connected in a ring. Massively parallel supercomputing systems, such as the system described in Provisional Application Ser. No. 60/271,124, use the 3D torus topology to provide the minimal path route, i.e., the shortest path, for communications between hundreds or thousands of nodes. One problem with this topology in a massively parallel system is the inefficient delivery of messages over an interconnection network, particularly when Ethernet or Asynchronous Transfer Mode (ATM) switches are used.
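By way of illustration only, the minimal path length in a torus can be sketched as follows. The function name `torus_hops` and the 8x8x8 grid size are hypothetical and are not taken from the referenced application; the sketch assumes each dimension is a ring, so that a packet may travel in either direction around it.

```python
def torus_hops(src, dst, dims):
    """Minimal hop count between two nodes of a torus (illustrative sketch).

    src, dst: node coordinates, one entry per dimension.
    dims:     number of nodes along each dimension (assumed ring-connected).
    """
    total = 0
    for s, d, n in zip(src, dst, dims):
        delta = abs(s - d) % n
        # Because each dimension is a ring, the packet may go either way
        # around it; the shorter of the two directions is the minimal path.
        total += min(delta, n - delta)
    return total


# Hypothetical 8x8x8 torus: opposite edges of the grid are direct neighbors.
print(torus_hops((0, 0, 0), (7, 0, 0), (8, 8, 8)))
# The farthest node is half-way around the ring in every dimension.
print(torus_hops((0, 0, 0), (4, 4, 4), (8, 8, 8)))
```

In this sketch the wraparound links halve the worst-case distance per dimension relative to a grid without rings.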
More specifically, Ethernet or ATM switches do not generally provide low-latency, high-throughput, and error-free delivery of packets, since these switches typically lose packets if there is not enough buffer space, i.e., holding areas for input and output processing, to hold the packet. Additionally, the problem of contention, i.e., a conflict that arises when two or more requests are made concurrently for a resource that cannot be shared, such as a communication link, must be overcome if the switching network is to be scalable to the size of tens of thousands of nodes.
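The packet loss described above can be illustrated with a minimal sketch of a switch port that has a fixed-size buffer and discards arrivals when the buffer is full. The class `DropTailPort`, the function `simulate`, and all numeric parameters are hypothetical and chosen only for illustration; they do not model any particular Ethernet or ATM switch.

```python
from collections import deque


class DropTailPort:
    """Hypothetical output port with a finite buffer (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, packet):
        # With no free buffer space, the arriving packet is simply lost.
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        # The shared output link carries at most one packet per call.
        return self.queue.popleft() if self.queue else None


def simulate(capacity, arrivals_per_tick, ticks):
    """Return (delivered, dropped) when several senders contend for one link."""
    port = DropTailPort(capacity)
    delivered = 0
    for t in range(ticks):
        for i in range(arrivals_per_tick):
            port.enqueue((t, i))
        if port.dequeue() is not None:
            delivered += 1
    return delivered, port.dropped


# Two packets arrive per tick but the link drains only one per tick.
delivered, dropped = simulate(capacity=4, arrivals_per_tick=2, ticks=10)
print(delivered, dropped)
```

Under these assumed parameters the buffer fills after a few ticks, and every later tick loses one packet, which is the behavior a scalable arbitration method would need to avoid.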
Typically, contention issues have been dealt with by employing an arbitration algorithm that determines which transmitters on a network may transmit packets after a packet collision is detected. Unfortunately, the related art has not addressed the need for a collision detection/arbitration method that is ultra-scalable and thus suitable for massively parallel systems. Additionally, current routing techniques are not suitable for scaling up to massively parallel systems because the routers typically have tables that must be maintained. The overhead of table maintenance becomes unduly burdensome as the number of nodes reaches the tens of thousands.
As stated above, the three-dimensional (3D) torus topology is known. For example, the Cray T3E uses this 3D torus topology. However, the Cray T3E uses routing tables stored in each switch element, an approach that does not scale well to tens of thousands of nodes. Another known technology is the “Bubble” escape virtual channels (VCs) (Puente et al., “Adaptive Bubble Router: A Design to Balance Latency and Throughput in Networks for Parallel Computers”, In Proceedings of the International Conference on Parallel Processing, ICPP '99, September 1999), which provide fully dynamic routing that does not require routing tables.
Another known technique is the use of multiple virtual channels to reduce “head-of-line” blocking, as employed in the SP2 and the Cray computers. The use of a two-stage arbitration approach has been taught by the MIT Reliable Router (William J. Dally, Larry R. Dennison, David Harris, Kinhong Kan, and Thucydides Xanthopoulos, “Architecture and Implementation of the Reliable Router,” In Proceedings of HOT Interconnects II, pp. 122-133, August 1994).
Another related art technology uses virtual cut-through routing in an attempt to optimize throughput and latency. See the article by P. Kermani and L. Kleinrock entitled “Virtual Cut-Through: A New Computer Communication Switching Technique”, Computer Networks, Vol. 3, pp. 267-286, 1979, incorporated herein by reference.
However, the related art references do not adequately solve the problem of packet contention and queueing delays along a selected packet direction of travel and virtual channel, particularly when a switch is scaled up to handle tens of thousands of nodes.
A related disclosure, U.S. Provisional Application Ser. No. 60/271,124 entitled “A Novel Massively Parallel Supercomputer”, describes a semiconductor device with two electronic processors within each node of the multi-computer. Within the supercomputer, there is a plurality of high-speed internal networks and an external network employing Ethernet. These networks are expected to service over 64,000 nodes.
While there is no known prior art that attempts to scale a network switch to tens of thousands of nodes for fast, error-free operation, there remains the need for a scalable arbitration method that enables error-free, low-latency, high-bandwidth (throughput) data communications to enhance the message-passing capability of a massively parallel system.