1. Field of the Invention
This invention generally relates to the field of high-speed digital data processing systems; and more specifically, the invention relates to methods and systems for routing messages in computer systems.
2. Background Art
Massively parallel computer systems comprise a large number of data processing elements, which are typically connected by a network. Each node connected to the network typically comprises a network interface and the local data processing elements. The network interface receives data from the network that is addressed to this particular node, and it also injects the node's local results into the network. Data is typically routed through the network in packets, and the packets are routed by a plurality of routers, typically one router per node. The network, specifically the plurality of network routers, moves the injected packets between the connected nodes toward the desired packet destinations.
Typically, the node that produces a data packet specifies the desired destination of that packet by providing the unique address of the packet destination. Upon injection of such an attributed packet, the plurality of network routers make local routing decisions that incrementally reduce the distance of the packet to its destination by forwarding the packet to a connected node closer to the specified destination. This universal point-to-point style of communication is the state of the art and is used by most computer networks implemented today. The drawback of using addresses as part of the packet attributes is that the scalability of the network is limited to the maximum number of addresses representable with the bits dedicated to the packet address.
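The address-based routing described above, and the scalability limit that follows from a fixed-width address field, can be illustrated with a minimal sketch. The ring topology, the function names, and the integer node addresses are assumptions chosen only for illustration; they are not taken from any particular prior-art system.

```python
def max_addressable_nodes(address_bits: int) -> int:
    """Number of distinct destinations a packet address field of this width can name."""
    return 2 ** address_bits


def next_hop(current: int, destination: int, ring_size: int) -> int:
    """Local routing decision: forward to the ring neighbour closer to the destination.

    Assumes a hypothetical bidirectional ring with nodes addressed 0..ring_size-1.
    """
    if current == destination:
        return current
    clockwise = (destination - current) % ring_size
    counter_clockwise = (current - destination) % ring_size
    step = 1 if clockwise <= counter_clockwise else -1
    return (current + step) % ring_size


def route(source: int, destination: int, ring_size: int) -> list[int]:
    """Follow the per-router decisions hop by hop and return the visited nodes."""
    path = [source]
    while path[-1] != destination:
        path.append(next_hop(path[-1], destination, ring_size))
    return path


if __name__ == "__main__":
    print(route(0, 3, 8))             # [0, 1, 2, 3]
    print(route(0, 6, 8))             # [0, 7, 6]
    print(max_addressable_nodes(16))  # 65536
```

Each call to `next_hop` reduces the remaining distance by one hop, which mirrors the incremental forwarding described above. With a 16-bit address field, no more than 65536 nodes can be named, regardless of how many routers the network could otherwise accommodate.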
Furthermore, auxiliary networks have been used to implement special support for collective communication, such as global broadcasts to all connected nodes (for example, in the CM-5). These networks typically have the topology of a tree or a fat tree, since the tree topology provides the minimal distance between any two connected nodes and, thus, minimal communication latency.
The tree topology imposes several constraints on particular nodes. For example, the dedicated root node splits the network into two domains, left and right. Traffic from one domain targeted to the other domain must pass through the root node under all circumstances. A failure of the root node, its router, and/or its links renders the entire network useless, since no packets can be routed from the left partition to the right partition. In addition, leaf nodes in a tree network have only one connection to the network. If this link is broken, the entire network is likewise no longer functional.
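The root-node and leaf-link constraints can be checked with a small reachability sketch. The complete binary tree, the heap-style node numbering (root 1, children 2i and 2i+1), and the function names are assumptions made for illustration only.

```python
from collections import deque


def build_binary_tree(depth: int) -> dict[int, set[int]]:
    """Adjacency map of a complete binary tree; node 1 is the root."""
    last = 2 ** (depth + 1) - 1
    adjacency = {node: set() for node in range(1, last + 1)}
    for node in range(2, last + 1):
        parent = node // 2
        adjacency[node].add(parent)
        adjacency[parent].add(node)
    return adjacency


def reachable(adjacency, start, failed_nodes=frozenset(), failed_links=frozenset()):
    """Breadth-first search that skips failed nodes and failed links.

    failed_links holds frozenset({a, b}) pairs, so direction does not matter.
    """
    if start in failed_nodes:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour in failed_nodes or neighbour in seen:
                continue
            if frozenset({node, neighbour}) in failed_links:
                continue
            seen.add(neighbour)
            queue.append(neighbour)
    return seen


if __name__ == "__main__":
    tree = build_binary_tree(2)  # nodes 1..7; leaves are 4, 5, 6, 7
    print(sorted(reachable(tree, 4)))                    # [1, 2, 3, 4, 5, 6, 7]
    print(sorted(reachable(tree, 4, failed_nodes={1})))  # [2, 4, 5]
    print(sorted(reachable(tree, 7, failed_links={frozenset({3, 7})})))  # [7]
```

With the root marked as failed, a node in the left subtree can reach only the left subtree, so no packet can cross to the right partition. With the single link of leaf 7 removed, that leaf can reach no other node, so any collective operation that must include every node cannot complete.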