1. Field of the Invention
The present invention is related to multiprocessor computer systems, and more particularly to a system and method for routing packets in a multiprocessor computer system.
2. Background Information
The interconnection network plays a critical role in the cost and performance of a scalable multiprocessor. It determines the point-to-point and global bandwidth of the system, as well as the latency for remote communication. Latency is particularly important for shared-memory multiprocessors, in which memory access and synchronization latencies can significantly impact application scalability, and is becoming a greater concern as system sizes grow and clock cycles shrink.
Over the past 15 years the vast majority of interconnection networks have used low-radix topologies. Many multiprocessors have used a low-radix k-ary n-cube or torus topology, including the SGI Origin2000 hypercube, the dual-bristled, sliced 2-D torus of the Cray X1, the 3-D torus of the Cray T3E and Cray XT3, and the torus of the Alpha 21364. The Quadrics switch uses a radix-8 router, the Mellanox router is radix-24, and the highest radix available from Myrinet is radix-32. The IBM SP2 switch is radix-8.
A low-radix fat-tree topology was used in the CM-5, and this topology is also used in many clusters, including the Cray XD1.
During the past 15 years, the total bandwidth per router has increased by nearly three orders of magnitude, due to a combination of higher pin density and faster signaling rates, while typical packet sizes have remained roughly constant. As noted by J. Kim, et al. in “Microarchitecture of a high-radix router,” ISCA '05: Proceedings of the 32nd Annual International Symposium on Computer Architecture, pages 420-431, Madison, Wis., USA, 2005, IEEE Computer Society, this increase in router bandwidth relative to packet size motivates computer designers to build networks from many thin links rather than from fewer fat links, as in the recent past. Kim concludes that building a network using high-radix routers with many narrow ports reduces the latency and cost of the resulting network.
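The trade-off between radix and latency can be illustrated with a simplified zero-load latency model. The sketch below is illustrative only and is not taken from the present description: it assumes that a packet traverses roughly 2·log_k(N) routers in a network of N nodes built from radix-k routers, that a fixed router bandwidth B is divided evenly among 2k unidirectional ports, and that each hop adds a fixed router delay. The function names and all numeric values are hypothetical.

```python
import math

def network_latency(radix, nodes, router_bandwidth, packet_bits, router_delay):
    """Estimate zero-load latency in seconds under the assumed model.

    Assumptions (illustrative, not from the source text):
      - hop count is 2 * log_radix(nodes)
      - router_bandwidth (bits/s) is split evenly over 2 * radix ports
      - each hop costs a fixed router_delay (seconds)
    """
    hops = 2 * math.log(nodes) / math.log(radix)
    port_bandwidth = router_bandwidth / (2 * radix)
    header_latency = hops * router_delay                  # falls as radix grows
    serialization_latency = packet_bits / port_bandwidth  # rises as radix grows
    return header_latency + serialization_latency

def best_radix(nodes, router_bandwidth, packet_bits, router_delay, candidates):
    """Return the candidate radix with the lowest estimated latency."""
    return min(
        candidates,
        key=lambda k: network_latency(
            k, nodes, router_bandwidth, packet_bits, router_delay
        ),
    )

# Hypothetical parameters: 4096 nodes, 128-bit packets, 20 ns router delay.
CANDIDATES = [2, 4, 8, 16, 32, 64, 128, 256]
low_bw = best_radix(4096, 1e10, 128, 20e-9, CANDIDATES)   # 10 Gb/s per router
high_bw = best_radix(4096, 1e13, 128, 20e-9, CANDIDATES)  # 10 Tb/s per router
print(low_bw, high_bw)
```

Under these assumptions, raising the router bandwidth by three orders of magnitude while holding the packet size constant moves the latency-minimizing radix from a small value to the largest candidate considered, which is consistent with the trend described above.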
What is needed is a system and method for efficiently routing packets through a multiprocessor computer system.