Interconnect network technology is a fundamental component of computational and communications products ranging from supercomputers to grid computing switches to a growing number of routers. However, characteristics of existing interconnect technology result in significant limits in scalability of systems that rely on the technology.
For example, even with advances in supercomputers of the past decade, supercomputer interconnect network latency continues to limit the capability to cost-effectively meet demands of data-transfer-intensive computational problems arising in the fields of basic physics, climate and environmental modeling, pattern matching in DNA sequencing, and the like.
For example, in a Cray T3E supercomputer, processors are interconnected in a three-dimensional bi-directional torus. Due to latency of the architecture, for a class of computational kernels involving intensive data transfers, on the average, 95% to 98% of the processors are idle while waiting for data. Moreover, in the architecture about half the boards in the computer are network boards. Consequentially, a floating point operation performed on the machine can be up to 100 times as costly as a floating point operation on a personal computer.
As both computing power of microprocessors and the cost of parallel computing have increased, the concept of networking high-end workstations to provide an alternative parallel processing platform has evolved. Fundamental to a cost-effective solution to cluster computing is a scalable interconnect network with high bandwidth and low latency. To date, the solutions have depended on special-purpose hardware such as Myrinet and QsNet.
Small switching systems using Myrinet and QsNet have reasonably high bandwidth and moderately low latency, but scalability in terms of cost and latency suffer from the same problems found in supercomputer networks because both are based on small crossbar fabrics connected in multiple-node configurations, such as Clos network, fat tree, or torus. The large interconnect made of crossbars is fundamentally limited.
A similar scalability limit has been reached in today's Internet Protocol (IP) routers in which a maximum of 32 ports is the rule as line speeds have increased to OC192.
Many years of research and development have been spent in a search for a “scalable” interconnect architecture that will meet the ever-increasing demands of next-generation applications across many industries. However, even with significant evolutionary advancements in the capacity of architectures over the years, existing architectures cannot meet the increasing demands in a cost-effective manner.