Interconnection networks are commonly used in many different applications, such as connection of internal lines in very large-scale integration (VLSI) circuits, wide area computer networks, backplane lines, system area networks, telephone switches, internal networks for asynchronous transfer mode (ATM) switches, processor/memory interconnects, interconnection networks for multicomputers, distributed shared memory multiprocessors, clusters of workstations, local area networks, metropolitan area networks, and networks for industrial applications, as noted on page 1 of the book entitled "Interconnection Networks: An Engineering Approach" by Jose Duato, Sudhakar Yalamanchili and Lionel Ni, published by IEEE Computer Society Press, Los Alamitos, Calif., 1997, which book is incorporated by reference herein in its entirety.
ATM switches are described in, for example, U.S. Pat. Nos. 5,898,688, 5,898,687, 5,734,656, 5,726,985, 5,668,812, and 5,610,921, each of which is incorporated by reference herein in its entirety. Moreover, the following three books are each incorporated by reference herein in their entirety: (1) A. S. Acampora, "An Introduction to Broadband Networks" published by Plenum Press in 1994; (2) R. Perlman, "Interconnections: Bridges and Routers" published by Addison-Wesley in 1992; and (3) P. E. Green, "Network Interconnection and Protocol Conversion" published by IEEE Press in 1988.
One group of interconnection networks, known as "multistage interconnection networks" (MINs), connects input ports of a network to output ports through a number of stages of switches, where each switch is a crossbar network. The crossbar network is normally configured by a central controller that establishes a path from an input port to an output port. However, in asynchronous multiprocessors, centralized control and permutation routing are infeasible, and a routing algorithm is used to establish a path across the multiple stages, as noted in the above-described book at pages 19-20.
Also as noted in the above-described book at page 19, although many MINs have an equal number of input and output ports, "[t]hese networks can also be configured with the number of inputs greater than the number of outputs (concentrators) and vice versa (expanders)." One example of such a MIN uses crossbar routing switches having an equal number of inputs and outputs, although the number of logically equivalent outputs in each direction can be greater than one, as described in an article entitled "Multipath Fault Tolerance in Multistage Interconnection Networks" by Fred Chong, Eran Egozy, Andre DeHon and Thomas Knight, Jr., MIT Transit Project, Transit Note #48, 1991-1993, available from the Internet at http://www.ai.mit.edu/projects/transit/tn48/tn48.html, which article is incorporated by reference herein in its entirety. As illustrated in an unlabeled figure (the figure on page 3) in the just-described article, "[a] multipath MIN [is] constructed from 4×2 (inputs×radix) dilation-2 crossbars and 2×2 dilation-1 crossbars. Each of the 16 endpoints has two inputs and outputs for fault tolerance. Similarly, the routers each have two outputs in each of their two logical output directions. As a result, there are many paths between each pair of network endpoints."
Note that the MIN of the just-described article appears to be limited to fault tolerance, because the article states (under the heading "Routing" on page 4) "[o]ur fault tolerance results described below are independent of the details of routing and fault identification. That is, routing can be circuit-switched or packet-switched using any number of routing strategies. The fault tolerance results only depend on network topology and the routers' ability to use their redundant connection in each logical direction to avoid faults." Note also that the just-described MIN was implemented using "[a]n RN1 routing component [that] can be configured to act as a single 8-input, radix 4, dilation 2 routing component or as a pair of independent 4-input, radix-4, dilation 1 routing components."
The above-described article states (under the heading "Dilated Non-Interwired Network" on page 4) that "dilated networks . . . connect all outputs in a given logical direction to the same physical routing component in the subsequent stage of the network. The topology is thus identical to the corresponding non-dilated bidelta network. . . . Such networks gain performance by having multiple paths." Moreover, the article states (under the heading "Discussion" on page 16) "[a]lthough we have concentrated upon the fault tolerance of multipath networks, these networks also perform well under unbalanced loading. Intuitively, the effects of hot-spots are very similar to component failures."
See also the following articles, each of which is incorporated by reference herein in its entirety: (1) Clyde P. Kruskal and Marc Snir, "The Performance of Multistage Interconnection Networks for Multiprocessors" published in IEEE Transactions on Computers, C-32(12):1091-1098, December 1983; (2) Clyde P. Kruskal and Marc Snir, "A Unified Theory of Interconnection Network Structure" published in Theoretical Computer Science, pages 75-94, 1986; (3) Robert Grondalski, "A VLSI Chip Set for Massively Parallel Architecture" published in IEEE International Solid-State Circuits Conference, pages 198-199, 1987; and (4) K. Y. Eng et al., "A Growable Packet (ATM) Switch Architecture: Design Principles and Applications" published in IEEE Transactions on Communications, vol. 40, No. 2, Feb. 1992, pp. 423-430.
In accordance with the invention, an apparatus and method increase the data rate(s) in one or more portions (also called "later portions") of a communications network, while maintaining a normal data rate in one or more other portions (also called "earlier portions") that supply traffic to the just-described later portion(s). Depending on the implementation, a later portion that is being speeded up can include an intermediate stage or the final stage of a multistage interconnection network (MIN), or can include two or more such stages (also called "later stages"). Therefore, speed up in only a portion of a MIN is implemented depending on location of the portion relative to input ports of the MIN. The just-described MIN is preferably of the connection-less type (e.g. packet-switched or cell-switched), although connection-oriented MINs (e.g. circuit-switched) can also be differentially speeded-up.
Stage specific speed-up of a MIN reduces (or even avoids) traffic contention that otherwise occurs when a disproportionate amount of traffic temporarily travels to a common destination. At the same time, stage specific speed-up reduces costs that are otherwise incurred when all portions of a MIN are speeded up. Therefore, a network that is speeded up in only a later portion (also referred to as a "differentially-speeded-up network"; e.g. having a normal speed initial stage and a speeded-up final stage) has the advantage of reducing traffic contention, while being less expensive than a network that is speeded-up in all portions.
In one embodiment, stage-specific speed up (wherein only later portions are speeded up) is used in a MIN that has at least two stages: an earlier stage to supply traffic (regardless of destination) to a later stage, wherein the later stage converges the traffic based on destination. Speed-up of a later portion of a MIN can be implemented by clocking a line faster in one or more output ports of a stage, as compared to an input port of the same stage. Alternatively, the speed-up can be implemented by providing one or more additional lines in a stage's output ports (so an output port has lines greater in number than lines in an input port of the same stage). Note that the just-described additional lines carry different traffic, as opposed to multiple lines in a fault tolerant MIN that are used in a redundant manner to carry the same traffic. Irrespective of whether multiple lines or faster clocked lines are used in an output port, such an output port carries more traffic than the input port of the same stage, thereby to increase the speed of the output port.
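The two ways of speeding up an output port described above (a faster-clocked line, or additional lines carrying different traffic) can be compared with a minimal sketch. The names `Port`, `lines`, `clock_multiplier`, and `speed_up` are hypothetical and do not appear in this description; the sketch assumes that port capacity is simply the number of lines multiplied by the per-line rate relative to normal speed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical model of a port: a number of lines, each clocked at some
    multiple of the normal line rate."""
    lines: int = 1
    clock_multiplier: float = 1.0

    def capacity(self) -> float:
        # Assumption: every line carries different traffic, so capacities add.
        return self.lines * self.clock_multiplier


def speed_up(input_port: Port, output_port: Port) -> float:
    """Ratio of output-port capacity to input-port capacity of the same stage."""
    return output_port.capacity() / input_port.capacity()


normal_input = Port(lines=1, clock_multiplier=1.0)
faster_clock_output = Port(lines=1, clock_multiplier=2.0)   # one line, clocked twice as fast
extra_line_output = Port(lines=2, clock_multiplier=1.0)     # two lines at normal speed
```

Under this assumption both alternatives give the same speed-up of 2 relative to the normal-speed input port, which is why the text treats them as interchangeable for purposes of increasing the output port's speed.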
In one embodiment, a MIN has stages identified by a stage number that is in the range 0 to N−1, wherein N is the total number of stages, and each stage has output ports speeded-up by a factor "d" relative to the preceding stage, so that the speed-up of an output port of any stage with stage number I is n = d^I (d raised to the power I). For example, in a three stage network having double speed components (d = 2), the initial stage output ports have a normal speed, the center stage output ports have twice the normal speed, and the final stage output ports have four times the normal speed. In a variant of the just-described example, the initial stage output ports have normal speed, the center stage output ports have twice the normal speed, and the final stage output ports also have twice the normal speed.
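The per-stage speed-up of the three stage example can be tabulated with a short sketch, reading the speed-up of stage I as d raised to the power I (which matches the 1x, 2x, 4x figures of the example). The function name `stage_speedups` and the optional `cap` parameter are hypothetical; `cap` is only one assumed way of modeling the variant in which the final stage stays at twice the normal speed.

```python
from typing import List, Optional


def stage_speedups(num_stages: int, d: int, cap: Optional[int] = None) -> List[int]:
    """Return the output-port speed-up of each stage 0 .. num_stages-1.

    Stage I is speeded up by d**I relative to normal speed. If `cap` is given
    (an assumption made for this sketch), no stage exceeds that speed-up.
    """
    if num_stages < 1 or d < 1:
        raise ValueError("num_stages and d must be at least 1")
    speedups = [d ** stage for stage in range(num_stages)]
    if cap is not None:
        speedups = [min(s, cap) for s in speedups]
    return speedups


print(stage_speedups(3, 2))          # [1, 2, 4]
print(stage_speedups(3, 2, cap=2))   # [1, 2, 2]
```

The first call corresponds to the three stage network with double speed components; the second corresponds to the variant in which the center and final stages both run at twice the normal speed.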
If additional lines are used to implement speed up in the three-stage network, the center stage output port has multiple physically-distinct but logically-identical lines that are coupled to the final stage. When a number of packets (or cells) are to be transferred to the same destination by a center stage output port, the port distributes the packets (or cells) evenly among the physically-distinct but logically-identical lines. In one implementation, such a port statistically multiplexes traffic on to the logically-identical lines by saving in memory the identity of line(s) which have been used in a previous cycle, so that the next packet (or cell) is transferred to a line that was not used (or used least) in the previous cycle, thereby to ensure even distribution over time.
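The even distribution of packets over logically-identical lines can be illustrated with a minimal sketch. It is not the implementation described here: the class name `LineSelector` is hypothetical, and the sketch assumes that the "memory" is a per-line usage counter and that ties are broken in favor of the lowest-numbered line.

```python
from typing import List


class LineSelector:
    """Hypothetical selector for one output port with several
    physically-distinct but logically-identical lines."""

    def __init__(self, num_lines: int) -> None:
        if num_lines < 1:
            raise ValueError("a port needs at least one line")
        # Remembered usage: how many packets each line has carried so far.
        self.usage: List[int] = [0] * num_lines

    def next_line(self) -> int:
        """Pick the least-used line (lowest index on ties) and record its use."""
        line = min(range(len(self.usage)), key=lambda i: self.usage[i])
        self.usage[line] += 1
        return line


selector = LineSelector(num_lines=2)
chosen = [selector.next_line() for _ in range(4)]
print(chosen)  # [0, 1, 0, 1]
```

Because each choice goes to a line that has carried the fewest packets so far, the usage counts of any two lines never differ by more than one, which is the even distribution over time that the text calls for.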