This invention relates to multi-dimensional network switches, and more particularly to store-and-forward packet switches.
Expansion of modern computer networks such as the Internet has necessitated faster and larger network switches. Software-based routers and hardware-based network switches are used at intermediate points in networks to either route packets or make direct connections between input and output ports. These mid-level routers and switches feed into major network points that often experience significant congestion.
Some network points are so complex and carry such heavy traffic that traditional switches or routers fail, resulting in dropped packets and lost connections. The traditional telecommunications infrastructure makes direct, dedicated connections between input (ingress) and output (egress) ports. Digital cross-connect circuits are used. Packet-based traffic routing has also been used where data is packetized, and the packets are switched or routed toward their destinations.
As traffic increases, the network switching points must be expanded. Benes networks are often used for scalable, direct connection networks. FIG. 1 shows a small Benes network. Benes network 10 is constructed from switches 12. Each switch 12 has 2 inputs and 2 outputs. Each switch 12 can either pass the 2 inputs straight through to the 2 outputs, or cross over the inputs to the outputs, swapping each input to the other output.
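The two settings of a single switch can be sketched as follows. This is an illustrative model only; the function name `switch_2x2` and the boolean `cross` setting are assumptions introduced here, not part of the described hardware.

```python
def switch_2x2(inputs, cross):
    """Model one 2-input, 2-output switch.

    `inputs` is a pair (upper, lower). `cross` is the programmed setting:
    False passes both inputs straight through, True swaps them.
    Both names are hypothetical and used only for illustration.
    """
    upper, lower = inputs
    return (lower, upper) if cross else (upper, lower)


print(switch_2x2(("a", "b"), cross=False))  # ('a', 'b')
print(switch_2x2(("a", "b"), cross=True))   # ('b', 'a')
```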
The example network 10 has 8 input (ingress) ports and 8 output (egress) ports. Switches 12 are arranged into columns or stages, with each stage having 4 switches. Interconnection between stages includes cross-over wiring, so that some of the outputs from one stage are applied to different switches in the next stage. The interconnections between stages satisfy a mathematically defined permutation that ensures network connectivity. Each egress port is reachable from any ingress port.
FIG. 2 shows a Benes network configured to make desired connections. Switch 14 in stage (column) 1 has been programmed to cross over, while the other switches 12 in stage 1 are programmed to pass straight through. In stage 3, switch 16 is programmed to cross over, and in stage 4, three switches 18 are programmed to cross over. In the final stage 5, two switches 22 are programmed to cross over.
Direct connections are made from ingress port 2 to egress port 4, from ingress port 3 to egress port 7, and from ingress port 6 to egress port 2. Ingress ports 0, 1, 4, 5, 7 are connected to egress ports 5, 3, 0, 1, 6, respectively. One-to-one direct connections are made.
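As a quick check, the connections listed above can be tabulated and tested for the one-to-one property. The dictionary layout and the helper name `is_permutation` are assumptions made for this sketch.

```python
# Ingress port -> egress port, as listed for the configured network of FIG. 2.
connections = {2: 4, 3: 7, 6: 2, 0: 5, 1: 3, 4: 0, 5: 1, 7: 6}


def is_permutation(mapping, n):
    """True when each of the n ingress ports maps to a distinct egress port."""
    ports = list(range(n))
    return sorted(mapping.keys()) == ports and sorted(mapping.values()) == ports


print(is_permutation(connections, 8))  # True
```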
In Benes network 10, a central routing algorithm determines which switches 12 need to be programmed for cross-over and which for straight-through connection to obtain the desired connections. Any one-to-one set of connections among ingress and egress ports can be made, but the network must be reconfigured when different port connections are desired. This reconfiguration causes some downtime for the network while the switches are being reconfigured. Often two Benes networks are used in parallel so that one can be reconfigured while the other is passing data.
There are N! possible permutations, where a permutation is a set of one-to-one connections between N ingress and N egress ports. Since every such permutation can be established with Benes network 10, the network is non-blocking. Blocking networks are less desirable, since one connection may prevent another unrelated connection from being established.
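To give a sense of scale, the number of permutations can be computed for a few port counts. The helper name `connection_patterns` is hypothetical.

```python
import math


def connection_patterns(n):
    """Number of one-to-one ingress-to-egress permutations for n ports (n!)."""
    return math.factorial(n)


for n in (2, 4, 8):
    print(n, connection_patterns(n))
# 2 2
# 4 24
# 8 40320
```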
When a failure occurs in one of the switches in Benes network 10, sometimes a different route can be established that avoids the failed switch. However, the network is no longer non-blocking since the new route may block another route. When two Benes networks are used in parallel, the network with the failed switch can be shut down while the other network handles all traffic, but the remaining network cannot be re-provisioned without a service interruption while the other network is out of service.
An N-input, N-output Benes network using 2×2 switches consists of (2*log2(N)−1) stages. Each stage has N/2 switches.
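As a worked example of these formulas, the sketch below computes the size of a Benes network for a given port count. The function name `benes_dimensions` is an assumption, and N is assumed to be a power of two.

```python
def benes_dimensions(n):
    """Return (stages, switches_per_stage, total_switches) for an
    n-input, n-output Benes network built from 2x2 switches."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    log2_n = n.bit_length() - 1      # exact log2 for a power of two
    stages = 2 * log2_n - 1
    per_stage = n // 2
    return stages, per_stage, stages * per_stage


print(benes_dimensions(8))  # (5, 4, 20)
```

For N=8 this gives 5 stages of 4 switches, which agrees with the 8-port example network described above.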
FIG. 3 illustrates a Batcher network using 2×2 switches. A packet switched through a Batcher network contains its destination egress port address. The routing decision made at each switch in network 20 is dependent on the destination egress port addresses of all the packets appearing at its input ports. Each switch forwards packets along its output ports, enabling packets to get one step closer to their egress ports.
Each stage 24-29 contains four switches. Each switch has 2 inputs and 2 outputs. A packet received from either input is sorted to one or the other of the switch's outputs by a sorting algorithm. For example, stages 24, 29 have switches that sort among pairs of adjacent ports or signal lines, while stage 28 sorts among ports that are 2 lines apart. Stages 25, 26 sort among different pairs of signal lines.
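One way to picture the sorting decision at a single switch is as a compare-exchange on the destination egress port addresses of its two input packets. The sketch below assumes an ascending sort and a hypothetical `(dest_port, payload)` packet layout; the actual sorting direction of any particular switch in FIG. 3 is not specified here.

```python
def sorting_switch(upper_in, lower_in):
    """Assumed ascending compare-exchange for one 2x2 sorting switch.

    Each packet is a (dest_port, payload) tuple (hypothetical layout).
    The packet with the lower destination address leaves on the upper
    output. Returns (upper_out, lower_out).
    """
    if upper_in[0] <= lower_in[0]:
        return upper_in, lower_in
    return lower_in, upper_in


print(sorting_switch((6, "x"), (2, "y")))  # ((2, 'y'), (6, 'x'))
```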
An N-input, N-output Batcher network consisting of 2×2 switches has log2(N)*(log2(N)+1)/2 stages. Each stage has N/2 switches.
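As a worked example of the stage formula, the sketch below computes the size of a Batcher network for a given port count. The function name `batcher_dimensions` is an assumption, and N is assumed to be a power of two.

```python
def batcher_dimensions(n):
    """Return (stages, switches_per_stage, total_switches) for an
    n-input, n-output Batcher network built from 2x2 switches."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    log2_n = n.bit_length() - 1              # exact log2 for a power of two
    stages = log2_n * (log2_n + 1) // 2
    per_stage = n // 2
    return stages, per_stage, stages * per_stage


print(batcher_dimensions(8))  # (6, 4, 24)
```

For N=8 this gives 6 stages of 4 switches, matching the six stages 24-29 described for FIG. 3.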
Each switch executes its own sorting or routing algorithm, independent of the other switches. Reconfiguration of the global network is not necessary. The network is self-routing. However, when a switch fails, all packets going through that switch can be incorrectly routed or lost. The sorting network thus suffers from fault intolerance.
Other networks such as Clos networks can also be used. Clos networks also use centralized routing algorithms.
What is desired is a non-blocking yet fault-tolerant network architecture. It is desired to achieve the non-blocking quality of the direct connections of a Benes network without making direct, point-to-point connections. It is desired to route packets through the network, as does a sorting network, but with fault tolerance. A packet-switching network is desired that is both fault tolerant and non-blocking. A network that does not have to be stopped and re-configured when different connections are made is desirable. An adaptive, fault-tolerant packet-switching method is desired. An adaptive, fault-tolerant routing method is desired. A self-routing network is desired that does not have to be re-configured as different connections are needed.
An adaptively-routed interconnection network has a plurality of ingress ports for receiving packets. A plurality of egress ports transmit packets, while a plurality of switches each have input links for receiving packets from other switches in the network and output links for sending packets to other switches in the network. A packet memory stores packets received from the input links until transmission over the output links.
Each packet stored in the packet memory has a header that includes a destination address of a destination switch in the plurality of switches. The destination switch is coupled to a destination egress port in the plurality of egress ports that the packet is to be transmitted out of. A random address in the header is for a random switch. A phase indicator in the header indicates a first phase when the packet is forwarded to the random switch and a second phase when the packet is forwarded to the destination switch.
A routing controller reads the header of a packet stored in the packet memory. It determines a selected output link in the plurality of output links to send the packet over. When the random address read from the header matches the address of the switch, the phase indicator is reset to indicate that the packet is in the second phase and no longer in the first phase.
(1) When the phase indicator indicates that the packet is in the first phase, the random address from the header is used to determine the selected output link. The selected output link is in a route toward the random switch;
(2) When the phase indicator indicates that the packet is in the second phase, the destination address from the header is read to determine the selected output link. The selected output link is in a route toward the destination switch.
The packet is sent over the selected output link on the route toward the random switch when the phase indicator indicates the packet is in the first phase, but the packet is sent over the selected output link on the route toward the destination switch when the phase indicator indicates the packet is in the second phase. The packet is removed from the network by the destination switch and transmitted over the egress port coupled to the destination switch when the destination switch determines that the destination address in the header matches the address of the destination switch. Thus packets are routed to the random switch during the first phase, but routed to the destination switch during the second phase after the packet reaches the random switch.
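The two-phase routing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: switches are assumed to sit on a two-dimensional mesh addressed by coordinate tuples, `next_hop` uses simple dimension-order stepping, and the names `Header`, `route_step`, and `send` are hypothetical. The sketch also assumes a packet is removed from the network only in the second phase.

```python
import random
from dataclasses import dataclass

FIRST, SECOND = 1, 2  # phase indicator values (assumed encoding)


@dataclass
class Header:
    dest: tuple          # address of the destination switch
    rand: tuple          # address of the random switch
    phase: int = FIRST   # phase indicator


def next_hop(here, target):
    """Assumed mesh routing: step one hop along the first differing dimension."""
    for i, (h, t) in enumerate(zip(here, target)):
        if h != t:
            step = 1 if t > h else -1
            return here[:i] + (h + step,) + here[i + 1:]
    return here


def route_step(here, hdr):
    """One routing-controller decision. Returns the next switch address,
    or None when the packet leaves the network at the destination switch."""
    if hdr.phase == FIRST and hdr.rand == here:
        hdr.phase = SECOND           # random switch reached: enter second phase
    if hdr.phase == SECOND and hdr.dest == here:
        return None                  # destination switch removes the packet
    target = hdr.rand if hdr.phase == FIRST else hdr.dest
    return next_hop(here, target)


def send(src, dest, size=4, rng=random):
    """Inject a packet at switch `src` and return (path, header)."""
    hdr = Header(dest=dest, rand=(rng.randrange(size), rng.randrange(size)))
    path, here = [src], src
    while True:
        nxt = route_step(here, hdr)
        if nxt is None:
            return path, hdr
        path.append(nxt)
        here = nxt


path, hdr = send((0, 0), (3, 2))
print(hdr.rand, path)  # path passes through hdr.rand before ending at (3, 2)
```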
In further aspects of the invention, the output links and the input links of switches form a multi-dimensional network topology.
In still further aspects, the random address stored in the header is randomly generated to select as the random switch any switch in the network, including switches that are not on the route to the destination switch. Thus packets are first routed to a random switch within the network before being routed to their destination. A different random address is generated for each packet received by the network through an ingress port. Network congestion is reduced as packets are dispersed to random switches within the network before routing to destinations.
In still further aspects of the invention, when a packet's viable output links are congested, and the packet's waiting time in a switch exceeds a threshold value,
(1) the routing controller sets the phase indicator to indicate the first phase;
(2) the routing controller randomly generates a new random address of another random switch;
(3) the routing controller over-writes the random address in the header with the new random address; and
(4) the routing controller uses the new random address from the header to determine the selected output link, the selected output link being in a route toward the another random switch.
Thus the network avoids congested links by re-routing to the another random switch when a packet's waiting time in a switch exceeds a threshold value.
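The congestion steps above can be sketched as follows. The dictionary header layout, the function name `reroute_if_stalled`, and its parameters are assumptions made for illustration. The sketch also assumes that the new random switch is drawn from all switches other than the current one.

```python
import random

FIRST, SECOND = 1, 2  # phase indicator values (assumed encoding)


def reroute_if_stalled(header, congested, waited, threshold,
                       switch_addrs, here, rng=random):
    """Re-randomize a stalled packet. Returns True when the header was changed.

    `header` is a dict with keys 'dest', 'rand' and 'phase' (hypothetical
    layout). `congested` is True when all viable output links are congested,
    and `waited` is the packet's waiting time in this switch.
    """
    if not congested or waited <= threshold:
        return False
    # Assumption: any switch other than the current one may be chosen.
    candidates = [a for a in switch_addrs if a != here]
    header["phase"] = FIRST                  # step (1): back to the first phase
    header["rand"] = rng.choice(candidates)  # steps (2)-(3): new random address
    return True                              # step (4): caller routes on header['rand']


hdr = {"dest": 5, "rand": 2, "phase": SECOND}
changed = reroute_if_stalled(hdr, congested=True, waited=10, threshold=4,
                             switch_addrs=range(8), here=3)
print(changed, hdr["phase"])  # True 1
```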
In other aspects, when the selected output link is connected to a faulty switch, the routing controller selects a different output link on a different route toward the random switch when the phase indicator indicates the first phase. Otherwise it selects a different output link on a different route toward the destination switch when the phase indicator indicates the second phase. Thus the routing controller adapts routing to bypass the faulty switch.
In still further aspects of the invention, when the selected output link is connected to a faulty switch, and the routing controller cannot locate a different output link on a different route toward either the random switch when the phase indicator indicates the first phase or the destination switch when the phase indicator indicates the second phase:
(1) the routing controller sets the phase indicator to indicate the first phase;
(2) the routing controller randomly generates a new random address of another random switch;
(3) the routing controller over-writes the random address in the header with the new random address; and
(4) the routing controller uses the new random address from the header to determine the selected output link, the selected output link being in a route toward the another random switch.
Thus the network is fault tolerant since the faulty switch is bypassed by re-routing to the another random switch when only routes through the faulty switch are available.
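The fault-handling behavior described above can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: switches are assumed to sit on a mesh addressed by coordinate tuples, a "route toward" a target is taken to mean a neighbor one hop closer to it, and the names `productive_neighbors`, `select_link`, and `max_tries` are hypothetical. The caller is assumed to have already handled the case where the packet has arrived at its current target.

```python
import random

FIRST, SECOND = 1, 2  # phase indicator values (assumed encoding)


def productive_neighbors(here, target):
    """Neighbors on the assumed mesh that are one hop closer to `target`."""
    out = []
    for i, (h, t) in enumerate(zip(here, target)):
        if h != t:
            step = 1 if t > h else -1
            out.append(here[:i] + (h + step,) + here[i + 1:])
    return out


def select_link(here, hdr, faulty, all_switches, rng=random, max_tries=16):
    """Pick the neighbor to forward to, bypassing faulty switches.

    `hdr` is a dict with keys 'dest', 'rand' and 'phase' (hypothetical
    layout). Returns a neighbor address, or None if no route was found
    within `max_tries` attempts.
    """
    for _ in range(max_tries):
        target = hdr["rand"] if hdr["phase"] == FIRST else hdr["dest"]
        healthy = [n for n in productive_neighbors(here, target)
                   if n not in faulty]
        if healthy:
            return healthy[0]        # a different, fault-free route exists
        # Only routes through faulty switches remain: return to the first
        # phase and head for a newly generated random switch.
        hdr["phase"] = FIRST
        hdr["rand"] = rng.choice(
            [s for s in all_switches if s != here and s not in faulty])
    return None


grid = [(x, y) for x in range(4) for y in range(4)]
hdr = {"dest": (2, 2), "rand": (0, 0), "phase": SECOND}
print(select_link((0, 0), hdr, faulty={(1, 0)}, all_switches=grid))  # (0, 1)
```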