The present invention generally relates to a network switch adaptive routing mechanism and an associated method of data routing. In particular, the present invention is concerned with a mechanism enabling adaptive routing selection within a crossbar switch and the associated method of data routing.
Network switches, also known as network bridges, process and route data traveling across a network. There are two types of route available for connecting across a switch, namely fixed routes and adaptive routes. A data packet arriving with a fixed route must wait for the switch output described by the fixed route to become free before it can travel across the switch. A data packet arriving at a switch with an adaptive route has a selection of possible outputs to which it can connect. Adaptive routes take advantage of the availability of multiple routes through the network and are a recognized method for improving the performance of a switch network when the network transports random traffic patterns. Adaptive routing is an important factor in producing a congestion-free network. If a packet has a choice of routes then it is more likely to find one of a number of outputs free than it would be if it were only able to select a particular output.
If a switch network has more than one possible route from a source port to a destination port then adaptive routes can be used wherever it is reasonable to route in a different way. Some networks are very rich in connectivity, giving many alternative routes from a source to a destination.
There are many different types of network topology suitable for adaptive routing. An example of such a network topology is shown in FIG. 1 and is called a Fat Tree or Clos network. In FIG. 1, the numerals 0 to 15 represent endpoints in a network 20 and the numerals 100 to 107 represent switches within the network 20. A data packet moving from endpoint 1 to endpoint 11 must pass through switches 100 and 102 but can go via any of the switches 104 to 107.
Within network 20, adaptive routing can be performed by switch 100 for the data packet moving from endpoint 1 to endpoint 11. In this case the adaptive route would have an output selection covering all the links connecting to switches 104, 105, 106 and 107. If the outputs of switch 100 to switches 104, 105 and 107 are all busy sending other data packets from endpoints 0, 2 and 3 then the adaptive route would choose to send the data packet from endpoint 1 to switch 106, since it is the only suitable free connection able to accept the data.
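By way of illustration only, the output selection described above can be sketched in Python. The function name select_adaptive_output and the representation of outputs by the reference numeral of the switch they lead to are assumptions made for this sketch and do not appear in the original description.

```python
def select_adaptive_output(candidates, busy):
    """Return the first candidate output that is not busy, or None if all are busy.

    `candidates` is the set of outputs permitted by the adaptive route;
    `busy` is the set of outputs currently sending another packet.
    Both names are hypothetical and used only for this sketch.
    """
    for output in candidates:
        if output not in busy:
            return output
    return None


# Outputs of switch 100, labelled by the switch each one leads to.
candidates = [104, 105, 106, 107]
# Links to 104, 105 and 107 are occupied by packets from endpoints 0, 2 and 3.
busy = {104, 105, 107}

print(select_adaptive_output(candidates, busy))  # 106
```

A packet with a fixed route corresponds to a candidate list with a single entry, in which case the function returns None until that one output becomes free.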
Many network switches are based on some form of crossbar connection structure. The crossbar connection structure performs the function of connecting any input into the crossbar to any output from the crossbar. An example of a crossbar switch 30 is shown in FIG. 2. In this diagram of a crossbar switch, the switch inputs 32a-32h form the rows and the switch outputs 34a-34h are connected along the columns. The switch connection points 36aa-36hh are controlled data switches that connect the inputs 32a-32h to the outputs 34a-34h. In crossbar switch 30 there are a total of sixty-four switch connection points 36. Concurrent communications can take place within crossbar switches. Any input 32 can make a request to connect to any output 34. In the case of a multicast operation any input 32 can make a request to connect to a collection of outputs 34a-34h. In this example, input 32f is connected to output 34g by switch point 36fg, which is shown as a solid dot to clarify that it is connected and not available for connection to any other input or output at that time. The remaining unconnected switch points 36 are shown as a cross in a circle. At the end of the transfer of a packet of data across the switch 30, the output 34g will become free to connect to another switch input 32.
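As a rough illustration of the connection behavior just described, the following Python sketch models an 8 by 8 crossbar in which each output can be driven by at most one input at a time. The class Crossbar, its method names, and the all-or-nothing treatment of a multicast request are assumptions made for this sketch rather than features taken from FIG. 2.

```python
class Crossbar:
    """Toy model: each output is driven by at most one input at a time."""

    def __init__(self, n_inputs, n_outputs):
        self.n_inputs = n_inputs
        # owner[o] is the input currently connected to output o, or None if free.
        self.owner = [None] * n_outputs

    def connect(self, inp, outputs):
        """Connect `inp` to every output in `outputs`.

        One output models a unicast connection, several a multicast.
        Assumption for this sketch: the request succeeds only if all
        requested outputs are free.
        """
        if any(self.owner[o] is not None for o in outputs):
            return False
        for o in outputs:
            self.owner[o] = inp
        return True

    def release(self, out):
        """Free an output at the end of a packet transfer."""
        self.owner[out] = None


xbar = Crossbar(8, 8)
F, G, A = 5, 6, 0                  # indices standing in for 32f, 34g and 32a
print(xbar.connect(F, [G]))        # True: 32f now holds 34g
print(xbar.connect(A, [G]))        # False: 34g is busy
xbar.release(G)                    # packet transfer finished
print(xbar.connect(A, [G]))        # True: 34g is free again
```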
When a number of inputs, say 32a-d, are all requesting the same output, say 34b, an arbitration operation is required to select which of inputs 32a-d will next connect to output 34b. For this purpose, a crossbar switch can be provided with an arbiter for each output of the crossbar. To ensure good network behavior, it is very important that the connection options are correctly prioritized and that requests of the same priority are dealt with fairly in such an arbitration operation.
If multiple inputs are requesting the same output, with the inputs having different priority levels, ensuring connection decisions are well prioritized can take a number of data transfer cycles. In such a situation, the requested switch output must establish which input request has the highest priority and then, if more than one input has that same highest priority, an unbiased selection has to be made to ensure fairness in connection selection. There are a number of different types of arbiter that can be used to perform a fair selection. One type is a Least Recently Used (LRU) arbiter. This type of arbiter will always select, from all the valid requests being made, the one that was least recently selected. This type of arbiter gives good results but can be difficult to implement over a large structure such as a crossbar because many bits of state are required to hold the complete history of previous connections, especially with many inputs, and this state cannot be physically distributed over the whole crossbar as all inputs need to reference all of the state. The amount of state required to fully implement a fast LRU arbiter does not scale linearly with the number of inputs. Another type of arbiter which is considered to perform a fair selection is called a Round Robin arbiter. This type of arbiter uses a moving priority selection where the last successful connection is given the lowest priority for the next arbitration. It needs far less state and can be implemented in a distributed arbiter. In FIG. 3 a schematic diagram illustrates an example of prioritization in a round robin arbiter. In this example, there are six requesting inputs A to F. The arrows indicate the direction of roaming priority followed by the round robin arbiter.
Assuming inputs A, C and D are asserting a request for connection and that input E was the last requestor to make a successful connection then, in this case, input A will have the highest priority and input D the lowest priority. This would mean that input A would be the next to connect to the desired output. After this connection between input A and the requested output was made and the transfer of data between the two completed, input A would become the lowest priority requestor if it continued to assert its request and input C would be the next input to be connected to the desired output. If these three requestors, input A, C and D, continued to assert their requests then the arbiter would choose them in the order A, C, D, A, C, D, A . . . and so on.
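The two fair arbiters discussed above can be contrasted with a short Python sketch. The class names, the use of single letters for inputs, and the assumption that roaming priority advances in the order A, B, C, D, E, F and back to A are choices made for this illustration only. Note the difference in state: the LRU model keeps a full ordering of all inputs, whereas the round robin model keeps only the position of the last winner.

```python
class LRUArbiter:
    """Grants the requesting input that was granted least recently."""

    def __init__(self, inputs):
        # order[0] is the least recently granted input, order[-1] the most recent.
        self.order = list(inputs)

    def grant(self, requests):
        for name in self.order:
            if name in requests:
                # Move the winner to the most-recently-granted end.
                self.order.remove(name)
                self.order.append(name)
                return name
        return None


class RoundRobinArbiter:
    """Grants the first requester found after the last winner."""

    def __init__(self, inputs, last_granted):
        self.inputs = list(inputs)
        self.last = self.inputs.index(last_granted)

    def grant(self, requests):
        n = len(self.inputs)
        # The search starts just after the last winner, so the last winner
        # is examined last and therefore has the lowest priority.
        for step in range(1, n + 1):
            idx = (self.last + step) % n
            if self.inputs[idx] in requests:
                self.last = idx
                return self.inputs[idx]
        return None


# Inputs A, C and D keep requesting; input E made the last successful connection.
rr = RoundRobinArbiter("ABCDEF", last_granted="E")
sequence = [rr.grant({"A", "C", "D"}) for _ in range(7)]
print(sequence)  # ['A', 'C', 'D', 'A', 'C', 'D', 'A']
```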
As can be seen from this illustration, round robin arbiters have a roaming priority selection mechanism. Roaming priority is, however, different from a priority assigned to an input connection request. Many network protocols include a priority mechanism to guarantee progress for important packets. The IEEE 802.1Q Ethernet standard includes an absolute priority value that appears in the Ethernet header. Typically, high priority packets may be used for time-critical system services. Sometimes this assigned priority may change dynamically; an example of this is where the assigned priority is associated with the age of the packet within the network, such that older packets are more important than younger packets. If older packets are prioritized over younger packets then the maximum age of all packets within the whole network is significantly reduced. This can deliver better and more predictable application performance. Some network protocols include both absolute priority and age-related priority in the assigned priority to give the best network performance, with absolute priority given preference over the priority associated with age. Absolute packet priority along with age priority would supersede the priority mechanism of a round robin arbiter. The priority mechanism within the round robin arbiter will only provide a fair result among the highest priority, oldest packets making simultaneous requests.
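The layering of priorities described above can be sketched as follows. This is an illustrative model only: the function arbitrate, the encoding of each request as an (absolute priority, age) pair, and the convention that a larger number means a higher priority or an older packet are all assumptions made for the sketch.

```python
def arbitrate(requests, order, last_granted):
    """Pick a winner among `requests`.

    `requests` maps an input name to a tuple (absolute_priority, age).
    Assumed convention: larger values win. Absolute priority is compared
    first and age second; round robin only breaks ties among the inputs
    that share the best (absolute_priority, age) pair.
    """
    if not requests:
        return None
    # Tuple comparison gives absolute priority preference over age.
    best = max(requests.values())
    top = {name for name, key in requests.items() if key == best}
    n = len(order)
    start = order.index(last_granted)
    # Roaming priority: search from just after the last winner.
    for step in range(1, n + 1):
        candidate = order[(start + step) % n]
        if candidate in top:
            return candidate
    return None


order = list("ABCDEF")

# A has the best roaming priority but a lower absolute priority than C and D.
print(arbitrate({"A": (0, 50), "C": (1, 3), "D": (1, 3)}, order, "E"))  # C

# Equal absolute priority: the older packet wins regardless of roaming priority.
print(arbitrate({"A": (1, 2), "C": (1, 9)}, order, "E"))  # C
```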
Inclusion of an adaptive output selection for the inputs adds further complexity to the process. Adaptive output selection is typically implemented with a mechanism that attempts to connect to one of the possible outputs and then, if that fails, will back out of the connection attempt and then try another one of the possible outputs. It then continues to cycle through all the possible outputs until it finds one prepared to connect. If many inputs are all performing this operation at the same time and there are many outputs to choose from then this mechanism can break down and produce very high latencies for some connections. For example, if we have a crossbar with 32 inputs and 32 outputs, the first input to make a request to the outputs will find a free output to connect with because none of them is currently connected. The second is very likely to find a free output because only 1 of the 32 outputs is currently connected, namely to the first input. However, when most of the outputs are already connected with an input then the chances of randomly selecting a free output are dramatically reduced. In our example there would only be a 1 in 32 chance when 31 connections have already been made and the final input is trying to find the one remaining free output. That input is then left with another problem: should it withdraw its request to try another output, or should it stay with the current selection in the hope that the output is about to finish transmitting its current packet and will then select it? If it withdraws then again it only has a 1 in 32 chance of guessing the one free output, and it has no way of determining whether it is about to connect with its current selection or whether it has a substantial wait before it will be able to connect with its current selection.
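The arithmetic in this example can be checked with a few lines of Python. The sketch assumes that each attempt picks one of the outputs uniformly at random and that the set of connected outputs does not change between attempts; both are simplifications introduced here, and the function names are hypothetical.

```python
from fractions import Fraction


def free_output_probability(n_outputs, n_connected):
    """Chance that a uniformly random pick lands on a free output."""
    return Fraction(n_outputs - n_connected, n_outputs)


def expected_attempts(n_outputs, n_connected):
    """Mean number of independent random picks needed to hit a free output.

    Under the stated assumptions the number of attempts is geometrically
    distributed with mean 1/p. Requires at least one free output.
    """
    return 1 / free_output_probability(n_outputs, n_connected)


print(free_output_probability(32, 1))   # 31/32 for the second input
print(free_output_probability(32, 31))  # 1/32 for the final input
print(expected_attempts(32, 31))        # 32
```

Under these assumptions the final input would need, on average, 32 connection attempts to find the single free output, which illustrates how retry-based selection degrades as the crossbar fills.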
Low uncongested latency is important for network performance but a low maximum latency in a busy or congested network is far more important. One of the many ways to reduce maximum latency is to ensure that no opportunity to make a connection is wasted.
While an input is searching for a suitable connection to an appropriate output, it is not transmitting data. This causes a reduction in bandwidth that cannot be recovered later. Likewise, bandwidth is lost if an output completes the transmission of a data packet in one cycle and does not start transmitting a newly arbitrated packet from another input in the next cycle. A crossbar switch with many inputs and outputs and with wide, high-bandwidth data buses can be a large structure, making timing closure of the logic gate implementation on an ASIC difficult. This is usually addressed by pipelining the connection requests over one or more cycles.
It can therefore be seen that there is a need for an arbiter that can switch from one connection to another in a single cycle. It would be convenient if such an arbiter could also do this while still maintaining fairness and, if needed, honoring any priority requirements.