Data communication networks may include various computers, servers, nodes, routers, switches, bridges, hubs, proxies, and other network devices coupled together and configured to pass data to one another. These devices will be referred to herein as “network elements.” Data is communicated through the data communication network by passing protocol data units, such as data frames, packets, cells, or segments, between the network elements by utilizing one or more communication links. A particular protocol data unit may be handled by multiple network elements and cross multiple communication links as it travels between its source and its destination over the network.
The various network elements on the communication network communicate with each other using predefined sets of rules, referred to herein as protocols. Different protocols are used to govern different aspects of the communication, such as how signals should be formed for transmission between network elements, various aspects of what the protocol data units should look like, how packets should be handled or routed through the network by the network elements, and how routing information should be exchanged between the network elements. Networks that use different protocols operate differently and are considered to be different types of communication networks. A given communication network may use multiple protocols at different network layers to enable network elements to communicate with each other across the network.
In packet-forwarding communications networks, a node can learn about the topology of the network and can decide, on the basis of the knowledge it acquires of the topology, how it will route traffic to each of the other network nodes. Frequently, the main basis for selecting a path is path cost, which can be specified in terms of a number of hops between nodes, or by some other metric such as bandwidth of links connecting nodes, or both. Open Shortest Path First (OSPF) and Intermediate System-to-Intermediate System (IS-IS) are widely used link-state protocols which establish shortest paths based on each node's advertisements of path cost.
Various shortest path algorithms can be used to determine if a given node is on the shortest path between a given pair of bridges. An all-pairs shortest path algorithm such as Floyd's algorithm or Dijkstra's single-source shortest path algorithm can be implemented by the nodes to compute the shortest path between pairs of nodes. It should be understood that any other suitable shortest path algorithm could also be utilized. The link metric used by the shortest path algorithm can be static or dynamically modified to take into account traffic engineering information. For example, the link metric can include a measure of cost such as capacity, speed, usage and availability.
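By way of illustration only, a single-source computation of the kind mentioned above can be sketched as follows. This is a minimal sketch of Dijkstra's algorithm over a hypothetical four-node topology; the names `dijkstra` and `GRAPH`, the node labels, and the link metrics are assumptions made for the example and are not taken from any particular protocol.

```python
import heapq

def dijkstra(graph, source):
    """Return the least total link metric from source to every reachable node.

    graph maps each node to a dict of {neighbor: link_metric}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for v, metric in graph[u].items():
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical topology with symmetric, static link metrics.
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

print(dijkstra(GRAPH, "A"))
```

A dynamically modified metric would simply change the values stored in `GRAPH` before the computation is run again.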
There are situations where multiple equal-cost paths exist through a network between a given pair of nodes. IS-IS and OSPF use a simplistic uni-directional tie-breaking process to select between these multiple equal-cost paths, or just spread traffic across the equal-cost paths. The spreading algorithms are not specified and can vary from router to router. Alternatively, each router may make a local selection of a single path, but without consideration of consistency with the selection made by other routers. Consequently, in either case, the reverse direction of a flow is not guaranteed to use the path used by the forward direction. This is sufficient for unicast forwarding where every device will have a full forwarding table for all destinations and promiscuously accepts packets to all destinations on all interfaces. However, this does not work well in other situations, such as multicast routing, where consistent decisions must be made and where bi-directional symmetry is required to enable the actual forwarding paths in a stable network to exhibit connection oriented properties.
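The loss of symmetry described above can be illustrated with a small sketch. The local rule used here (each node forwards to its lowest-numbered neighbor that lies on a shortest path) is a hypothetical rule chosen for the example, not the behavior of any specific IS-IS or OSPF implementation; the six-node topology and the function names are likewise assumptions.

```python
from collections import deque

def hops_to(graph, dst):
    """Breadth-first hop count from every node to dst (all links cost 1)."""
    dist = {dst: 0}
    queue = deque([dst])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def hop_by_hop_path(graph, src, dst):
    """Follow each node's purely local choice of next hop toward dst."""
    dist = hops_to(graph, dst)
    path = [src]
    while path[-1] != dst:
        here = path[-1]
        # Neighbors that are one hop closer to dst lie on a shortest path.
        candidates = [n for n in graph[here] if dist[n] == dist[here] - 1]
        path.append(min(candidates))  # hypothetical local rule: lowest ID
    return path

# Hypothetical topology: two equal-cost paths 0-1-4-5 and 0-2-3-5.
GRAPH = {0: [1, 2], 1: [0, 4], 2: [0, 3], 3: [2, 5], 4: [1, 5], 5: [3, 4]}

forward = hop_by_hop_path(GRAPH, 0, 5)
backward = hop_by_hop_path(GRAPH, 5, 0)
print(forward, backward)
```

In this sketch the forward flow uses 0-1-4-5 while the reverse flow uses 5-3-2-0, so the two directions do not share a path even though every node applied the same local rule.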
Multicast routing protocols such as Multicast Open Shortest Path First (MOSPF) depend on each router in a network constructing the same shortest path tree. For this reason, MOSPF implements a tie-breaking scheme based on link type, LAN vs. point-to-point, and router identifier, to ensure that identical trees are produced. However, basing the tie-breaking decision on the parent with the largest identifier implies that the paths used by the reverse flows may not be the same as the paths used by the forward flows.
There is a requirement in some emerging protocols, such as Provider Link State Bridging (PLSB) which is being defined by the Institute of Electrical and Electronics Engineers (IEEE) as proposed standard 802.1aq, to preserve bi-directional congruency of forwarding across the network for both unicast and multicast traffic, such that traffic will use a common path in both forward and reverse flow directions. Accordingly, it is important that nodes consistently arrive at the same decision when tie-breaking between equal-cost paths and that the tie breaking process be independent of which node is the root for a given computation. Furthermore, it is desirable that a node can perform the tie-breaking with a minimum amount of processing effort.
Generally, any tie-breaking algorithm should be complete, which means that it must always be able to choose between two paths. Additionally, the tie-breaking algorithm should be commutative, associative, symmetric, and local. These properties are set forth below in Table I:

TABLE I

Requirement | Description
Complete | The tie-breaking algorithm must always be able to choose between two paths
Commutative | tiebreak(a, b) = tiebreak(b, a)
Associative | tiebreak(a, tiebreak(b, c)) = tiebreak(tiebreak(a, b), c)
Symmetric | tiebreak(reverse(a), reverse(b)) = reverse(tiebreak(a, b))
Local | tiebreak(concat(a, c), concat(b, c)) = concat(tiebreak(a, b), c)
The essence of a tie-breaking algorithm is to always ‘work’. No matter what set of paths the algorithm is presented with, the algorithm should always be able to choose one and only one path. First and foremost, the tie-breaking algorithm should therefore be complete. For consistent tie-breaking, the algorithm must produce the same results regardless of the order in which equal-cost paths are discovered and tie-breaking is performed. That is, the tie-breaking algorithm should be commutative and associative. The requirement that tie-breaking between three paths must produce the same results regardless of the order in which pairs of paths are considered is not as obvious, and yet it is absolutely necessary for consistent results, because equal-cost paths are discovered in different orders depending on the direction of the computation through the network. The tie-breaking algorithm must also be symmetric, i.e., the tie-breaking algorithm must produce the same result regardless of the direction of the path: the shortest path between two nodes A and B must be the reverse of the shortest path between B and A.
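The requirements of Table I can be checked mechanically for a candidate function. The following is a minimal sketch under stated assumptions: paths are tuples of integer node identifiers, and the candidate `tiebreak` prefers the path with fewer hops and then the path whose sorted list of node identifiers is lexicographically lowest. This rule, and the helper names `path_key`, `reverse`, and `concat`, are illustrative choices for the example and are not asserted to be the method of any cited publication.

```python
def path_key(path):
    # Hop count first, then the node identifiers in sorted order.
    # The key ignores direction, so a path and its reverse rank equally.
    return (len(path), sorted(path))

def tiebreak(a, b):
    # Assumes a and b visit different node sets, so their keys differ.
    return a if path_key(a) <= path_key(b) else b

def reverse(path):
    return path[::-1]

def concat(a, c):
    # Join two paths that meet at a shared node.
    assert a[-1] == c[0]
    return a + c[1:]

# Three hypothetical equal-cost paths from node 1 to node 5,
# and a common extension from node 5 to node 6.
a, b, c = (1, 2, 5), (1, 3, 5), (1, 4, 5)
ext = (5, 6)

commutative = tiebreak(a, b) == tiebreak(b, a)
associative = tiebreak(a, tiebreak(b, c)) == tiebreak(tiebreak(a, b), c)
symmetric = tiebreak(reverse(a), reverse(b)) == reverse(tiebreak(a, b))
local = tiebreak(concat(a, ext), concat(b, ext)) == concat(tiebreak(a, b), ext)
print(commutative, associative, symmetric, local)
```

The checks exercise only one small set of paths, so they illustrate the properties rather than prove them for the candidate function.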
Finally, locality is a very important property of shortest paths that is exploited by routing systems. The locality property simply says that: a sub-path of a shortest path is also a shortest path. This seemingly trivial property of shortest paths has an important application in packet networks that use destination-based forwarding. In these networks, the forwarding decision at intermediate nodes along a path is based solely on the destination address of the packet, not its source address. Consequently, in order to generate its forwarding information, a node needs only compute the shortest path from itself to all the other nodes and the amount of forwarding information produced grows linearly, not quadratically, with the number of nodes in the network. In order to enable destination-based forwarding, the tie-breaking algorithm must therefore preserve the locality property of shortest paths: a sub-path of the shortest path selected by the tie-breaking algorithm must be the shortest path selected by the tie-breaking algorithm.
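Destination-based forwarding and its reliance on locality can be sketched as follows. In this minimal example each node computes only its own next hop per destination, so each table holds one entry per other node; the topology, metrics, and the names `forwarding_table` and `follow` are hypothetical, and the metrics were chosen so that no equal-cost ties arise.

```python
import heapq

def forwarding_table(graph, node):
    """Map every other destination to the next hop on a shortest path from node."""
    done = set()
    table = {}
    heap = [(0, node, node)]  # (cost so far, node reached, first hop used)
    while heap:
        cost, u, first = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u != node:
            table[u] = first
        for v, metric in graph[u].items():
            if v not in done:
                # The first hop is the neighbor itself when leaving `node`.
                heapq.heappush(heap, (cost + metric, v, v if u == node else first))
    return table

def follow(tables, src, dst):
    """Forward hop by hop, consulting only the destination at each node."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path

# Hypothetical topology with no equal-cost ties.
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
tables = {n: forwarding_table(GRAPH, n) for n in GRAPH}
print(follow(tables, "A", "D"), follow(tables, "B", "D"))
```

Here the path from B to D is the tail of the path from A to D, which is what allows node B to forward on the destination address alone.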
Considerations of computational efficiency put another seemingly different requirement on the tie-breaking algorithm: the algorithm should be able to make a tie-breaking decision as soon as equal-cost paths are discovered. For example, if an intermediate node I is connected by two equal-cost paths, p and q, to node A and by another pair of equal-cost paths, r and s, to node B, then there are four equal-cost paths between nodes A and B, all going through node I: p+r, p+s, q+r, and q+s.
The equal-cost sub-paths between A and I (p and q) will be discovered first when computing a path between nodes A and B. To avoid having to carry forward knowledge of these two paths, the tie-breaking algorithm should be able to choose between them as soon as the existence of the second equal-cost shortest sub-path is discovered. The tie-breaking decisions made at intermediate nodes will ultimately affect the outcome of the computation. By eliminating one of the two sub-paths, p and q, between nodes A and I, the algorithm removes two of the four shortest paths between nodes A and B from further consideration. Similarly, in the reverse direction, the tie-breaking algorithm will choose between sub-paths r and s (between nodes B and I) before making a final determination on the path between A and B. These local decisions must be consistent with one another and, in particular, the choice between two equal-cost paths should remain the same if the paths are extended to a subsequent node in the network.
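The p, q, r, s example can be worked through in a short sketch. The node identifiers (A=1, I=5, B=9, with intermediate nodes 2, 3, 6, and 7) and the ranking rule (fewer hops, then lowest sorted node identifiers) are assumptions made for illustration. The sketch compares a choice made over all four end-to-end paths with the result of choosing on each side of node I independently.

```python
from itertools import product

def path_key(path):
    # Hypothetical ranking: hop count, then sorted node identifiers.
    return (len(path), sorted(path))

def tiebreak(paths):
    return min(paths, key=path_key)

def concat(a, c):
    # Join two paths that meet at a shared node.
    return a + c[1:]

# Sub-paths between A (1) and I (5), and between I (5) and B (9).
p, q = (1, 2, 5), (1, 3, 5)
r, s = (5, 7, 9), (5, 6, 9)

# The four end-to-end paths p+r, p+s, q+r, q+s.
all_four = [concat(x, y) for x, y in product((p, q), (r, s))]

# Choosing once over all four paths ...
global_choice = tiebreak(all_four)
# ... versus choosing early on each side of node I and joining the winners.
local_choice = concat(tiebreak([p, q]), tiebreak([r, s]))
print(global_choice, local_choice)
```

With this ranking both approaches select p+s, so the early decisions at node I do not change the outcome.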
It turns out that the symmetry and locality conditions are both necessary and sufficient to guarantee that the tie-breaking algorithm will make consistent local decisions, a fact that can be exploited to produce very efficient implementations of the single-source shortest path algorithm in the presence of multiple equal-cost shortest paths.
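A single-source computation that resolves ties as soon as they are discovered might look like the following minimal sketch. It assumes positive link metrics and a hypothetical ranking of equal-cost paths (fewer hops, then lowest sorted node identifiers); the function names and the six-node ring topology are illustrative only. Only one path per node is carried forward at any time.

```python
import heapq

def path_key(path):
    # Hypothetical, direction-independent ranking of equal-cost paths.
    return (len(path), sorted(path))

def shortest_paths(graph, source):
    """Dijkstra's algorithm keeping a single tie-broken path per node."""
    best = {source: (0, (source,))}  # node -> (cost, chosen path)
    heap = [(0, source)]
    done = set()
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)  # the path chosen for u is final from here on
        path_u = best[u][1]
        for v, metric in graph[u].items():
            if v in done:
                continue
            cand_cost, cand_path = cost + metric, path_u + (v,)
            old = best.get(v)
            if (old is None or cand_cost < old[0]
                    or (cand_cost == old[0]
                        and path_key(cand_path) < path_key(old[1]))):
                # Strictly cheaper, or equal cost and preferred by the tie-break.
                best[v] = (cand_cost, cand_path)
                heapq.heappush(heap, (cand_cost, v))
    return {v: path for v, (_, path) in best.items()}

# Hypothetical six-node ring with unit metrics; opposite nodes are
# joined by two equal-cost paths.
GRAPH = {0: {1: 1, 2: 1}, 1: {0: 1, 4: 1}, 2: {0: 1, 3: 1},
         3: {2: 1, 5: 1}, 4: {1: 1, 5: 1}, 5: {3: 1, 4: 1}}

paths = {n: shortest_paths(GRAPH, n) for n in GRAPH}
print(paths[0][5], paths[5][0])
```

In this sketch the path computed from node 0 to node 5 is the exact reverse of the path computed from node 5 to node 0, and the same holds for every other pair in the example topology.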
The list of requirements set out in Table I is not intended to be exhaustive, and there are other properties of shortest paths that could have been included in Table I. For example, if a link which is not part of a shortest path is removed from the graph, the shortest path selection should not be affected. Likewise, the tie-breaking algorithm's selection between multiple equal-cost paths should not be affected if a link which is not part of the selected path is removed from the graph representing the network, even if this link is part of some of the equal-cost paths that were rejected by the algorithm.
Many networking technologies are able to exploit a plurality of paths such that they are not confined to a single shortest path between any two points in the network. This can be in the form of connectionless networks, whereby the choice of next hop into a plurality of paths can be arbitrary at every hop and has no symmetry requirement, or can be strictly connection oriented, where the assignment to an end-to-end path is confined to the ingress point to the network. Ethernet, and in particular the emerging 802.1aq standard, is an example where there is a requirement for both symmetry and connection oriented behavior, and where the dataplane can support a plurality of paths between any two points in the network. Ethernet achieves this by being able to logically partition the filtering database by VLAN such that a unique path may exist per VLAN. The challenge is to effectively exploit the available connectivity by instantiating connectivity variations in each VLAN and maximizing the diversity of the path set such that a minimum number of path variations fully exploits the network. The ratio of path variations required to exploit the network to the number of possible unique paths that actually exist is called the dilation ratio. The desirable goal is to minimize this ratio, as doing so minimizes the amount of state and computation associated with fully exploiting the network.
Many techniques have been tried to increase the path diversity where multiple equal-cost paths exist between a pair of nodes while explicitly seeking to maintain the properties outlined in Table I above. U.S. Patent Application Publication No. 2009/0168768 provides one technique, and extensions to this have been attempted as well. For example, algorithmic manipulation of node IDs has been found to work, but does not increase the amount of path diversity significantly. For example, link utilization for an 8×4 fully meshed node array ranged from 63% to 67% using node ID manipulation. Additionally, where there are fewer than four paths, attempting to rank four unique paths on a set of node IDs breaks down such that the second highest path ends up being equal to the lowest or the second lowest, instead of exploring overlapping permutations. Further, the size of the node identifier and whether the node ID space is sparse or dense has little effect.
Another technique that has been tried is to distribute paths on the basis of maximizing load diversity. This technique has been found not to work, because it does not produce an acyclic planar graph, by which is meant that it requires more than one path to a given node in a single shortest path tree. Additionally, this technique requires advance knowledge of future computational results as the network is traversed further from the computing node. In essence, there is nothing to join the gap when working from either end into the middle of the network.
Yet another technique that has been tried is to select specific well known rankings (lowest, highest, next lowest, next highest . . . ). This works for two paths (high/low) but breaks down when it is extended to selection of larger numbers of paths. This technique fails because the intermediate nodes cannot anticipate how the ranked set of paths they generate fares when combined with other nodes' ranked sets at the next hop, and so the locality property is lost. As a consequence, the fragments of the shortest path ranked by the parent will be discontiguous with the path expected by the child. Further, there is no guarantee that path rankings other than low and high at the children nodes will produce an acyclic tree. Additionally, selecting the second highest, second lowest, etc., produces dependencies, since a failure of the highest path or lowest path will affect all paths ranked off that path. Finally, even postulating that such an algorithm could be made to work, it would lose the property of being able to resolve portions of the shortest path as they are identified, because all state would need to be carried forward as the Dijkstra computation progressed, significantly impacting the performance of the algorithm. As a result, the low and high rankings are the only reliable rankings that may be selected from a set of ranked paths. Thus, simply selecting additional rankings cannot be used to increase the path diversity when selecting more than two paths.
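The breakdown of intermediate rankings can be illustrated with a small sketch. It assumes a hypothetical ranking of equal-cost paths by sorted node identifiers, three equal-cost sub-paths on each side of an intermediate node 5, and illustrative node identifiers. The sketch compares the path at a given rank among all end-to-end paths with the path obtained by joining the sub-paths of that same rank on each side.

```python
from itertools import product

def path_key(path):
    # Hypothetical ranking: hop count, then sorted node identifiers.
    return (len(path), sorted(path))

def ranked(paths):
    return sorted(paths, key=path_key)

def concat(a, c):
    # Join two paths that meet at a shared node.
    return a + c[1:]

# Three equal-cost sub-paths from node 1 to node 5, and from node 5 to node 9.
a_to_i = [(1, 2, 5), (1, 3, 5), (1, 4, 5)]
i_to_b = [(5, 6, 9), (5, 7, 9), (5, 8, 9)]

end_to_end = ranked([concat(x, y) for x, y in product(a_to_i, i_to_b)])
left, right = ranked(a_to_i), ranked(i_to_b)

def composed(rank):
    """Join the sub-paths that hold the same rank on each side of node 5."""
    return concat(left[rank], right[rank])

lowest_ok = composed(0) == end_to_end[0]
highest_ok = composed(-1) == end_to_end[-1]
second_ok = composed(1) == end_to_end[1]
print(lowest_ok, highest_ok, second_ok)
```

In this example the lowest and highest rankings compose correctly from locally ranked sub-paths, while the second-lowest does not: the second-lowest end-to-end path is 1-2-5-7-9, whereas joining the second-lowest sub-paths gives 1-3-5-7-9.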
U.S. Patent Application Publication No. 2009/0168768 discloses one tie-breaking process, the content of which is hereby incorporated herein by reference. Although the process described in this application works well, it would still be advantageous to provide another way to get good path distribution in the presence of multiple paths, to enable traffic to be spread across the available paths. Additionally, the distribution should preferably be a feature of normal operation, should not require complicated network design, and should minimize any explicit configuration by a network administrator.