The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In computer networks such as the Internet, packets of data are sent from a source to a destination via a network of links (communication paths such as telephone or optical lines) and nodes (usually routers directing the packet along one or more of a plurality of links connected to it) according to one of various routing protocols.
One such protocol is the link state protocol, which relies on a routing algorithm resident at each node. Each node on the network advertises, throughout the network, links to neighboring nodes and provides a cost associated with each link. The cost can be based on any appropriate metric, such as link bandwidth or delay, and is typically expressed as an integer value. A link may have an asymmetric cost, that is, the cost in the direction AB along a link may be different from the cost in the direction BA. Based on the advertised information in the form of a link state packet (LSP), each node constructs a link state database (LSDB), which is a map of the entire network topology, and from that generally constructs a single optimum route to each available node based on an appropriate algorithm such as, for example, a shortest path first (SPF) algorithm. As a result a “spanning tree” is constructed, rooted at the node and showing an optimum path including intermediate nodes to each available destination node. Because each node has a common LSDB (other than when advertised changes are propagating around the network) any node is able to compute the spanning tree rooted at any other node.
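By way of illustration only, the SPF computation described above can be sketched in Python using Dijkstra's algorithm. The three-node LSDB, its costs, and the names `LSDB`, `spf` and `next_hop` are hypothetical and chosen purely for the example. Costs are stored per direction, so that an asymmetric link can be represented.

```python
import heapq

# Hypothetical LSDB: per-direction link costs, so cost(A->B) may differ from cost(B->A).
LSDB = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 2, "C": 1},
    "C": {"A": 4, "B": 1},
}

def spf(lsdb, root):
    """Dijkstra's algorithm: return (cost, parent) maps for the tree rooted at root."""
    cost = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry; node already settled at a lower cost
        done.add(node)
        for nbr, link_cost in lsdb.get(node, {}).items():
            nd = d + link_cost
            if nbr not in cost or nd < cost[nbr]:
                cost[nbr] = nd
                parent[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return cost, parent

def next_hop(parent, root, dest):
    """Walk back from dest along the tree to find the first node after root."""
    if dest == root or dest not in parent:
        return None  # no forwarding needed, or dest unreachable
    node = dest
    while parent[node] != root:
        node = parent[node]
    return node

cost_a, parent_a = spf(LSDB, "A")
cost_c, parent_c = spf(LSDB, "C")
print(cost_a, next_hop(parent_a, "A", "C"))
print(cost_c, next_hop(parent_c, "C", "A"))
```

Because `spf` takes the root as a parameter, the same LSDB can be used to compute the tree rooted at any node, which mirrors the observation that a node holding the common LSDB can compute the spanning tree of any other node.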
As a result when a packet for a destination node arrives at a node (which we term here the “first node”), the first node identifies the optimum route to that destination and forwards the packet to the next node along that route. The next node repeats this step and so forth.
It will be noted, therefore, that each node decides, irrespective of the node from which it received a packet, the next node to which the packet should be forwarded. In some instances this can give rise to a “loop”. In particular this can occur when the databases (and corresponding forwarding information) are temporarily de-synchronized during a routing transition, that is, where because of a change in the network, a new LSP is propagated. As an example if node A sends a packet to node Z via node B, comprising the optimum route according to its SPF, a situation can arise where node B, according to its SPF determines that the best route to node Z is via node A and sends the packet back. This can continue indefinitely although usually the packet will have a maximum hop count after which it will be discarded. Such a loop can be a direct loop between two nodes or an indirect loop around a circuit of nodes.
In conventional systems, when a link fails this is identified by an adjacent node in a medium specific manner. This instigates a routing transition whereby the neighboring node advertises the link failure to the remainder of the network. This can be done by simply removing the link from the LSP or, in some circumstances, setting its cost to an integral value high enough to direct all traffic around the failed link. This value is often termed “infinity” and it will be seen that the approaches are effectively the same.
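The equivalence of the two ways of advertising a failure can be checked with a short Python sketch. The triangle topology P, Q, R, its costs, the helper `first_hops`, and the value chosen for `INFINITY` are all assumptions made for the example and are not taken from any particular protocol. The equivalence holds provided the "infinity" cost exceeds the cost of every alternative path.

```python
import heapq

INFINITY = 1_000_000  # hypothetical "high enough" integer cost; not a protocol constant

def first_hops(links, root):
    """Dijkstra over undirected links {(u, v): cost}; return {dest: first hop from root}."""
    adj = {}
    for (u, v), c in links.items():
        adj.setdefault(u, {})[v] = c
        adj.setdefault(v, {})[u] = c
    dist = {root: 0}
    hop = {}
    heap = [(0, root, None)]  # (cost so far, node, first hop used to reach it)
    done = set()
    while heap:
        d, node, first = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if first is not None:
            hop[node] = first
        for nbr, c in adj.get(node, {}).items():
            nd = d + c
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr, nbr if first is None else first))
    return hop

# Hypothetical triangle; the link P-Q is the one that fails.
healthy = {("P", "Q"): 1, ("Q", "R"): 1, ("P", "R"): 3}

removed = {k: v for k, v in healthy.items() if k != ("P", "Q")}  # link dropped from the LSP
raised = dict(healthy)
raised[("P", "Q")] = INFINITY                                    # link kept at cost "infinity"

print(first_hops(removed, "P"))
print(first_hops(raised, "P"))
```

In both cases node P reaches Q and R via R, so the forwarding decisions are identical whichever way the failure is advertised.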
However, the LSP advertising the failure takes a finite time to propagate through the network, and each node must then re-run its SPF and pass the newly generated routes down to its forwarding mechanism. As a result there will be inconsistencies between the LSDBs maintained at different nodes on the network. In some circumstances this can give rise to the loops discussed above, which may persist until the LSDBs are once more consistent, which can take several hundred milliseconds.
The underlying causes of looping can be better understood with reference to FIG. 1.
A simple network is shown, designated generally 10 and including nodes A, B, D, X, Y, reference numerals 12, 14, 16, 18, 20 respectively. The nodes are joined by links in a circuit running ABDYXA, a link 22 joining nodes A and B. All of the links have a cost 1 except for a link 24 joining nodes Y and D, which has a cost 5. When all of the links are operating, a packet arriving at node X and destined for node D will take the route XABD with a cost of 3, as opposed to the route XYD, which has a cost of 6. Similarly, a packet arriving at node Y destined for node D will take route YXABD with a cost of 4 rather than YD with a cost of 5. If the link 22 between nodes A and B fails then node A advertises the failure by sending out an LSP effectively setting the cost for link 22 to “infinity”. At some point this LSP will have reached X, allowing it to update its LSDB, but will not yet have arrived at node Y. As a result a packet now arriving at node X destined for node D will be forwarded towards Y as part of the route XYD at a cost 6, as opposed to the route XABD at a cost infinity. However, when that packet reaches node Y, as node Y still records the cost of the link 22 between nodes A and B as 1, according to its SPF the lowest cost route is still via XABD at a cost 4. Accordingly the packet is returned to node X, which again tries to send it to node Y, and so forth. It will be seen that a loop of this nature can be a direct loop between two nodes or an indirect loop around a circuit of nodes.
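The behavior described with reference to FIG. 1 can be reproduced with a minimal Python sketch. The topology and costs are those given above. The names `next_hop` and `forward`, the hop limit of 6, the numeric stand-in for “infinity”, and the treatment of every link cost as symmetric are assumptions made for the illustration. Each node forwards according to its own view of the network, and only node Y holds a stale view.

```python
import heapq

INFINITY = 1_000_000  # stand-in for the "infinity" cost; the value is an assumption

# FIG. 1 topology: circuit A-B-D-Y-X-A, all costs 1 except link Y-D, which is 5.
# Costs are assumed symmetric for this sketch.
FIG1 = {("A", "B"): 1, ("B", "D"): 1, ("D", "Y"): 5, ("Y", "X"): 1, ("X", "A"): 1}

def next_hop(links, src, dest):
    """First hop on the lowest-cost path from src to dest (Dijkstra, undirected links)."""
    adj = {}
    for (u, v), c in links.items():
        adj.setdefault(u, {})[v] = c
        adj.setdefault(v, {})[u] = c
    heap = [(0, src, src)]  # (cost so far, node, first hop used to reach it)
    done = set()
    while heap:
        d, node, first = heapq.heappop(heap)
        if node == dest:
            return first
        if node in done:
            continue
        done.add(node)
        for nbr, c in adj[node].items():
            if nbr not in done:
                heapq.heappush(heap, (d + c, nbr, nbr if node == src else first))
    return None

def forward(views, start, dest, hop_limit=6):
    """Forward hop by hop, each node using its own LSDB view; stop at the hop limit."""
    trace, node = [start], start
    for _ in range(hop_limit):
        if node == dest:
            return trace, True
        node = next_hop(views[node], node, dest)
        trace.append(node)
    return trace, node == dest

updated = dict(FIG1)
updated[("A", "B")] = INFINITY  # view after receiving the LSP for failed link 22
stale = dict(FIG1)              # view before the LSP arrives

# During the transition: X has processed the LSP, Y has not.
transition = {n: updated for n in "ABDX"}
transition["Y"] = stale
# After convergence: every node holds the updated view.
converged = {n: updated for n in "ABDXY"}

print(forward(transition, "X", "D"))  # packet bounces between X and Y until the hop limit
print(forward(converged, "X", "D"))   # packet is delivered along X, Y, D
```

During the transition the packet alternates between X and Y until the hop limit is exhausted. Once Y has also processed the LSP, the same packet is delivered along XYD.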
Loops of this nature are undesirable because they use up bandwidth on the network until the packets are dropped when the hop count reaches the appropriate threshold.
One proposed solution to advertising link failure is described in Paolo Narvaez, Kai-Yeung Siu and Hong-Yi Tzeng, “Fault-Tolerant Routing in the Internet without Flooding”, Proceedings of the 1999 IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, San Juan, Puerto Rico, April 1999. According to this solution, when a link fails, rather than flooding the network with LSPs, only those nodes on the shortest or all “restoration paths” around the failed link are notified, and each of those nodes updates its routing table only in relation to the set of destinations affected by the link failure. As a result packets are forced along a restoration path. However, this approach requires significant perturbation of the routing protocols at each node involved, and temporary loops may be formed.