The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In computer networks such as the Internet, packets of data are sent from a source to a destination via a network of elements including links (communication paths such as telephone or optical lines) and nodes (for example, routers directing the packet along one or more of a plurality of links connected to it) according to one of various routing protocols.
One class of routing protocol is the link state protocol. The link state protocol relies on a routing algorithm resident at each node. Each node on the network advertises, throughout the network, links to neighboring nodes and provides a cost associated with each link, which can be based on any appropriate metric such as link bandwidth or delay and is typically expressed as an integer value. A link may have an asymmetric cost, that is, the cost in the direction AB along a link may be different from the cost in a direction BA. Based on the advertised information in the form of a link state packet (LSP) each node constructs a link state database (LSDB), which is a map of the entire network topology, and from that constructs generally a single optimum route to each available node based on an appropriate algorithm such as, for example, a shortest path first (SPF) algorithm. As a result a “spanning tree” (SPT) is constructed, rooted at the node and showing an optimum path including intermediate nodes to each available destination node. The results of the SPF are stored in a routing information base (RIB) and based on these results the forwarding information base (FIB) or forwarding table is updated to control forwarding of packets appropriately. When there is a network change an LSP representing the change is flooded through the network by each node adjacent the change, each node receiving the LSP sending it to each adjacent node.
As a result, when a data packet for a destination node arrives at a node the node identifies the optimum route to that destination and forwards the packet to the next node along that route. The next node repeats this step and so forth.
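By way of illustration only, the SPF computation and the resulting next-hop selection described above can be sketched as follows. The sketch assumes Dijkstra's algorithm as the SPF algorithm and a dictionary-based LSDB; the function name `spf`, the data layout and the three-node topology are hypothetical and are not taken from any particular routing implementation.

```python
import heapq


def spf(lsdb, root):
    """Run a shortest path first computation rooted at `root`.

    `lsdb` maps node -> {neighbor: cost}. Costs are directional, so
    lsdb["A"]["B"] may differ from lsdb["B"]["A"] (asymmetric link cost).
    Returns (cost, next_hop): the lowest cost to each reachable node and
    the first hop out of `root` on one lowest-cost path to it.
    """
    cost = {root: 0}
    next_hop = {}
    heap = [(0, root, None)]  # (path cost, node, first hop from root)
    done = set()
    while heap:
        c, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for an already settled node
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for nbr, link_cost in lsdb.get(node, {}).items():
            new_cost = c + link_cost
            if nbr not in cost or new_cost < cost[nbr]:
                cost[nbr] = new_cost
                # A neighbor of the root is its own first hop.
                hop = nbr if node == root else first
                heapq.heappush(heap, (new_cost, nbr, hop))
    return cost, next_hop


# Hypothetical three-node topology with one asymmetric link (A->B costs 1, B->A costs 2).
lsdb = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 2, "C": 1},
    "C": {"A": 5, "B": 1},
}
cost, next_hop = spf(lsdb, "A")
print(cost)      # {'A': 0, 'B': 1, 'C': 2}
print(next_hop)  # {'B': 'B', 'C': 'B'}
```

In this sketch the `next_hop` dictionary plays the role of the forwarding table at node A: packets for C are handed to B, which repeats the same lookup against its own table.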
It will be noted that in normal forwarding each node decides, irrespective of the node from which it received a packet, the next node to which the packet should be forwarded. In some instances this can give rise to a “loop”. In particular this can occur when the databases (and corresponding forwarding information) are temporarily de-synchronized during a routing transition, that is, where, because of a change in the network, a new LSP is propagated that induces a loop in the RIB or FIB. As an example, if node A sends a packet to node Z via node B, that being the optimum route according to its SPF, a situation can arise where node B, according to its SPF, determines that the best route to node Z is via node A and sends the packet back. This can continue for as long as the loop remains, although usually the packet will have a maximum hop count after which it will be discarded. Such a loop can be a direct loop between two nodes or an indirect loop around a circuit of nodes.
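The direct loop between node A and node B in the example above can be reproduced with a minimal simulation of hop-by-hop forwarding. The sketch below is illustrative only: the function name `forward`, the per-node forwarding tables and the hop limit of four are assumptions chosen for the example.

```python
def forward(fibs, source, dest, max_hops=8):
    """Forward a packet hop by hop using each node's own forwarding table.

    `fibs` maps node -> {destination: next hop}. Each node looks only at
    the destination, never at the node the packet arrived from.
    Returns (path, delivered); the packet is discarded once it has made
    `max_hops` hops without reaching `dest`.
    """
    path = [source]
    node = source
    while node != dest:
        if len(path) - 1 >= max_hops:
            return path, False  # hop count exhausted, packet discarded
        node = fibs[node][dest]
        path.append(node)
    return path, True


# De-synchronized tables: A sends traffic for Z to B, while B sends it to A.
looping_fibs = {"A": {"Z": "B"}, "B": {"Z": "A"}}
path, delivered = forward(looping_fibs, "A", "Z", max_hops=4)
print(path, delivered)  # ['A', 'B', 'A', 'B', 'A'] False

# Consistent tables: B forwards directly to Z.
consistent_fibs = {"A": {"Z": "B"}, "B": {"Z": "Z"}}
print(forward(consistent_fibs, "A", "Z"))  # (['A', 'B', 'Z'], True)
```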
One solution that has been proposed to the looping problem is described in co-pending patent application Ser. No. 10/340,371, filed 9 Jan. 2003, entitled “Method and Apparatus for Constructing a Backup Route in a Data Communications Network” of Kevin Miles et al. (“Miles et al.”), the entire contents of which are incorporated by reference for all purposes as if fully set forth herein and discussed in more detail below. According to the solution put forward in Miles et al., where a repairing node detects failure of an adjacent component, the repairing node computes a first set of nodes comprising the set of all nodes reachable according to its protocol other than nodes reachable by traversing the failed component. The repairing node then computes a second set of nodes comprising the set of all nodes from which a target node is reachable without traversing the failed component. The method then determines whether any intermediate nodes exist in the intersection between the first and second sets of nodes or a one-hop extension thereof and tunnels packets for the target node to a tunnel end point comprising a node in the intersection of the first and second sets of nodes. An extension of the approach is described in co-pending patent application Ser. No. 10/442,589, filed 20 May 2003, entitled “Method and Apparatus for Constructing a Transition Route in a Data Communications Network” of Stewart F. Bryant et al. (“Bryant et al.”), the entire contents of which are incorporated by reference for all purposes as if fully set forth herein, and in which the approach is extended to cover repairs for non-adjacent nodes.
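The two node sets and their intersection can be illustrated with a short sketch. It is not the method of Miles et al. itself: it assumes that the failed component is a single link, that membership of each set can be tested with shortest-path distance inequalities over the pre-failure topology, that only the direction from the repairing node to the far end of the link matters, and that the network is connected. The one-hop extension is not modeled. The names `distances` and `repair_sets` and the five-node ring are hypothetical.

```python
import heapq


def distances(graph, root):
    """Lowest cost from `root` to every node (Dijkstra); graph[n] = {nbr: cost}."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if nbr not in dist or d + cost < dist[nbr]:
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist


def repair_sets(graph, repairer, far_end, target):
    """Return (first, second) node sets for failure of link repairer -> far_end.

    first:  nodes the repairer reaches with no shortest path over the link.
    second: nodes that reach `target` with no shortest path over the link.
    """
    dist = {n: distances(graph, n) for n in graph}
    link = graph[repairer][far_end]
    first = {n for n in graph
             if dist[repairer][n] < link + dist[far_end][n]}
    second = {n for n in graph
              if dist[n][target] < dist[n][repairer] + link + dist[far_end][target]}
    return first, second


# Hypothetical ring S - E - C - B - A - S with unit costs; link S-E fails.
ring = {
    "S": {"E": 1, "A": 1},
    "E": {"S": 1, "C": 1},
    "C": {"E": 1, "B": 1},
    "B": {"C": 1, "A": 1},
    "A": {"B": 1, "S": 1},
}
first, second = repair_sets(ring, repairer="S", far_end="E", target="E")
print(sorted(first))           # ['A', 'B', 'S']
print(sorted(second))          # ['B', 'C', 'E']
print(sorted(first & second))  # ['B']
```

In this example node B is the only candidate tunnel end point: S can reach B without the failed link, and B's normal forwarding reaches E without it.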
Whilst such systems provide rapid network recovery in the event of a failed component, in some instances loops can occur. One such instance can be where two concurrent unrelated failures take place in the network. In that case a first repairing node adjacent the first failed component will institute its own first repair strategy and forward a packet according to that strategy, relying on the remaining nodes in the repair path using their normal forwarding. If, however, the packet traverses a second repairing node independently repairing around a second failed component, a loop may occur. In particular the second repairing node will have instituted its own repair strategy differing from normal forwarding and accordingly may return packets from the first repairing node back towards the first repairing node, giving rise to a loop. It will be apparent that such a problem can also arise in the transition route approach described above in Bryant et al and indeed in any case where a repair strategy is distributed across multiple nodes in a network.
Yet a further extension of known techniques is described in co-pending patent application Ser. No. 10/685,621 entitled “Method and Apparatus for Generating Routing Information in a Data Communications Network” of Michael Shand et al. (hereinafter “Shand”), the entire contents of which are incorporated by reference for all purposes as if fully set forth herein. According to Shand, nodes affected by a network failure are updated in a predetermined sequence to avoid looping. In this case it will be seen that problems still arise when multiple failures occur in the network. For example, assuming that links A and B both fail at separate points in the network, a node C may find that it has to change in a first epoch for some destination D as a result of failure A and has to change in a second, separate epoch for the same destination D as a result of failure B, with a conflicting sequential update strategy adopted for that failure. It will be seen that such concurrent failures cannot be converged using the approaches described above.
In the case that the failures are non-conflicting, that is, can be repaired by non-conflicting strategies, one possible solution is to order the invocation of convergence for each failure. However this requires the repair to remain in place longer which increases the time for which the network is vulnerable to a new potentially conflicting failure and extends the time for which repair paths are in use.
An alternative approach is described in “ip/ldp local protection” which is available at the time of writing on the file “draft-atlas-ip-local-protect-01.txt” in the directory “pub/id” of the domain “watersprings.org” on the World Wide Web. According to the approach described in this document, a computing node computes both a “primary next-hop” for packets for a destination together with an “alternate next-hop”. The alternate next-hop is used in the case of failure of the primary next-hop (failure either of the next-hop node or of the link to the next-hop node). The alternate next-hop can be another neighbor node whose own shortest path to the destination does not include the computing node. In another case the alternate next-hop is a “U-turn alternate” comprising a neighbor whose primary next-hop is the computing node, and which has as its alternate next-hop a node whose shortest path does not include the computing node. Although this document addresses multiple failure loop-free protection in some instances, this is only in the case of shared risk link groups (SRLGs), requires a signaling extension and is topology sensitive.
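The first kind of alternate next-hop described above, a neighbor whose own shortest path to the destination does not include the computing node, can be sketched as a distance comparison. The sketch is an assumption-laden illustration and is not taken from the cited document: it treats a neighbor N of computing node S as loop-free for destination D when dist(N, D) < dist(N, S) + dist(S, D), it does not model U-turn alternates, SRLGs or signaling, and the names `distances` and `loop_free_alternates` and the topology are hypothetical.

```python
import heapq


def distances(graph, root):
    """Lowest cost from `root` to every node (Dijkstra); graph[n] = {nbr: cost}."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if nbr not in dist or d + cost < dist[nbr]:
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist


def loop_free_alternates(graph, node, dest, primary):
    """Neighbors of `node`, other than `primary`, that would not send
    traffic for `dest` back through `node` on any shortest path."""
    from_node = distances(graph, node)
    alternates = []
    for nbr in graph[node]:
        if nbr == primary:
            continue
        from_nbr = distances(graph, nbr)
        if from_nbr[dest] < from_nbr[node] + from_node[dest]:
            alternates.append(nbr)
    return sorted(alternates)


# Hypothetical topology: S reaches D via primary next-hop P (total cost 2).
topology = {
    "S": {"P": 1, "N": 1},
    "P": {"S": 1, "D": 1},
    "N": {"S": 1, "D": 2},
    "D": {"P": 1, "N": 2},
}
alternates = loop_free_alternates(topology, "S", "D", primary="P")
print(alternates)  # ['N']
```

If the cost of the N-D link were raised to 5, N's shortest path to D would run back through S and the function would return an empty list, which illustrates the topology sensitivity noted above.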