The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In computer networks such as the Internet, packets of data are sent from a source to a destination via a network of elements including links (communication paths such as telephone or optical lines) and nodes (for example, routers directing the packet along one or more of a plurality of links connected to it) according to one of various routing protocols.
One routing protocol used, for example, in the Internet is the Border Gateway Protocol (BGP). BGP is used to route data between routing domains such as autonomous systems (AS), each comprising networks under a common administrator and sharing a common routing policy. BGP routers exchange full routing information during a connection session, for example using Transmission Control Protocol (TCP), allowing inter-autonomous system routing. The information exchanged includes various attributes, including a next-hop attribute. For example, where a BGP router advertises a connection to a network, for example in the form of an IP address prefix, the next-hop attribute comprises the IP address used to reach the BGP router.
Edge or border BGP routers in a first AS communicate with eBGP peers in a second AS via exterior BGP (eBGP). In addition, BGP routers within an AS exchange reachability information using interior BGP (iBGP). As a very large number of routes may be advertised in this manner, an additional network component comprising a route reflector is commonly provided, which sets up a session with each BGP router and distributes reachability information to each other BGP router.
The border routers in respective ASs can advertise to one another, using eBGP, the prefixes (network destinations) reachable from them. The advertisements carry information such as an AS-path attribute, indicating the ASs through which the route advertisement has passed, including the AS in which the advertising border router itself is located, and a BGP Community attribute indicating the manner in which the advertisement is to be propagated. For example, if an eBGP advertisement is received with Community attribute No-Advertise, then the border router receiving the advertisement does not advertise the route information to any of its peers, including other routers in its AS. When the routes are advertised internally, additional information such as a local preference and a next-hop field is included. The local preference attribute assigns a preference value to the use of that particular route, for example for a given set of prefixes, such that where more than one route is available to other border routers in the AS they will select the route with the highest local preference. The next-hop attribute provides the IP address used for the link between the border router in the AS and its eBGP peer.
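By way of illustration only, the selection by highest local preference and the handling of the No-Advertise community described above might be sketched as follows in Python. The names Route, may_propagate and best_route are hypothetical and introduced purely for this sketch, the addresses and AS numbers are taken from documentation ranges, and the tie-break on AS-path length is a simplifying assumption rather than a full BGP decision process.

```python
from dataclasses import dataclass

# Name of the well-known BGP community; represented here as a plain string.
NO_ADVERTISE = "NO_ADVERTISE"


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified view of one advertised route."""
    prefix: str
    next_hop: str
    local_pref: int
    as_path: tuple = ()
    communities: frozenset = frozenset()


def may_propagate(route):
    """A route tagged No-Advertise is not passed on to any peer."""
    return NO_ADVERTISE not in route.communities


def best_route(candidates):
    """Prefer the highest local preference.

    Assumption: ties are broken by the shorter AS path; real BGP
    implementations apply further steps that are omitted here.
    """
    return max(candidates, key=lambda r: (r.local_pref, -len(r.as_path)))


candidates = [
    Route("203.0.113.0/24", "192.0.2.1", 100, (64500, 64510)),
    Route("203.0.113.0/24", "192.0.2.2", 200, (64501, 64502, 64510)),
]
chosen = best_route(candidates)
private = Route("203.0.113.0/24", "192.0.2.3", 100,
                communities=frozenset({NO_ADVERTISE}))
```

In this sketch the second candidate is chosen despite its longer AS path, because local preference is compared first.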
To reduce the number of iBGP messages further, route reflectors may advertise only the best path for a given destination to all border routers in an AS. Accordingly, all border routers will forward traffic for a given destination to the border router identified in the best path advertisement. Forwarding of packets within the AS may then simply use an Interior Gateway Protocol (IGP), as described in more detail below, where the IGP forwarding table will ensure that packets destined for the eventual destination are forwarded within the AS towards the appropriate border router. Alternatively, an ingress border router receiving incoming packets may tunnel the packets to the appropriate egress border router, that is, encapsulate the packets to a destination egress border router for transit across the AS, for example using IP or MPLS tunnels. The packets are then decapsulated at the egress border router and forwarded according to the packet destination header.
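The tunneling alternative described above can be pictured with a minimal Python sketch. The Packet type and the encapsulate and decapsulate helpers are hypothetical and stand in for an IP or MPLS tunnel; they model only the outer and inner destination, not any real header format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: a destination plus a payload.

    The payload is either raw bytes or, for a tunneled packet,
    the original (inner) Packet.
    """
    dst: str
    payload: object


def encapsulate(packet, egress_router):
    """Ingress border router: wrap the packet, addressed to the egress."""
    return Packet(dst=egress_router, payload=packet)


def decapsulate(outer):
    """Egress border router: recover the inner packet for normal forwarding."""
    inner = outer.payload
    if not isinstance(inner, Packet):
        raise ValueError("packet is not tunneled")
    return inner


original = Packet(dst="203.0.113.7", payload=b"data")
tunneled = encapsulate(original, egress_router="192.0.2.2")
recovered = decapsulate(tunneled)
```

Routers inside the AS see only the outer destination (the egress border router), while the egress border router forwards on the inner destination after decapsulation.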
Within each AS the routing protocol typically comprises an interior gateway protocol (IGP), for example a link state protocol such as Open Shortest Path First (OSPF) or Intermediate System to Intermediate System (IS-IS).
The link state protocol relies on a routing algorithm resident at each node. Each node on the network advertises, throughout the network, links to neighboring nodes and provides a cost associated with each link, which can be based on any appropriate metric such as link bandwidth or delay and is typically expressed as an integer value. A link may have an asymmetric cost, that is, the cost in the direction AB along a link may be different from the cost in the direction BA. Based on the advertised information in the form of a link state packet (LSP) or link state advertisement (LSA), each node constructs a link state database (LSDB), which is a map of the entire network topology, and from that generally constructs a single optimum route to each available node based on an appropriate algorithm such as, for example, a shortest path first (SPF) algorithm. As a result a shortest path “spanning tree” (SPT) is constructed, rooted at the node and showing an optimum path, including intermediate nodes, to each available destination node. The results of the SPF are stored in a routing information base (RIB), and based on these results the forwarding information base (FIB) or forwarding table is updated to control forwarding of packets appropriately. When there is a network change, an LSP representing the change is flooded through the network by each node adjacent to the change, each node receiving the LSP sending it to each adjacent node.
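As a non-limiting illustration, the SPF computation over an LSDB might be sketched in Python using Dijkstra's algorithm. The function name spf, the dictionary representation of the LSDB and the three-node topology are assumptions made for this sketch only; the costs are directed so that the asymmetric case mentioned above is covered.

```python
import heapq


def spf(lsdb, root):
    """Shortest path first from `root`.

    lsdb maps each node to {neighbor: cost}; costs are per direction,
    so lsdb["A"]["C"] may differ from lsdb["C"]["A"].
    Returns (dist, first_hop): the cost to each reachable node and the
    neighbor of `root` through which that node is reached.
    """
    dist = {root: 0}
    first_hop = {}
    done = set()
    heap = [(0, root, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale entry for a node already finalized
        done.add(node)
        if hop is not None:
            first_hop[node] = hop
        for nbr, cost in lsdb.get(node, {}).items():
            nd = d + cost
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Directly attached neighbors are their own first hop.
                heapq.heappush(heap, (nd, nbr, nbr if node == root else hop))
    return dist, first_hop


# Hypothetical topology: the A-C link costs 5 from A but 2 from C.
lsdb = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 2, "B": 1},
}
dist_a, hops_a = spf(lsdb, "A")
dist_c, hops_c = spf(lsdb, "C")
```

Here node A reaches C via B at total cost 2 rather than over the direct link at cost 5, and the first_hop mapping corresponds to the entries a node would install in its forwarding table.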
As a result, when a data packet for a destination node arrives at a node the node identifies the optimum route to that destination and forwards the packet to the next node along that route. The next node repeats this step and so forth.
According to the approach described in Shand et al “not-via” addresses are used in a manner which can be understood with reference to FIG. 1 which is a schematic diagram showing an autonomous system network 100. The network includes nodes S and P, reference numerals 102, 104 which are joined by a link 106. Node S is connected to nodes B and D, reference numerals 108, 110, by links 112, 114 respectively. Nodes B and D are connected to nodes A and C respectively, reference numerals 116, 118 by links 120, 122 respectively. Node P is connected to nodes E and G, reference numerals 124, 126 via links 128, 130 respectively and nodes E and G are connected to respective nodes F and H, reference numerals 128, 130 via respective links 132, 134.
In order to repair a failure in the network, each node adjacent to the failure, acting as instigating repair node, computes a repair or backup path around the failure. Then, when a failure is detected, an instigating repair node will forward subsequent packets, which otherwise would have traversed the failure, via the repair path to a receiving repair node. For example, where link 106 fails between nodes S and P and node S detects the failure, packets subsequently received for node E, F, G or H, which otherwise would have gone via link 106, are forwarded according to the pre-computed repair path (using connectivity not shown in FIG. 1). This approach is sometimes termed fast reroute. The repair path is constructed and propagated by giving each node or interface (i.e., its connection to each link to adjacent nodes), in addition to its normal address, a propagatable repair address which is reachable via a repair path not via the failed component, the “not-via address”. For example, node P may have a repair address P not-via S (represented here as Ps). Each other node will have computed its next hop for each of the not-via addresses. Hence, when node S detects failure of link 106, it tunnels subsequent packets for node P to address Ps. Its next hop, having precomputed its own next hop for Ps, will forward accordingly, and so forth. It will be noted that node S can forward the packet to Ps in any appropriate way, for example by tunneling it to that address. Similarly, any packets received at node S for node E, F, G or H will also be tunneled to Ps. Upon decapsulation of the packet at node P, it will then be forwarded normally from node P to node E, F, G or H as appropriate, following the original path.
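The computation of next hops for a not-via address can be illustrated with the following Python sketch, which is not taken from Shand et al. It removes the avoided link from the topology and runs a shortest path computation rooted at the target, so that each node's parent in the resulting tree is its next hop toward the not-via address. The link B-E is hypothetical and stands in for the connectivity not shown in FIG. 1; unit, symmetric link costs and the function names are likewise assumptions.

```python
import heapq

# Links of FIG. 1, identified by their end nodes.
FIG1_LINKS = [("S", "P"), ("S", "B"), ("S", "D"), ("B", "A"), ("D", "C"),
              ("P", "E"), ("P", "G"), ("E", "F"), ("G", "H")]
# Hypothetical extra link, standing in for connectivity not shown in FIG. 1.
EXTRA_LINKS = [("B", "E")]


def notvia_next_hops(links, target, avoided):
    """Next hop at every node toward `target` without using link `avoided`.

    Assumes unit, symmetric link costs. Nodes with no path to the target
    once the link is removed do not appear in the result.
    """
    avoided_pair = frozenset(avoided)
    adj = {}
    for a, b in links:
        if frozenset((a, b)) == avoided_pair:
            continue  # prune the component the repair must avoid
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {target: 0}
    next_hop = {}
    heap = [(0, target)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr in sorted(adj.get(node, ())):
            if d + 1 < dist.get(nbr, float("inf")):
                dist[nbr] = d + 1
                next_hop[nbr] = node  # parent in the tree rooted at target
                heapq.heappush(heap, (d + 1, nbr))
    return next_hop


def repair_path(next_hop, start, target):
    """Follow precomputed next hops from `start` until `target` is reached."""
    path = [start]
    while path[-1] != target:
        path.append(next_hop[path[-1]])
    return path


# Next hops for Ps (P not-via S), i.e. avoiding link 106 between S and P.
hops_ps = notvia_next_hops(FIG1_LINKS + EXTRA_LINKS, "P", ("S", "P"))
path_ps = repair_path(hops_ps, "S", "P")
```

Under these assumptions a packet tunneled by node S to Ps travels S, B, E, P, after which node P decapsulates it and forwards it normally.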
Shand et al further discloses various manners of reducing the SPF calculation overhead using incremental SPF (iSPF). Incremental SPFs will be well known to the skilled reader and are not described in detail here but merely summarized for the purposes of explanation. In particular, as a first step, only a partial SPF is computed for the not-via address once the SPT branch attached to the failed component has been excised. All addresses no longer attached are recomputed and reattached; however, the incremental calculation is terminated when all of the addresses previously reached via the affected component are reattached. A further advantage is achieved according to Shand et al by ensuring that, whilst a repairing node upon detecting a failure affecting an incoming packet for a normal address will tunnel that packet to a not-via address, it will not attempt to repair an incoming packet that is itself already destined for a not-via address, that is, a packet that has already been repaired. In particular, this provides a loop prevention strategy, for example where two nodes adjacent to a failure might otherwise each try to repair a packet, which could cause it to loop back and forth between them. According to the approach described in Shand et al, the repairing node can continue to tunnel packets to a not-via address until the network has reconverged, at which point the repairing node can revert to normal forwarding.
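The loop prevention rule described above might be sketched as the following forwarding decision in Python. The function repair_decision and its parameters are hypothetical. The text states only that an already-repaired packet is not repaired again; dropping such a packet, represented here by returning None, is an assumption of this sketch.

```python
from typing import Optional


def repair_decision(dest: str,
                    primary_next_hop: str,
                    failed_neighbors: set,
                    notvia_addresses: set,
                    repair_address_for: dict) -> Optional[str]:
    """Return the address a packet should be sent toward, or None to drop it.

    dest               -- destination address carried by the packet
    primary_next_hop   -- neighbor normally used to reach `dest`
    failed_neighbors   -- neighbors currently detected as unreachable
    notvia_addresses   -- set of all known not-via addresses
    repair_address_for -- not-via address to use for each failed neighbor
    """
    if primary_next_hop not in failed_neighbors:
        return dest  # no failure on the path: forward normally
    if dest in notvia_addresses:
        # Already a repaired packet: never repair it a second time.
        # Assumption: the packet is dropped in this case.
        return None
    return repair_address_for[primary_next_hop]  # tunnel to not-via address


# Node S has detected failure of its link to node P.
state = dict(failed_neighbors={"P"},
             notvia_addresses={"Ps"},
             repair_address_for={"P": "Ps"})
normal = repair_decision("A", "B", **state)
repaired = repair_decision("E", "P", **state)
not_repaired_again = repair_decision("Ps", "P", **state)
```

Because a packet addressed to Ps is never re-encapsulated, two nodes on either side of a failure cannot bounce the same packet between them.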
It is important to minimize packet loss in the case of network component failure, both intra-domain (IGP) and inter-domain (eBGP). For example in the case of intra-domain link failure, ISPs use various techniques to react quickly to the failure while convergence is taking place including handling of the failure by other layers or implementing fast reroute techniques for example of the type described in co-pending patent application Ser. No. 11/064,275, filed Feb. 22nd, 2005, entitled “Method and Apparatus for Constructing a Repair Path Around a Non-Available Component in a Data Communications Network” of Mike Shand et al, (“Shand et al”), the entire contents of which are incorporated by reference for all purposes as if fully set forth herein.
However, problems arise with existing approaches when a failure partitions an IGP domain such as an autonomous system. Partitioning of the autonomous system occurs when a failure leaves no path from one side of the failure to the other; the failed component is sometimes termed a “single point of failure”. For example, in such circumstances, using the approach described in Shand et al, it is no longer possible to reach destinations in one partition from the other when repair is implemented, as the computation of the FIB entries for not-via addresses will be unable to find a route across the partition. Importantly, it is also no longer possible to forward transit traffic, that is, inter-AS traffic through the AS. One approach to this problem is described in International Standard ISO/IEC 10589, which describes a method of partition repair for Intermediate System to Intermediate System (IS-IS) as IGP, where a partition in a level 1 area is repaired using the level 2 domain. However, there is no mechanism for repair of the level 2 domain itself.
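Whether a given link failure partitions the domain can be checked, purely by way of example, with a breadth-first search over the topology with the failed link removed, as in the Python sketch below. The function name is hypothetical. The links are those drawn in FIG. 1, and the additional link B-E used in the second call is a hypothetical stand-in for connectivity not shown in the figure.

```python
from collections import deque

# Links of FIG. 1, identified by their end nodes.
FIG1_LINKS = [("S", "P"), ("S", "B"), ("S", "D"), ("B", "A"), ("D", "C"),
              ("P", "E"), ("P", "G"), ("E", "F"), ("G", "H")]


def is_partitioned_by(links, failed):
    """True if removing link `failed` leaves no path between its end nodes."""
    failed_pair = frozenset(failed)
    adj = {}
    for a, b in links:
        if frozenset((a, b)) == failed_pair:
            continue  # the failed link is unavailable
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    start, goal = failed
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return goal not in seen


# With only the links drawn in FIG. 1, losing S-P splits the domain.
drawn_only = is_partitioned_by(FIG1_LINKS, ("S", "P"))
# With a hypothetical extra link B-E, a path around the failure remains.
with_extra = is_partitioned_by(FIG1_LINKS + [("B", "E")], ("S", "P"))
```

When the check returns true, no not-via route across the failure can be computed within the domain, which is the situation described above.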