1. Field of the Invention
The present invention relates to computer networks and, more particularly, to protecting against failure of a network element using Multi-Topology Repair Routing (MTRR) in a computer network.
2. Background Information
A computer network is a geographically distributed collection of nodes interconnected by communication links and segments for transporting data between end nodes, such as personal computers and workstations. Many types of networks are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). LANs typically connect the nodes over dedicated private communications links located in the same general physical location, such as a building or campus. WANs, on the other hand, typically connect geographically dispersed nodes over long-distance communications links, such as common carrier telephone lines, optical lightpaths, synchronous optical networks (SONET), or synchronous digital hierarchy (SDH) links. The Internet is an example of a WAN that connects disparate networks throughout the world, providing global communication between nodes on various networks. The nodes typically communicate over the network by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other. Computer networks may be further interconnected by an intermediate network node, such as a router, to extend the effective “size” of each network.
Since management of interconnected computer networks can prove burdensome, smaller groups of computer networks may be maintained as routing domains or autonomous systems. The networks within an autonomous system (AS) are typically coupled together by conventional “intradomain” routers configured to execute intradomain routing protocols, and are generally subject to a common authority. To improve routing scalability, a service provider (e.g., an ISP) may divide an AS into multiple “areas” or “levels.” It may be desirable, however, to increase the number of nodes capable of exchanging data; in this case, interdomain routers executing interdomain routing protocols are used to interconnect nodes of the various ASes. Moreover, it may be desirable to interconnect various ASes that operate under different administrative domains. As used herein, an AS, area, or level is generally referred to as a “domain,” and a router that interconnects different domains is generally referred to as a “border router.”
An example of an inter-domain routing protocol is the Border Gateway Protocol version 4 (BGP), which performs routing between domains (ASes) by exchanging routing and reachability information among neighboring inter-domain routers of the systems. An adjacency is a relationship formed between selected neighboring (peer) routers for the purpose of exchanging routing information messages and abstracting the network topology. The routing information exchanged by BGP peer routers typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions. Examples of such destination addresses include IP version 4 (IPv4) and version 6 (IPv6) addresses. BGP generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/session. The BGP protocol is well known and generally described in Request for Comments (RFC) 1771, entitled A Border Gateway Protocol 4 (BGP-4), published March 1995.
Examples of an intradomain routing protocol, or an interior gateway protocol (IGP), are the Open Shortest Path First (OSPF) routing protocol and the Intermediate-System-to-Intermediate-System (IS-IS) routing protocol. The OSPF and IS-IS protocols are based on link-state technology and, therefore, are commonly referred to as link-state routing protocols. Link-state protocols define the manner with which routing information and network-topology information are exchanged and processed in a domain. This information is generally directed to an intradomain router's local state (e.g., the router's usable interfaces and reachable neighbors or adjacencies). The OSPF protocol is described in RFC 2328, entitled OSPF Version 2, dated April 1998 and the IS-IS protocol used in the context of IP is described in RFC 1195, entitled Use of OSI IS-IS for routing in TCP/IP and Dual Environments, dated December 1990, both of which are hereby incorporated by reference.
An intermediate network node often stores its routing information in a routing table maintained and managed by a routing information base (RIB). The routing table is a searchable data structure in which network addresses are mapped to their associated routing information. However, those skilled in the art will understand that the routing table need not be organized as a table, and alternatively may be another type of searchable data structure. Although the intermediate network node's routing table may be configured with a predetermined set of routing information, the node also may dynamically acquire (“learn”) network routing information as it sends and receives data packets. When a packet is received at the intermediate network node, the packet's destination address may be used to identify a routing table entry containing routing information associated with the received packet. Among other things, the packet's routing information indicates the packet's next-hop address.
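The lookup described above, in which a packet's destination address selects the routing table entry whose prefix most specifically matches it, may be sketched as follows. This is an illustrative sketch only; the table entries, addresses, and function names are assumptions for the example, not part of any routing implementation.

```python
import ipaddress

# Hypothetical routing table: destination prefixes mapped to next-hop
# addresses. The longest (most specific) matching prefix wins.
routing_table = {
    ipaddress.ip_network("10.0.0.0/8"): "192.0.2.1",
    ipaddress.ip_network("10.1.0.0/16"): "192.0.2.2",
    ipaddress.ip_network("0.0.0.0/0"): "192.0.2.254",  # default route
}

def lookup_next_hop(dest_addr):
    """Return the next-hop address for the longest prefix matching dest_addr."""
    dest = ipaddress.ip_address(dest_addr)
    matches = [p for p in routing_table if dest in p]
    best = max(matches, key=lambda p: p.prefixlen)  # longest match wins
    return routing_table[best]

print(lookup_next_hop("10.1.2.3"))   # matches 10.1.0.0/16 -> 192.0.2.2
print(lookup_next_hop("10.9.9.9"))   # matches 10.0.0.0/8  -> 192.0.2.1
```

In a production router the searchable data structure would typically be a radix trie rather than a linear scan, consistent with the note above that the routing table need not be organized as a table.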
To ensure that its routing table contains up-to-date routing information, the intermediate network node may cooperate with other intermediate nodes to disseminate routing information representative of the current network topology. For example, suppose the intermediate network node detects that one of its neighboring nodes (i.e., adjacent network nodes) becomes unavailable, e.g., due to a link failure or the neighboring node going “off-line,” etc. In this situation, the intermediate network node can update the routing information stored in its routing table to ensure that data packets are not routed to the unavailable network node. Furthermore, the intermediate node also may communicate this change in network topology to the other intermediate network nodes so they, too, can update their local routing tables and bypass the unavailable node. In this manner, each of the intermediate network nodes becomes “aware” of the change in topology.
Typically, routing information is disseminated among the intermediate network nodes in accordance with a predetermined network communication protocol, such as a link-state protocol (e.g., IS-IS, or OSPF). Conventional link-state protocols use link-state advertisements or link-state packets (or “IGP Advertisements”) for exchanging routing information between interconnected intermediate network nodes (IGP nodes). As used herein, an IGP Advertisement generally describes any message used by an IGP routing protocol for communicating routing information among interconnected IGP nodes, i.e., routers and switches. Operationally, a first IGP node may generate an IGP Advertisement and “flood” (i.e., transmit) the packet over each of its network interfaces coupled to other IGP nodes. Thereafter, a second IGP node may receive the flooded IGP Advertisement and update its routing table based on routing information contained in the received IGP Advertisement. Next, the second IGP node may flood the received IGP Advertisement over each of its network interfaces, except for the interface at which the IGP Advertisement was received. This flooding process may be repeated until each interconnected IGP node has received the IGP Advertisement and updated its local routing table.
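The flooding procedure described above, in which each node forwards a received IGP Advertisement on every interface except the one it arrived on, may be sketched as follows. The class and field names are illustrative assumptions; a per-originator sequence number stands in for the duplicate-suppression machinery of a real link-state protocol.

```python
class IGPNode:
    """Minimal sketch of an IGP node participating in flooding."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.seen = {}   # originator name -> highest sequence number seen

    def link(self, other):
        """Create a bidirectional adjacency between two nodes."""
        self.neighbors.append(other)
        other.neighbors.append(self)

    def receive(self, adv, from_node=None):
        origin, seq = adv["origin"], adv["seq"]
        if self.seen.get(origin, -1) >= seq:
            return                      # already processed; do not re-flood
        self.seen[origin] = seq         # routing-table update step elided
        for nb in self.neighbors:
            if nb is not from_node:     # flood on all links but the incoming one
                nb.receive(adv, from_node=self)

# Triangle of nodes; "a" originates an advertisement (modeled here as
# receiving it locally) and it reaches every node exactly once.
a, b, c = IGPNode("a"), IGPNode("b"), IGPNode("c")
a.link(b); b.link(c); c.link(a)
a.receive({"origin": "a", "seq": 1})
```

The sequence-number check is what terminates the process: when the advertisement loops back around the triangle, the receiving node has already recorded it and stops re-flooding.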
In practice, each IGP node typically generates and disseminates an IGP Advertisement whose routing information includes a list of the intermediate node's neighboring network nodes and one or more “cost” values associated with each neighbor. As used herein, a cost value associated with a neighboring node is an arbitrary metric used to determine the relative ease/burden of communicating with that node. For instance, the cost value may be measured in terms of the number of hops required to reach the neighboring node, the average time for a packet to reach the neighboring node, the amount of network traffic or available bandwidth over a communication link coupled to the neighboring node, etc.
As noted, IGP Advertisements are usually flooded until each intermediate network IGP node has received an IGP Advertisement from each of the other interconnected intermediate nodes, which may be stored in a link state database (LSDB). Then, each of the IGP nodes (e.g., in a link-state protocol) can construct the same “view” of the network topology by aggregating the received lists of neighboring nodes and cost values. To that end, each IGP node may input this received routing information to a “shortest path first” (SPF) calculation that determines the lowest-cost network paths that couple the intermediate node with each of the other network nodes. For example, the Dijkstra algorithm is a conventional technique for performing such an SPF calculation, as described in more detail in Section 12.2.4 of the textbook Interconnections Second Edition, by Radia Perlman, published September 1999, which is hereby incorporated by reference as though fully set forth herein. Each IGP node updates the routing information stored in its local routing table based on the results of its SPF calculation. More specifically, the RIB updates the routing table to correlate destination nodes with next-hop interfaces associated with the lowest-cost paths to reach those nodes, as determined by the SPF calculation (notably, creating a “shortest path tree” or SPT, as will be understood by those skilled in the art).
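The SPF calculation described above may be sketched as a Dijkstra computation that yields, for each destination, both the lowest path cost and the first-hop neighbor to install in the routing table. The topology and cost values are illustrative assumptions.

```python
import heapq

def spf(graph, source):
    """Dijkstra SPF sketch. graph: {node: {neighbor: cost}}.
    Returns {destination: (lowest_cost, next_hop_from_source)}."""
    dist = {source: (0, None)}
    heap = [(0, source, None)]            # (cost, node, first hop from source)
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > dist.get(node, (float("inf"),))[0]:
            continue                       # stale heap entry
        for nb, w in graph.get(node, {}).items():
            hop = nb if node == source else first_hop
            if cost + w < dist.get(nb, (float("inf"), None))[0]:
                dist[nb] = (cost + w, hop) # record cost and next hop
                heapq.heappush(heap, (cost + w, nb, hop))
    return dist

# Example topology: the direct A-C link (cost 5) loses to A-B-C (cost 2).
graph = {"A": {"B": 1, "C": 5}, "B": {"A": 1, "C": 1}, "C": {"A": 5, "B": 1}}
result = spf(graph, "A")
print(result["C"])   # (2, 'B'): reach C at cost 2 via next hop B
```

The `(cost, next_hop)` pairs are exactly what the RIB would install: the resulting shortest path tree is rooted at the computing node, with each destination correlated to its lowest-cost first hop.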
In some computer networks, multiple independent network topologies may be supported over one physical network topology. This type of “multi-topology routing” (MTR) may be used (e.g., by link-state protocols) to influence the path certain types of traffic (e.g., voice, video, data, etc.) take over the network to reach their respective destinations. In this manner, traffic separation may be achieved across the network, such that certain links are available to certain types of traffic, while other links are available to other types of traffic. In particular, MTR may be used to prevent certain links from being used for certain types of traffic as well, such as, e.g., preventing video/voice traffic (requiring high QoS) from traversing low QoS links of the network. Each router of an MTR network computes a distinct SPT for each topology, and is aware of only those topologies to which the router belongs/participates. Conventionally, routers may either store/manage all topologies in a single instance (single RIB/LSDB), or may instead store/manage each topology in a separate instance corresponding to each MTR topology (multiple RIBs/LSDBs). MTR for link-state protocols (IS-IS and OSPF) is described further in the Internet Draft by Przygienda et al., entitled M-ISIS: Multi-Topology (MT) Routing in IS-IS <draft-ietf-isis-wg-multi-topology-11.txt>, dated October 2005, the Internet Draft by Previdi et al., entitled IS-IS Multi-instance Multi-topology <draft-previdi-isis-mi-mt-01.txt>, dated June 2006, and the Internet Draft by Psenak et al., entitled Multi-Topology (MT) Routing in OSPF <draft-ietf-ospf-mt-06.txt>, dated Feb. 1, 2006, the contents of all of which are hereby incorporated by reference as though fully set forth herein.
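The per-topology state described above may be sketched as follows: each topology the router participates in has its own LSDB (here, a link map restricted to the links admitted into that topology), and a distinct SPT is computed per topology. The topology names, links, and traffic mapping are all assumptions for illustration; in this example a low-QoS A-D link is admitted only into the “data” topology, so voice traffic is steered around it.

```python
from collections import deque

# Separate LSDB per MTR topology (multiple-instance model). The direct
# A-D link is low-QoS and therefore absent from the "voice" topology.
lsdb_per_topology = {
    "voice": {"A": {"B": 1}, "B": {"A": 1, "D": 1}, "D": {"B": 1}},
    "data":  {"A": {"B": 1, "D": 1}, "B": {"A": 1, "D": 1},
              "D": {"A": 1, "B": 1}},
}

def next_hop(graph, src, dst):
    """Hop-count SPF (BFS, unit costs) within ONE topology's LSDB;
    returns the first hop on the shortest path from src to dst."""
    q, visited = deque([(src, None)]), {src}
    while q:
        node, first = q.popleft()
        if node == dst:
            return first
        for nb in graph.get(node, {}):
            if nb not in visited:
                visited.add(nb)
                q.append((nb, nb if node == src else first))

# Distinct SPTs: voice traffic must transit B, data may use the direct link.
print(next_hop(lsdb_per_topology["voice"], "A", "D"))   # 'B'
print(next_hop(lsdb_per_topology["data"], "A", "D"))    # 'D'
```

The key point mirrored from the text is that each topology is computed in isolation: the voice SPT simply never sees the low-QoS link, achieving traffic separation over one physical network.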
Occasionally, a network element (e.g., a node or link) will fail, causing redirection of the traffic that originally traversed the failed network element to other network elements that bypass the failure. Generally, notice of this failure is relayed to the nodes in the network through an advertisement of the new network topology, e.g., an IGP or BGP Advertisement, and routing tables are updated to avoid the failure accordingly. Reconfiguring a network in response to a network element failure using, e.g., pure IP rerouting, can be time consuming. Many recovery techniques, however, are available to provide fast recovery and/or network configuration in the event of a network element failure, including, inter alia, “Fast Reroute”, e.g., IP Fast Reroute (IP FRR) and tunneling FRR (e.g., MPLS TE FRR). An example of IP FRR is described in Shand, et al., IP Fast Reroute Framework <draft-ietf-rtgwg-ipfrr-framework-05.txt>, Internet Draft, March 2006, and in Atlas, et al., Basic Specification for IP Fast-Reroute: Loop-free Alternates <draft-ietf-rtgwg-ipfrr-spec-base-05>, Internet Draft, February 2006, the contents of both of which are hereby incorporated by reference as though fully set forth herein. An example of MPLS TE FRR is described in RFC 4090, entitled Fast Reroute Extensions to RSVP-TE for LSP Tunnels, dated May 2005, which is hereby incorporated by reference as though fully set forth herein.
IP FRR has been developed to protect against network element failures, where a protecting network node determines “Loop Free Alternates” (LFAs) of protected network elements to reach a particular destination. Specifically, a conventional LFA may generally be defined as an alternate next-hop node (i.e., not a current/selected next-hop node) or an alternate to other protected network elements (e.g., links) to the particular destination that does not loop back (return) to the protecting network device or the protected element (e.g., nodes/links) to reach that destination. For example, if a neighboring network device has selected the protecting network device as a next-hop to reach the destination, sending traffic from the protecting network device to that neighboring network device (e.g., in the event of a network element failure) would result in a loop between the two devices (e.g., until the network re-converges to remove the failed network element). By employing an LFA when the protected network element fails, however, traffic may be diverted to the LFA in order to reach the destination without utilizing the failed network element, and without creating any loops.
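The loop-free condition described above is commonly expressed in the cited IP FRR work as an inequality on shortest-path distances: a neighbor N of protecting node S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D), i.e., N's own shortest path to D does not pass back through S. The sketch and distance values below are illustrative.

```python
def is_loop_free_alternate(dist_n_d, dist_n_s, dist_s_d):
    """LFA inequality sketch: True if neighbor N can forward traffic
    toward destination D without looping back through protecting node S.
    dist_n_d: N's shortest distance to D
    dist_n_s: N's shortest distance to S
    dist_s_d: S's shortest distance to D"""
    return dist_n_d < dist_n_s + dist_s_d

# N reaches D at cost 2 without transiting S (2 < 1 + 2): usable LFA.
print(is_loop_free_alternate(2, 1, 2))   # True
# N's best path to D runs through S (3 = 1 + 2): traffic would loop.
print(is_loop_free_alternate(3, 1, 2))   # False
```

When the inequality fails, N's shortest route to D includes S itself, so diverting traffic to N would bounce it back to the protecting node until the network re-converges, which is exactly the loop the LFA definition excludes.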
In tunneling FRR, “backup tunnels” are created to bypass a protected network element (e.g., links, shared risk link groups (SRLGs), and nodes). When the network element fails, traffic is quickly rerouted over a backup tunnel to bypass the failed element, or more particularly, in the case of MPLS, a set of TE-LSP(s) is quickly rerouted. Specifically, a protecting network node (e.g., the “point of local repair,” PLR) configured to reroute the traffic inserts (“pushes”) a new label for the backup tunnel, and the traffic is rerouted accordingly. Once the failed element is bypassed, the backup tunnel label is removed (“popped”), and the traffic is routed along the original path according to the next label (e.g., that of the original TE-LSP), or according to IP routing (if no original tunnel exists).
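The push/pop sequence described above may be sketched as simple operations on a packet's label stack. This is an assumed illustration, not an MPLS implementation: the label values and dictionary layout are invented for the example.

```python
def plr_reroute(packet, backup_tunnel_label):
    """PLR: push the backup tunnel label on top of the existing stack,
    steering the packet into the backup tunnel around the failure."""
    packet["labels"].insert(0, backup_tunnel_label)
    return packet

def backup_tunnel_egress(packet):
    """End of the backup tunnel: pop the backup label, exposing the
    original label (e.g., of the protected TE-LSP) for normal forwarding."""
    packet["labels"].pop(0)
    return packet

pkt = {"labels": [100], "payload": "data"}   # 100: original TE-LSP label
pkt = plr_reroute(pkt, 999)                  # stack is now [999, 100]
pkt = backup_tunnel_egress(pkt)              # stack restored to [100]
```

Because only the top of the stack changes, the nodes along the backup tunnel forward on the backup label alone and need no knowledge of the original TE-LSP, which is what makes the local repair fast.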
For both IP FRR and tunneling FRR, the LFAs or backup tunnels may generally be referred to as “repair paths,” in that they are used to repair a failed path (i.e., the original/protected path). When the repair paths are computed, the protecting network node inspects its routing database (e.g., its LSDB) to determine a repair path. In particular, when a single network topology is used, the LSDB used for the original/protected path (e.g., based on an SPT) is the same LSDB used for the repair path computation (i.e., it is the only LSDB/topology). When MTR is used, the protecting node conventionally protects a network element by inspecting the LSDB of the topology of the protected element, and determines an appropriate repair path within that topology (i.e., there is no MTR topology cross-over). (Notably, if the protected element belongs to more than one topology, a repair strategy, e.g., manually configured, may be used to determine the appropriate topology.) However, in the event the topology of the protected network element does not have an acceptable repair path, there is currently no known means available for a protecting node to utilize a different topology. For instance, a different topology may offer a path around the protected element, yet due to the underlying principle of MTR, the distinct topologies remain separate and unusable, even temporarily and acceptably (allowably), by other topologies. There remains a need, therefore, for a technique that allows a protecting node to utilize MTR topologies efficiently for repair paths (e.g., FRR), without compromising the integrity of MTR (i.e., substantially maintaining separate topologies).
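The conventional per-topology repair behavior described above, and its limitation, may be sketched as follows. All names and topologies are illustrative assumptions: the protecting node searches only the LSDB of the protected element's topology for a path avoiding the failed link, and even when another topology could route around the failure, it is never consulted.

```python
from collections import deque

def has_repair_path(lsdb, src, dst, failed_link):
    """BFS reachability within ONE topology's LSDB, excluding the failed link."""
    q, visited = deque([src]), {src}
    while q:
        node = q.popleft()
        if node == dst:
            return True
        for nb in lsdb.get(node, {}):
            if {node, nb} == set(failed_link):
                continue                     # skip the protected/failed link
            if nb not in visited:
                visited.add(nb)
                q.append(nb)
    return False

def find_repair_topology(lsdbs, protected_topology, src, dst, failed_link):
    """Conventional MTR repair: only the protected element's own topology
    is inspected; there is no cross-topology fallback."""
    if has_repair_path(lsdbs[protected_topology], src, dst, failed_link):
        return protected_topology
    return None                              # no repair path found at all

lsdbs = {
    "voice": {"A": {"B": 1}, "B": {"A": 1, "D": 1}, "D": {"B": 1}},
    "data":  {"A": {"B": 1, "C": 1}, "B": {"A": 1, "D": 1},
              "C": {"A": 1, "D": 1}, "D": {"B": 1, "C": 1}},
}

# Link A-B fails in the "voice" topology: voice has no alternate path,
# so conventional repair fails, even though "data" could reach D via C.
print(find_repair_topology(lsdbs, "voice", "A", "D", ("A", "B")))  # None
print(has_repair_path(lsdbs["data"], "A", "D", ("A", "B")))        # True
```

The final two lines capture the gap the section identifies: an acceptable detour exists in a different topology, but the separation at the heart of MTR leaves the protecting node with no conventional means to use it.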