Field of the Invention
The present invention relates to computer networks and more particularly to dynamically determining whether to reestablish a Fast Rerouted primary tunnel based on path quality feedback of a utilized backup tunnel in a computer network.
Background Information
A computer network is a geographically distributed collection of nodes interconnected by communication links and segments for transporting data between end nodes, such as personal computers and workstations. Many types of networks are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). LANs typically connect the nodes over dedicated private communications links located in the same general physical location, such as a building or campus. WANs, on the other hand, typically connect geographically dispersed nodes over long-distance communications links, such as common carrier telephone lines, optical lightpaths, synchronous optical networks (SONET), or synchronous digital hierarchy (SDH) links. The Internet is an example of a WAN that connects disparate networks throughout the world, providing global communication between nodes on various networks. The nodes typically communicate over the network by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other. Computer networks may be further interconnected by an intermediate network node, such as a router, to extend the effective “size” of each network.
Since management of interconnected computer networks can prove burdensome, smaller groups of computer networks may be maintained as routing domains or autonomous systems. The networks within an autonomous system (AS) are typically coupled together by conventional “intradomain” routers configured to execute intradomain routing protocols, and are generally subject to a common authority. To improve routing scalability, a service provider (e.g., an ISP) may divide an AS into multiple “areas.” It may be desirable, however, to increase the number of nodes capable of exchanging data; in this case, interdomain routers executing interdomain routing protocols are used to interconnect nodes of the various ASes. Moreover, it may be desirable to interconnect various ASes that operate under different administrative domains. As used herein, an AS or an area is generally referred to as a “domain,” and a router that interconnects different domains together is generally referred to as a “border router.”
An example of an interdomain routing protocol is the Border Gateway Protocol version 4 (BGP), which performs routing between domains (ASes) by exchanging routing and reachability information among neighboring interdomain routers of the systems. An adjacency is a relationship formed between selected neighboring (peer) routers for the purpose of exchanging routing information messages and abstracting the network topology. The routing information exchanged by BGP peer routers typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions. Examples of such destination addresses include IP version 4 (IPv4) and version 6 (IPv6) addresses. BGP generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/session. The BGP protocol is well known and generally described in Request for Comments (RFC) 1771, entitled A Border Gateway Protocol 4 (BGP-4), published March 1995.
Examples of an intradomain routing protocol, or an interior gateway protocol (IGP), are the Open Shortest Path First (OSPF) routing protocol and the Intermediate-System-to-Intermediate-System (IS-IS) routing protocol. The OSPF and IS-IS protocols are based on link-state technology and, therefore, are commonly referred to as link-state routing protocols. Link-state protocols define the manner in which routing information and network-topology information are exchanged and processed in a domain. This information is generally directed to an intradomain router's local state (e.g., the router's usable interfaces and reachable neighbors or adjacencies). The OSPF protocol is described in RFC 2328, entitled OSPF Version 2, dated April 1998, and the IS-IS protocol used in the context of IP is described in RFC 1195, entitled Use of OSI IS-IS for routing in TCP/IP and Dual Environments, dated December 1990, both of which are hereby incorporated by reference.
An intermediate network node often stores its routing information in a routing table maintained and managed by a routing information base (RIB). The routing table is a searchable data structure in which network addresses are mapped to their associated routing information. However, those skilled in the art will understand that the routing table need not be organized as a table, and alternatively may be another type of searchable data structure. Although the intermediate network node's routing table may be configured with a predetermined set of routing information, the node also may dynamically acquire (“learn”) network routing information as it sends and receives data packets. When a packet is received at the intermediate network node, the packet's destination address (e.g., stored in a header of the packet) may be used to identify a routing table entry containing routing information associated with the received packet. Among other things, the packet's routing information indicates the packet's next-hop address.
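By way of illustration only, the routing table lookup described above may be sketched as follows. The sketch assumes a longest-prefix-match policy, which the description above does not mandate; the table contents, the next-hop addresses, and the function name lookup_next_hop are hypothetical and chosen solely for the example.

```python
import ipaddress

# Hypothetical routing table: destination prefix -> next-hop address.
# All prefixes and next hops are illustrative values only.
ROUTING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "192.0.2.1",
    ipaddress.ip_network("10.1.0.0/16"): "192.0.2.2",
    ipaddress.ip_network("0.0.0.0/0"): "192.0.2.254",  # default route
}


def lookup_next_hop(destination, table=ROUTING_TABLE):
    """Return the next hop for a destination address, or None if no entry matches.

    Assumption: when several prefixes contain the address, the longest
    (most specific) prefix is preferred.
    """
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[best]
```

In this sketch, a packet destined for 10.1.2.3 matches all three entries, and the entry for 10.1.0.0/16 is selected because it is the most specific.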
To ensure that its routing table contains up-to-date routing information, the intermediate network node may cooperate with other intermediate nodes to disseminate routing information representative of the current network topology. For example, suppose the intermediate network node detects that one of its neighboring nodes (i.e., adjacent network nodes) becomes unavailable, e.g., due to a link failure or the neighboring node going “off-line,” etc. In this situation, the intermediate network node can update the routing information stored in its routing table to ensure that data packets are not routed to the unavailable network node. Furthermore, the intermediate node also may communicate this change in network topology to the other intermediate network nodes so they, too, can update their local routing tables and bypass the unavailable node. In this manner, each of the intermediate network nodes becomes “aware” of the change in topology.
Multi-Protocol Label Switching (MPLS) Traffic Engineering has been developed to meet data networking requirements such as guaranteed available bandwidth or fast restoration. MPLS Traffic Engineering exploits modern label switching techniques to build guaranteed bandwidth end-to-end tunnels through an IP/MPLS network of label switched routers (LSRs). These tunnels are a type of label switched path (LSP) and thus are generally referred to as MPLS Traffic Engineering (TE) LSPs. Examples of MPLS TE can be found in RFC 3209, entitled RSVP-TE: Extensions to RSVP for LSP Tunnels dated December 2001, RFC 3784 entitled Intermediate-System-to-Intermediate-System (IS-IS) Extensions for Traffic Engineering (TE) dated June 2004, and RFC 3630, entitled Traffic Engineering (TE) Extensions to OSPF Version 2 dated September 2003, the contents of all of which are hereby incorporated by reference in their entirety.
Establishment of an MPLS TE-LSP from a head-end LSR to a tail-end LSR involves computation of a path through a network of LSRs. Optimally, the computed path is the “shortest” path, as measured in some metric, that satisfies all relevant LSP Traffic Engineering constraints, e.g., required bandwidth, “affinities” (administrative constraints to avoid or include certain links), etc. Path computation can be performed either by the head-end LSR or by some other entity operating as a path computation element (PCE) not co-located on the head-end LSR. The head-end LSR (or a PCE) exploits its knowledge of network topology and resources available on each link to perform the path computation according to the LSP Traffic Engineering constraints. Various path computation methodologies are available, including CSPF (constrained shortest path first). MPLS TE-LSPs can be configured within a single domain, e.g., area, level, or AS, or may also span multiple domains, e.g., areas, levels, or ASes.
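By way of illustration only, a constrained shortest path first computation of the kind mentioned above may be sketched as follows. The sketch considers a single constraint (required bandwidth) and assumes that links lacking sufficient available bandwidth are pruned before a conventional shortest-path search is run on the remainder; the link representation and the function name cspf are hypothetical, and affinities and other constraints are omitted.

```python
import heapq


def cspf(links, source, target, required_bw):
    """Return (cost, path) for the lowest-metric path meeting the bandwidth
    constraint, or None if no such path exists.

    links: dict mapping a directed link (u, v) to (metric, available_bw).
    This structure is an assumption made for the example.
    """
    # Step 1: prune links that cannot satisfy the bandwidth constraint.
    adjacency = {}
    for (u, v), (metric, available_bw) in links.items():
        if available_bw >= required_bw:
            adjacency.setdefault(u, []).append((v, metric))

    # Step 2: shortest-path (Dijkstra) search over the pruned topology.
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None
```

In this sketch, a request whose required bandwidth exceeds the available bandwidth of a link on the otherwise shortest path is routed over a longer path that satisfies the constraint, and a request that no path can satisfy yields no result.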
The PCE is an entity having the capability to compute paths between any nodes of which the PCE is aware in an AS or area. PCEs are especially useful in that they are more cognizant of network traffic and path selection within their AS or area, and thus may be used for more optimal path computation. A head-end LSR may further operate as a path computation client (PCC) configured to send a path computation request to the PCE, and receive a response with the computed path, potentially taking into consideration other path computation requests from other PCCs. It is important to note that when one PCE sends a request to another PCE, it acts as a PCC.
Some applications may incorporate unidirectional data flows configured to transfer time-sensitive traffic from a source (sender) in a computer network to a destination (receiver) in the network in accordance with a certain “quality of service” (QoS). Here, network resources may be reserved for the unidirectional flow to ensure that the QoS associated with the data flow is maintained. The Resource ReSerVation Protocol (RSVP) is a network-control protocol that enables applications to reserve resources in order to obtain special QoS for their data flows. RSVP works in conjunction with routing protocols to, e.g., reserve resources for a data flow in a computer network in order to establish a level of QoS required by the data flow. RSVP is defined in R. Braden, et al., Resource ReSerVation Protocol (RSVP), RFC 2205, the contents of which are hereby incorporated by reference in their entirety. In the case of traffic engineering applications, RSVP signaling (with Traffic Engineering extensions) is used to establish a TE-LSP, whose path may have been computed by various means and which obeys the set of required constraints, and to convey various TE-LSP attributes to routers, such as border routers, along the TE-LSP.
Generally, a tunnel is a logical structure that encapsulates a packet (a header and data) of one protocol inside a data field of another protocol packet with a new header. In this manner, the encapsulated data may be transmitted through networks that it would otherwise not be capable of traversing. More importantly, a tunnel creates a transparent virtual network link between two network nodes that is generally unaffected by physical network links or devices (i.e., the physical network links or devices merely forward the encapsulated packet based on the new header). While one example of a tunnel is an MPLS TE-LSP, other known tunneling methods include, inter alia, the Layer Two Tunnel Protocol (L2TP), the Point-to-Point Tunneling Protocol (PPTP), and IP tunnels.
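By way of illustration only, the encapsulation described above may be sketched as follows. The Packet type, the header strings, and the function names encapsulate and decapsulate are hypothetical and do not correspond to any particular tunneling protocol.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """A simplified packet: a header plus a data field.

    The data field holds either raw bytes or another (encapsulated) Packet.
    """
    header: str
    data: Union[bytes, "Packet"]


def encapsulate(inner, tunnel_header):
    """Place the entire inner packet in the data field of a new packet."""
    return Packet(header=tunnel_header, data=inner)


def decapsulate(outer):
    """Recover the original packet at the far end of the tunnel."""
    if not isinstance(outer.data, Packet):
        raise ValueError("packet does not carry an encapsulated packet")
    return outer.data
```

In this sketch, intermediate devices would examine only the header of the outer packet, and the inner packet emerges from the tunnel unchanged.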
Occasionally, a network element (e.g., a node or link) will fail, causing redirection of the traffic that originally traversed the failed network element to other network elements that bypass the failure. Generally, notice of this failure is relayed to the nodes in the network through an advertisement of the new network topology, e.g., an IGP or BGP Advertisement, and routing tables are updated to avoid the failure accordingly. Reconfiguring a network in response to a network element failure using, e.g., pure IP rerouting, can be time consuming. Many recovery techniques, however, are available to provide fast recovery and/or network configuration in the event of a network element failure, including, inter alia, “Fast Reroute”, e.g., MPLS TE Fast Reroute. An example of MPLS TE Fast Reroute is described in Pan, et al., Fast Reroute Extensions to RSVP-TE for LSP Tunnels, RFC 4090, May 2005, which is hereby incorporated by reference as though fully set forth herein.
Fast Reroute (or FRR) has been widely deployed to protect against network element failures, where “backup tunnels” are created to bypass one or more protected network elements (e.g., links, shared risk link groups (SRLGs), and nodes). When the network element fails, traffic is quickly diverted (“Fast Rerouted”) over a backup tunnel to bypass the failed element, or more particularly, in the case of MPLS, a set of primary TE-LSPs (tunnels) is quickly diverted. Specifically, the point of local repair (PLR) node configured to reroute the traffic inserts (“pushes”) a new label for the backup tunnel, and the traffic is diverted accordingly. Once the failed element is bypassed, the backup tunnel label is removed (“popped”), and the traffic is routed along the original path according to the next label (e.g., that of the original TE-LSP). Notably, the backup tunnel, in addition to bypassing the failed element along a protected primary TE-LSP, also intersects the primary TE-LSP, i.e., it begins and ends at nodes along the protected primary TE-LSP.
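By way of illustration only, the label operations described above may be sketched as follows. The sketch models a label stack as a list whose last element is the top of the stack, and assumes for simplicity that the backup tunnel label is removed at the node where the backup tunnel rejoins the primary TE-LSP; the label values and the function names plr_forward and merge_point_receive are hypothetical.

```python
def plr_forward(label_stack, backup_label, element_failed):
    """Return the label stack sent by the point of local repair (PLR).

    If the protected element has failed, the backup tunnel label is pushed
    on top of the existing stack; otherwise the stack is left unchanged.
    """
    if element_failed:
        return label_stack + [backup_label]  # "push"
    return list(label_stack)


def merge_point_receive(label_stack, backup_label):
    """Return the label stack after the failed element has been bypassed.

    The backup tunnel label, if present on top, is removed so that the
    traffic continues according to the next label (that of the primary).
    """
    if label_stack and label_stack[-1] == backup_label:
        return label_stack[:-1]  # "pop"
    return list(label_stack)
```

In this sketch, traffic carrying primary label 17 leaves the PLR with the stack [17, 99] after a failure, where 99 is the backup tunnel label, and resumes along the original path with the stack [17] once the backup label is popped.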
To offer maximum protection, e.g., guaranteed bandwidth, during Fast Reroute, backup tunnels may reserve a configurable amount of bandwidth to ensure that Fast Rerouted traffic from the primary tunnel has a reserved path to follow. For example, the bandwidth reserved for the primary tunnel may also be reserved on the backup tunnel. While this approach provides maximum protection, it also requires a non-negligible amount of network resources (e.g., capacity/bandwidth) and may increase operational complexity.
Certain techniques are available to efficiently minimize the amount of resources required by the establishment and maintenance of the backup tunnels for Fast Reroute. One such technique is to create zero-bandwidth (“0-BW”) backup tunnels (i.e., tunnels that reserve no bandwidth) to protect non-0-BW primary tunnels. This “best effort” approach does not guarantee that the path followed by the backup tunnel will have enough bandwidth to support the diverted primary tunnel at the time of failure without QoS degradation; in many situations, however, the path quality of the backup tunnel is sufficient. For instance, if the network is not overly congested, or the backup tunnel follows a non-congested path, there may be enough available bandwidth along the backup tunnel to support the newly rerouted traffic. Also, because primary tunnels often reserve bandwidth in response to “peak” traffic utilization, the amount of traffic over the primary tunnel at the time of failure may be far less than the reserved bandwidth (e.g., at “off-peak” times). Notably, those skilled in the art will understand that in the absence of the above exceptions, a 0-BW backup tunnel may have insufficient bandwidth to support the diverted traffic (e.g., degrading path quality).
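By way of illustration only, the circumstances under which a 0-BW backup tunnel may suffice can be sketched as follows. The sketch assumes that the bandwidth available along the backup path is limited by its most constrained link and that sufficiency is judged against the traffic actually carried at the time of failure rather than the reserved bandwidth; the function name backup_can_carry and all figures are hypothetical.

```python
def backup_can_carry(backup_link_available_bw, diverted_traffic):
    """Return True if every link of the backup path has enough available
    bandwidth for the diverted traffic.

    backup_link_available_bw: available bandwidth of each link along the
    backup tunnel's path (any consistent unit, e.g., Mb/s).
    diverted_traffic: traffic actually carried by the primary tunnel at the
    time of failure, in the same unit.
    """
    if not backup_link_available_bw:
        raise ValueError("backup path must contain at least one link")
    bottleneck = min(backup_link_available_bw)
    return bottleneck >= diverted_traffic
```

For example, a primary tunnel that reserved 100 Mb/s for peak utilization but carries only 30 Mb/s at an off-peak time of failure could be supported by a backup path whose most constrained link has 40 Mb/s available, whereas the same backup path could not support the full 100 Mb/s.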
Currently, head-end nodes (LSRs) may be configured to systematically reroute the primary tunnels affected by the network element failure (e.g., diverted primary tunnels), especially where 0-BW backup tunnels are used, e.g., by reestablishing a new primary tunnel that follows a path excluding the failed network element. In particular, 0-BW backup tunnels represent a best effort attempt to allow the head-end node to more gracefully reestablish the primary tunnel in response to a failure, since the backup tunnels may not be able to support the diverted traffic without QoS degradation. The systematic reestablishing may potentially result in the reestablishment of a large number of primary tunnels (e.g., up to 3000 for a single network element failure in today's networks). Notably, reestablishing diverted primary tunnels may be undesirable for the network, such as by creating traffic churn, jitter, control plane overloads, etc., as will be understood by those skilled in the art. However, as noted above, there are situations where the backup tunnel may provide acceptable bandwidth, at least for a period of time (possibly short) until the failed network element is restored. In these situations, it may be unnecessary to reestablish the diverted primary tunnels. There remains a need, therefore, for a technique that dynamically determines whether to reestablish a diverted primary tunnel based on path quality feedback of a utilized backup tunnel in a computer network.