1. Field of the Invention
The present invention relates to computer networks and more particularly to activation of a secondary Traffic Engineering Label Switched Path (TE-LSP) upon failure of a primary TE-LSP, the primary and secondary TE-LSPs having separate head-end nodes in a computer network.
2. Background Information
A computer network is a geographically distributed collection of nodes interconnected by communication links and segments for transporting data between end nodes, such as personal computers and workstations. Many types of networks are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). LANs typically connect the nodes over dedicated private communications links located in the same general physical location, such as a building or campus. WANs, on the other hand, typically connect geographically dispersed nodes over long-distance communications links, such as common carrier telephone lines, optical lightpaths, synchronous optical networks (SONET), or synchronous digital hierarchy (SDH) links. The Internet is an example of a WAN that connects disparate networks throughout the world, providing global communication between nodes on various networks. The nodes typically communicate over the network by exchanging discrete frames or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other. Computer networks may be further interconnected by an intermediate network node, such as a router, to extend the effective “size” of each network.
Since management of interconnected computer networks can prove burdensome, smaller groups of computer networks may be maintained as routing domains or autonomous systems. The networks within an autonomous system (AS) are typically coupled together by conventional “intradomain” routers configured to execute intradomain routing protocols, and are generally subject to a common authority. To improve routing scalability, a service provider (e.g., an ISP) may divide an AS into multiple “areas” or “levels.” It may be desirable, however, to increase the number of nodes capable of exchanging data; in this case, interdomain routers executing interdomain routing protocols are used to interconnect nodes of the various ASes. Moreover, it may be desirable to interconnect various ASes that operate under different administrative domains. As used herein, an AS, area, or level is generally referred to as a “domain,” and a router that interconnects different domains is generally referred to as a “border router.”
An example of an inter-domain routing protocol is the Border Gateway Protocol version 4 (BGP), which performs routing between domains (ASes) by exchanging routing and reachability information among neighboring inter-domain routers of the systems. An adjacency is a relationship formed between selected neighboring (peer) routers for the purpose of exchanging routing information messages and abstracting the network topology. The routing information exchanged by BGP peer routers typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions. Examples of such destination addresses include IP version 4 (IPv4) and version 6 (IPv6) addresses. BGP generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/session. The BGP protocol is well known and generally described in Request for Comments (RFC) 1771, entitled A Border Gateway Protocol 4 (BGP-4), published March 1995.
Examples of an intradomain routing protocol, or an interior gateway protocol (IGP), are the Open Shortest Path First (OSPF) routing protocol and the Intermediate-System-to-Intermediate-System (IS-IS) routing protocol. The OSPF and IS-IS protocols are based on link-state technology and, therefore, are commonly referred to as link-state routing protocols. Link-state protocols define the manner with which routing information and network-topology information are exchanged and processed in a domain. This information is generally directed to an intradomain router's local state (e.g., the router's usable interfaces and reachable neighbors or adjacencies). The OSPF protocol is described in RFC 2328, entitled OSPF Version 2, dated April 1998 and the IS-IS protocol used in the context of IP is described in RFC 1195, entitled Use of OSI IS-IS for routing in TCP/IP and Dual Environments, dated December 1990, both of which are hereby incorporated by reference.
An intermediate network node often stores its routing information in a routing table maintained and managed by a routing information base (RIB). The routing table is a searchable data structure in which network addresses are mapped to their associated routing information. However, those skilled in the art will understand that the routing table need not be organized as a table, and alternatively may be another type of searchable data structure. Although the intermediate network node's routing table may be configured with a predetermined set of routing information, the node also may dynamically acquire (“learn”) network routing information as it sends and receives data packets. When a packet is received at the intermediate network node, the packet's destination address may be used to identify a routing table entry containing routing information associated with the received packet. Among other things, the packet's routing information indicates the packet's next-hop address.
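By way of illustration only, the lookup of a routing table entry from a packet's destination address may be sketched as follows. The sketch assumes a longest-prefix-match rule over address prefixes; the function name lookup, the table contents, and the next-hop names R1 through R3 are hypothetical and do not appear in any referenced document.

```python
import ipaddress

def lookup(routing_table, destination):
    """Return the next hop of the most specific prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, next_hop) of the longest matching prefix so far
    for prefix, next_hop in routing_table.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best[1] if best else None

# Hypothetical routing table: prefix -> next hop (illustrative values only).
table = {
    "0.0.0.0/0": "R1",
    "198.51.100.0/24": "R2",
    "198.51.100.128/25": "R3",
}
```

A dictionary scanned linearly is used here for brevity; as noted above, the routing table may be any searchable data structure.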
To ensure that its routing table contains up-to-date routing information, the intermediate network node may cooperate with other intermediate nodes to disseminate routing information representative of the current network topology. For example, suppose the intermediate network node detects that one of its neighboring nodes (i.e., adjacent network nodes) becomes unavailable, e.g., due to a link failure or the neighboring node going “off-line,” etc. In this situation, the intermediate network node can update the routing information stored in its routing table to ensure that data packets are not routed to the unavailable network node. Furthermore, the intermediate node also may communicate this change in network topology to the other intermediate network nodes so they, too, can update their local routing tables and bypass the unavailable node. In this manner, each of the intermediate network nodes becomes “aware” of the change in topology.
Typically, routing information is disseminated among the intermediate network nodes in accordance with a predetermined network communication protocol, such as a link-state protocol (e.g., IS-IS, or OSPF). Conventional link-state protocols use link-state advertisements or link-state packets (or “IGP Advertisements”) for exchanging routing information between interconnected intermediate network nodes (IGP nodes). As used herein, an IGP Advertisement generally describes any message used by an IGP routing protocol for communicating routing information among interconnected IGP nodes, i.e., routers and switches. Operationally, a first IGP node may generate an IGP Advertisement and “flood” (i.e., transmit) the packet over each of its network interfaces coupled to other IGP nodes. Thereafter, a second IGP node may receive the flooded IGP Advertisement and update its routing table based on routing information contained in the received IGP Advertisement. Next, the second IGP node may flood the received IGP Advertisement over each of its network interfaces, except for the interface at which the IGP Advertisement was received. This flooding process may be repeated until each interconnected IGP node has received the IGP Advertisement and updated its local routing table.
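The flooding process described above may be sketched, under simplifying assumptions, as follows. The sketch assumes that a node silently ignores an IGP Advertisement it has already processed (actual link-state protocols rely on sequence numbers and acknowledgments for this purpose); the function name flood and the four-node topology are hypothetical.

```python
from collections import deque

def flood(topology, origin):
    """Simulate flooding; return, per node, the neighbor it first heard from."""
    received_from = {origin: None}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in topology[node]:
            # Never flood back over the interface the advertisement arrived on.
            if neighbor == received_from[node]:
                continue
            # Assumption: a node that already holds the advertisement drops duplicates.
            if neighbor in received_from:
                continue
            received_from[neighbor] = node
            queue.append(neighbor)
    return received_from

# Hypothetical topology: node -> list of directly connected IGP nodes.
topology = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}
result = flood(topology, "A")
```

In this sketch the process terminates once every interconnected node holds the advertisement, which corresponds to the repetition described above.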
In practice, each IGP node typically generates and disseminates an IGP Advertisement whose routing information includes a list of the intermediate node's neighboring network nodes and one or more “cost” values associated with each neighbor. As used herein, a cost value associated with a neighboring node is an arbitrary metric used to determine the relative ease/burden of communicating with that node. For instance, the cost value may be measured in terms of the number of hops required to reach the neighboring node, the average time for a packet to reach the neighboring node, the amount of network traffic or available bandwidth over a communication link coupled to the neighboring node, etc.
As noted, IGP Advertisements are usually flooded until each intermediate network IGP node has received an IGP Advertisement from each of the other interconnected intermediate nodes. Then, each of the IGP nodes (e.g., in a link-state protocol) can construct the same “view” of the network topology by aggregating the received lists of neighboring nodes and cost values. To that end, each IGP node may input this received routing information to a “shortest path first” (SPF) calculation that determines the lowest-cost network paths that couple the intermediate node with each of the other network nodes. For example, the Dijkstra algorithm is a conventional technique for performing such an SPF calculation, as described in more detail in Section 12.2.4 of the text book Interconnections Second Edition, by Radia Perlman, published September 1999, which is hereby incorporated by reference as though fully set forth herein. Each IGP node updates the routing information stored in its local routing table based on the results of its SPF calculation. More specifically, the RIB updates the routing table to correlate destination nodes with next-hop interfaces associated with the lowest-cost paths to reach those nodes, as determined by the SPF calculation.
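A minimal sketch of such an SPF calculation, using the Dijkstra algorithm over an aggregated list of neighbors and cost values, is given below. The function name spf, the graph representation, and the cost values are assumptions made for illustration; the sketch is not taken from the referenced text.

```python
import heapq

def spf(graph, source):
    """Return (lowest cost to each node, first hop from source toward each node)."""
    dist = {source: 0}
    next_hop = {}
    visited = set()
    heap = [(0, source, None)]  # (path cost, node, first hop on that path)
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in visited:
            continue  # a cheaper path to this node was already finalized
        visited.add(node)
        if first_hop is not None:
            next_hop[node] = first_hop
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if neighbor not in dist or new_cost < dist[neighbor]:
                dist[neighbor] = new_cost
                # A direct neighbor of the source is its own first hop.
                hop = neighbor if node == source else first_hop
                heapq.heappush(heap, (new_cost, neighbor, hop))
    return dist, next_hop

# Hypothetical link-state view: node -> {neighbor: cost}.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, next_hop = spf(graph, "A")
```

The next_hop mapping corresponds to the correlation of destination nodes with next-hop interfaces that the RIB records in the routing table.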
Multi-Protocol Label Switching (MPLS) Traffic Engineering has been developed to meet data networking requirements such as guaranteed available bandwidth or fast restoration. MPLS Traffic Engineering exploits modern label switching techniques to build end-to-end tunnels based on a series of constraints through an IP/MPLS network of label switched routers (LSRs). These tunnels are a type of label switched path (LSP) and thus are generally referred to as MPLS Traffic Engineering (TE) LSPs. Examples of MPLS TE can be found in RFC 3209, entitled RSVP-TE: Extensions to RSVP for LSP Tunnels dated December 2001, RFC 3784 entitled Intermediate-System-to-Intermediate-System (IS-IS) Extensions for Traffic Engineering (TE) dated June 2004, and RFC 3630, entitled Traffic Engineering (TE) Extensions to OSPF Version 2 dated September 2003, the contents of all of which are hereby incorporated by reference in their entirety.
Establishment of an MPLS TE-LSP from a head-end LSR to a tail-end LSR involves computation of a path through a network of LSRs. Optimally, the computed path is the “shortest” path, as measured in some metric, that satisfies all relevant LSP Traffic Engineering constraints, such as required bandwidth, “affinities” (administrative constraints to avoid or include certain links), etc. Path computation can either be performed by the head-end LSR or by some other entity operating as a path computation element (PCE) not co-located on the head-end LSR. The head-end LSR (or a PCE) exploits its knowledge of network topology and resources available on each link to perform the path computation according to the LSP Traffic Engineering constraints. Various path computation methodologies are available including CSPF (constrained shortest path first). MPLS TE-LSPs can be configured within a single domain, e.g., area, level, or AS, or may also span multiple domains, e.g., areas, levels, or ASes.
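One common way to realize a CSPF computation is to prune links that cannot satisfy the constraints and then run a shortest path calculation over the remaining topology. The sketch below assumes a single constraint (required bandwidth) and treats each link as bidirectional with one available-bandwidth value; the function name cspf and all link values are hypothetical.

```python
import heapq

def cspf(links, source, dest, required_bw):
    """Return (cost, path) of the cheapest path whose links all offer required_bw."""
    adjacency = {}
    for u, v, cost, available_bw in links:
        if available_bw < required_bw:
            continue  # prune links that violate the bandwidth constraint
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))
    visited = set()
    heap = [(0, source, [source])]  # (path cost, node, path so far)
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dest:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(heap, (cost + link_cost, neighbor, path + [neighbor]))
    return None  # no path satisfies the constraint

# Hypothetical links: (node, node, cost, available bandwidth).
links = [
    ("A", "B", 1, 100),
    ("B", "D", 1, 20),
    ("A", "C", 2, 100),
    ("C", "D", 2, 100),
]
```

With these values, a request for 10 units follows the cheaper path through B, whereas a request for 50 units is forced onto the path through C because the B-D link is pruned.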
The PCE is an entity having the capability to compute paths between any nodes of which the PCE is aware in an AS or area. PCEs are especially useful in that they are more cognizant of network traffic and path selection within their AS or area, and thus may be used for more optimal path computation. A head-end LSR may further operate as a path computation client (PCC) configured to send a path computation request to the PCE, and receive a response with the computed path, which potentially takes into consideration other path computation requests from other PCCs. It is important to note that when one PCE sends a request to another PCE, it acts as a PCC. A PCC can be informed of a PCE either by pre-configuration by an administrator, or by a PCE Discovery (PCED) message (“advertisement”), which is sent from the PCE within its area or level or across the entire AS to advertise its services.
Some applications may incorporate unidirectional data flows configured to transfer time-sensitive traffic from a source (sender) in a computer network to a destination (receiver) in the network in accordance with a certain “quality of service” (QoS). Here, network resources may be reserved for the unidirectional flow to ensure that the QoS associated with the data flow is maintained. The Resource ReSerVation Protocol (RSVP) is a network-control protocol that enables applications to reserve resources in order to obtain special QoS for their data flows. RSVP works in conjunction with routing protocols to, e.g., reserve resources for a data flow in a computer network in order to establish a level of QoS required by the data flow. RSVP is defined in R. Braden, et al., Resource ReSerVation Protocol (RSVP), RFC 2205, the contents of which are hereby incorporated by reference in their entirety. In the case of traffic engineering applications, RSVP signaling is used to establish a TE-LSP and to convey various TE-LSP attributes to routers, such as border routers, along a TE-LSP that obeys the set of required constraints and whose path may have been computed by various means.
Generally, a tunnel is a logical structure that encapsulates a packet (a header and data) of one protocol inside a data field of another protocol packet with a new header. In this manner, the encapsulated data may be transmitted through networks that it would otherwise not be capable of traversing. More importantly, a tunnel creates a transparent virtual network link between two network nodes that is generally unaffected by physical network links or devices (i.e., the physical network links or devices merely forward the encapsulated packet based on the new header). While one example of a tunnel is an MPLS TE-LSP, other known tunneling methods include, inter alia, the Layer Two Tunnel Protocol (L2TP), the Point-to-Point Tunneling Protocol (PPTP), and IP tunnels.
A common practice in TE-enabled networks consists of deploying a mesh of TE-LSPs between a plurality of edge devices (provider edge, or PE routers) through a core network of fewer (generally large capacity) routers (provider, or P routers). In a mesh between PE routers (e.g., a “full mesh”), each PE router on one side of the core is connected to each PE router on the other side of the core via one or more TE-LSPs. The mesh of TE-LSPs provides various benefits within the network, as known to those skilled in the art, such as redundancy for nodes connected to more than one PE router.
Occasionally, a network element (e.g., a node or link) fails, causing redirection of the traffic that originally traversed the failed network element to other network elements that bypass the failure. Generally, notice of this failure is relayed to the nodes in the same domain through an advertisement of the new network topology, e.g., an IGP Advertisement, and routing tables are updated to avoid the failure accordingly. Reconfiguring a network in response to a network element failure using, e.g., pure IP rerouting, can be time consuming. Many recovery techniques, however, are available to provide fast recovery and/or network configuration in the event of a network element failure, including, inter alia, Fast Reroute (FRR), e.g., MPLS TE FRR. An example of MPLS TE FRR is described in Pan, et al., Fast Reroute Extensions to RSVP-TE for LSP Tunnels, RFC 4090, dated May 2005, which is hereby incorporated by reference as though fully set forth herein.
FRR has been widely deployed to protect against network element failures, where “backup” or “secondary tunnels” are created and set up a priori (before the occurrence of the failure) to bypass a protected network element (e.g., links, shared risk link groups (SRLGs), and nodes). When the network element fails, traffic is quickly rerouted over a backup tunnel to bypass the failed element, or more particularly, in the case of MPLS, a set of TE-LSP(s) is quickly rerouted. Specifically, the point of local repair (PLR) configured to reroute the traffic inserts (“pushes”) a new label for the backup tunnel, and the traffic is rerouted accordingly. Once the failed element is bypassed, the backup tunnel label is removed (“popped”), and the traffic is routed along the original path according to the next label (e.g., that of the original TE-LSP, or that expected by the node receiving the rerouted TE-LSP). Notably, the backup tunnel, in addition to bypassing the failed element along a protected primary TE-LSP, also intersects the primary TE-LSP, i.e., it begins and ends at nodes along the protected primary TE-LSP. As such, protection of head-end nodes (and tail-end nodes) may be difficult to accomplish.
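The label operations described above may be sketched as manipulations of a label stack. This is a simplified illustration that models only the push at the PLR and the pop at the end of the backup tunnel; the function names and the label values 17 and 900 are hypothetical.

```python
def push_backup_label(label_stack, backup_label):
    """At the PLR: place the backup tunnel label on top of the existing stack."""
    return [backup_label, *label_stack]

def pop_backup_label(label_stack):
    """At the end of the backup tunnel: remove the top label, exposing the next one."""
    top, *rest = label_stack
    return top, rest

# Hypothetical labels: 17 for the protected primary TE-LSP, 900 for the backup tunnel.
primary_stack = [17]
rerouted_stack = push_backup_label(primary_stack, 900)
popped, restored_stack = pop_backup_label(rerouted_stack)
```

After the pop, the remaining stack again carries only the label of the primary TE-LSP, so forwarding continues along the original path.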
One example, however, of a head-end node protection scheme is described in commonly-owned copending U.S. patent application Ser. No. 11/334,151, entitled PROTECTION AGAINST FAILURE OF A HEAD-END NODE OF ONE OR MORE TE-LSPS, filed by Vasseur on Jan. 18, 2006, the contents of which are hereby incorporated in their entirety. For instance, a prior-hop node to the head-end node, or an “up-stream neighboring node,” creates tunnels to a “next-next-hop” node from the prior-hop node (i.e., a “downstream neighboring node” of the head-end node) along the TE-LSP. In this manner, should the head-end node fail, techniques are described to allow the prior-hop node to perform FRR around the failed head-end node directly to the next-next-hop node.
Another example of head-end node protection utilizes a plurality of redundant head-end nodes. For instance, an ATM (Asynchronous Transfer Mode) switch or a media gateway (e.g., Voice over IP, “VoIP,” Video on Demand, “VoD,” etc.) may often be redundantly connected to PE routers of a provider network. (Notably, as will be understood by those skilled in the art, the ATM switch may be connected as a “pseudowire” arrangement, where ATM frames are encapsulated as MPLS IP packets to traverse an MPLS provider core network.) Each PE router may be the head-end of a separate (and redundant) TE-LSP to a particular destination. In the event of a failure of a first (primary) head-end node of a primary tunnel (TE-LSP), the ATM switch or media gateway (the prior-hop node) redirects traffic to a second (secondary) redundant head-end node of a second TE-LSP accordingly. This solution works particularly well where TE-LSPs are already deployed in the network (e.g., a full or partial mesh), such that the primary and secondary TE-LSPs already exist within the network. Specifically, the primary and secondary TE-LSPs already reserve an amount of bandwidth (BW) required for the traffic and the redirected traffic. Without a pre-deployed TE-LSP arrangement, then, the secondary TE-LSP reserves an amount of BW that would accommodate the primary TE-LSP traffic, even though the traffic is not yet utilizing the secondary TE-LSP. This “double booking” of resources may be considered to be wasteful by network administrators, and thus should be avoided. Notably, network administrators may recognize that “1 for 1” redundancy attributed to “double booking” (one backup/secondary for each primary) is more expensive than “N for 1” redundancy (one backup/secondary for multiple primaries) that is typically provided by packet switched networks.
An alternative to double-booking resources is to configure the secondary TE-LSP with zero BW (i.e., the secondary TE-LSP is signaled and has a state associated therewith, but reserves no BW resources). While a zero BW TE-LSP provides a redundant TE-LSP, it does not guarantee BW along the TE-LSP, and thus does not provide a particular benefit of a TE-LSP. One solution to this problem is to utilize an automatic BW adjustment/resizing technique, such as the “auto-bandwidth” or “auto-BW” technique. The auto-BW technique and dynamically sized TE-LSPs are described further in “Cisco MPLS AutoBandwidth Allocator for MPLS Traffic Engineering: A Unique New Feature of Cisco IOS Software,” a White Paper published by Cisco Systems, Inc., 2001, the contents of which are hereby incorporated by reference in their entirety. Specifically, a TE-LSP (e.g., the redundant zero BW TE-LSP) may periodically adjust the amount of reserved BW to accommodate the amount of traffic currently utilizing the TE-LSP, e.g., eventually resizing a zero-BW TE-LSP to the full amount of the original primary TE-LSP due to the redirected traffic. However, those skilled in the art will understand that the auto-BW technique may be inefficient and slow, due to the measurement of data, calculation of data, and gradual adjustments of the BW. In other words, while the secondary zero-BW TE-LSP may eventually reserve the amount of BW reserved for the primary TE-LSP, the reservation is not immediate (e.g., on the order of several minutes), possibly resulting in other problems, such as lost traffic, degraded quality of service, inability to locate sufficient BW capacity, etc., as will be understood by those skilled in the art.
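The delay inherent in a periodic adjustment scheme may be illustrated with the following sketch. It assumes, purely for illustration, that the reservation changes only once per adjustment interval and is then set to the highest traffic sample measured during that interval; this is not asserted to be the algorithm of the referenced White Paper, and the function name auto_bw and all sample values are hypothetical.

```python
def auto_bw(samples, samples_per_adjustment, initial_bw=0):
    """Return the reserved BW in effect after each traffic sample."""
    reserved = initial_bw
    window = []   # samples measured since the last adjustment
    history = []  # reserved BW after each sample
    for rate in samples:
        window.append(rate)
        if len(window) == samples_per_adjustment:
            # Assumption: resize to the peak sample seen in the interval.
            reserved = max(window)
            window = []
        history.append(reserved)
    return history

# Traffic is redirected onto the zero-BW secondary TE-LSP at the fourth sample.
samples = [0, 0, 0, 100, 100, 100, 100]
history = auto_bw(samples, samples_per_adjustment=3)
```

In this sketch the secondary TE-LSP carries 100 units of traffic for two sample periods while still reserving zero BW, which corresponds to the non-immediate reservation noted above.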
In addition, there are many types of primary TE-LSP “failures” that may result in the primary TE-LSP remaining intact. For example, a link failure between the prior-hop node and the primary head-end node may not cause the primary TE-LSP to be torn down. While the primary head-end node maintains the primary TE-LSP, the prior-hop node may direct traffic to the secondary head-end node's TE-LSP. The reserved resources of the primary TE-LSP are thus unutilized and wasted, and may preclude the establishment of other TE-LSPs that share common resources between the primary head-end node and tail-end node.
There remains a need, therefore, for a system and method for protecting against a failure of a TE-LSP, including the head-end node of the TE-LSP, that does not require double-booking of network resources. In particular, a need remains to more rapidly adjust the reserved BW of a secondary TE-LSP to accommodate rapidly redirected traffic due to the failure.