1. Field of the Invention
The present invention relates to computer networks and more particularly to traffic engineering (TE) between customer edge devices (CEs) across a provider network in a computer network.
2. Background Information
A computer network is a geographically distributed collection of interconnected subnetworks, such as local area networks (LAN) that transport data between network nodes. As used herein, a network node is any device adapted to send and/or receive data in the computer network. Thus, in this context, “node” and “device” may be used interchangeably. The network topology is defined by an arrangement of network nodes that communicate with one another, typically through one or more intermediate nodes, such is as routers and switches. In addition to intra-network communications, data also may be exchanged between neighboring (i.e., adjacent) networks. To that end, “edge devices” located at the logical outer-bound of the computer network may be adapted to send and receive inter-network communications. Both inter-network and intra-network communications are typically effected by exchanging discrete packets of data according to predefined protocols. In this context, a protocol consists of a set of rules defining how network nodes interact with each other.
Each data packet typically comprises “payload” data prepended (“encapsulated”) by at least one network header formatted in accordance with a network communication protocol. The network headers include information that enables network nodes to efficiently route the packet through the computer network. Often, a packet's network headers include a data-link (layer 2) header, an internetwork (layer 3) header and a transport (layer 4) header as defined by the Transmission Control Protocol/Internet Protocol (TCP/IP) Reference Model. The TCP/IP Reference Model is generally described in more detail in Section 1.4.2 of the reference book entitled Computer Networks, Fourth Edition, by Andrew Tanenbaum, published 2003, which is hereby incorporated by reference as though fully set forth herein. A data packet may originate at a source node and subsequently “hop” from node to node along a logical data path until it reaches its addressed destination node. The network addresses defining the logical data path of a data flow are most often stored as Internet Protocol (IP) addresses in the packet's internetwork header.
A computer network may contain smaller groups of one or more subnetworks which may be managed as separate routing domains. As used herein, a routing domain is broadly construed as a collection of interconnected network nodes under a common administration. Often, a routing domain is managed by a single administrative entity, such as a company, an academic institution or a branch of government. Such a centrally-managed routing domain is sometimes referred to as an “autonomous system” or AS. In general, a routing domain may operate as an enterprise network, a service provider or any other type of network or subnetwork. Further, the routing domain may contain one or more edge devices having “peer” connections to edge devices in adjacent routing domains.
Network nodes within a routing domain are typically configured to forward data using predetermined paths from “interior gateway” routing protocols, such as conventional link-state protocols and distance-vector protocols. These interior gateway protocols (IGPs) define the manner with which routing information and network-topology information are exchanged and processed in the routing domain. The routing information exchanged (e.g., by IGP messages) typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions. Examples of such destination addresses include IP version 4 (IPv4) and version 6 (IPv6) addresses. As such, each intermediate node receives a consistent “view” of the domain's topology. Examples of link-state and distance-vectors protocols known in the art, such as the Open Shortest Path First (OSPF) protocol and Routing Information Protocol (RIP), are described in Sections 12.1-12.3 of the reference book entitled Interconnections, Second Edition, by Radia Perlman, published January 2000, which is hereby incorporated by reference as though fully set forth herein.
The Border Gateway Protocol (BGP) is usually employed as an “external gateway” routing protocol for routing data between autonomous systems. BGP is well known and generally described in Request for Comments (RFC) 1771, entitled A Border Gateway Protocol 4 (BGP-4), by Y. Rekhter et al., published March 1995, which is publicly available through the Internet Engineering Task Force (IETF) and is hereby incorporated by reference in its entirety. External (or exterior) BGP (eBGP) is often used to exchange routing information across routing domain boundaries. Internal BGP (iBGP) is a variation of the eBGP protocol and is often used to distribute inter-network reachability information (address prefixes) among BGP-enabled edge devices situated within the same routing domain. BGP generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/BGP session. BGP also may be extended for compatibility with services other than standard Internet connectivity. For instance, Multi-Protocol BGP (MP-BGP) supports various address family identifier (AFI) fields that permit BGP messages to transport multi-protocol information, such as is the case with RFC 2547 services, discussed below.
A network node within a routing domain may detect a change in the domain's topology. For example, the node may become unable to communicate with one of its neighboring nodes, e.g., due to a link failure between the nodes or the neighboring node failing, such as going “off line,” etc. If the detected node or link failure occurred within the routing domain, the detecting node may advertise the intra-domain topology change to other nodes in the domain using IGP messages. Similarly, if an edge device detects a node or link failure that prevents communications with a neighboring routing domain, the edge device may disseminate the inter-domain topology change to other edge devices within its routing domain (e.g., using the iBGP protocol). In either case, propagation of the network-topology change occurs within the routing domain and nodes in the domain thus converge on a consistent view of the new network topology, i.e., without the failed node or link.
A virtual private network (VPN) is a collection of network nodes that establish private communications over a shared backbone network. Previously, VPNs were implemented by embedding private leased lines in the shared network. The leased lines (i.e., communication links) were reserved only for network traffic among those network nodes participating in the VPN. Today, the above-described VPN implementation has been mostly replaced by private “virtual circuits” deployed in public networks. Specifically, each virtual circuit defines a logical end-to-end data path between a pair of network nodes participating in the VPN. When the pair of nodes is located in different routing domains, edge devices in a plurality of interconnected routing domains may have to co-operate to establish the nodes' virtual circuit.
A virtual circuit may be established using, for example, conventional layer-2 Frame Relay (FR) or Asynchronous Transfer Mode (ATM) networks. Alternatively, the virtual circuit may “tunnel” data between its logical end points using known layer-2 and/or layer-3 tunneling protocols, such as the Layer-2 Tunneling Protocol (L2TP) and the Generic Routing Encapsulation (GRE) protocol. In this case, one or more tunnel headers are prepended to a data packet to appropriately route the packet along the virtual circuit. The Multi-Protocol Label Switching (MPLS) protocol may be used as a tunneling mechanism for establishing layer-2 virtual circuits or layer-3 network-based VPNs through an IP network.
MPLS Traffic Engineering has been developed to meet data networking requirements such as guaranteed available bandwidth or fast restoration. MPLS Traffic Engineering exploits modern label switching techniques to build guaranteed bandwidth end-to-end tunnels through an IP/MPLS network of label switched routers (LSRs). These tunnels are a type of label switched path (LSP) and thus are generally referred to as MPLS Traffic Engineering (TE) LSPs. Examples of MPLS TE can be found in RFC 3209, entitled RSVP-TE: Extensions to RSVP for LSP Tunnels dated December 2001, RFC 3784 entitled Intermediate-System-to-Intermediate-System (IS-IS) Extensions for Traffic Engineering (TE) dated June 2004, and RFC 3630, entitled Traffic Engineering (TE) Extensions to OSPF Version 2 dated September 2003, the contents of all of which are hereby incorporated by reference in their entirety.
Establishment of an MPLS TE-LSP from a head-end LSR to a tail-end LSR involves computation of a path through a network of LSRs. Optimally, the computed path is the “shortest” path, as measured in some metric, that satisfies all relevant LSP Traffic Engineering constraints such as e.g., required bandwidth, “affinities” (administrative constraints to avoid or include certain links), etc. Path computation can either be performed by the head-end LSR or by some other entity operating as a path computation element (PCE) not co-located on the head-end LSR. The head-end LSR (or a PCE) exploits its knowledge of network topology and resources available on each link to perform the path computation according to the LSP Traffic Engineering constraints. Various path computation methodologies are available including CSPF (constrained shortest path first). MPLS TE-LSPs can be configured within a single domain, e.g., area, level, or AS, or may also span multiple domains, e.g., areas, levels, or ASes.
The PCE is an entity having the capability to compute paths between any nodes of which the PCE is aware in an AS or area. PCEs are especially useful in that they are more cognizant of network traffic and path selection within their AS or area, and thus may be used for more optimal path computation. A head-end LSR may further operate as a path computation client (PCC) configured to send a path computation request to the PCE, and receive a response with the computed path, which potentially takes into consideration other path computation requests from other PCCs. It is important to note that when one PCE sends a request to another PCE, it acts as a PCC. PCEs conventionally have limited or no visibility outside of their surrounding area(s), level(s), or AS. A PCC can be informed of a PCE either by pre-configuration by an administrator, or by a PCE Discovery (PCED) message (“advertisement”), which is sent from the PCE within its area or level or across the entire AS to advertise its services.
One difficulty that arises in crossing domain boundaries is that path computation at the head-end LSR requires knowledge of network topology and resources across the entire network between the head-end and the tail-end LSRs. Yet service providers typically do not share this information with each other across domain borders. In particular, network topology and resource information do not generally flow across area boundaries even though a single service provider may operate all the areas. Neither the head-end LSR nor any single PCE will generally have sufficient knowledge to compute a path where the LSR or PCE may not have the required knowledge should the destination not reside in a directly attached domain. Because of this, MPLS Traffic Engineering path computation techniques are required to compute inter-domain TE-LSPs.
In order to extend MPLS TE-LSPs across domain boundaries, the use of PCEs may be configured as a distributed system, where multiple PCEs collaborate to compute an end-to-end path (also referred to as “Multi-PCE path computation”). An example of such a distributed PCE architecture is described in commonly-owned co-pending U.S. patent application Ser. No. 10/767,574, entitled COMPUTING INTER-AUTONOMOUS SYSTEM MPLS TRAFFIC ENGINEERING LSP PATHS, filed by Vasseur et al., on Sep. 18, 2003, the contents of which are hereby incorporated by reference in its entirety. In a distributed PCE architecture, the visibility needed to compute paths is extended between adjacent domains so that PCEs may cooperate to compute paths across multiple domains by exchanging virtual shortest path trees (VSPTs) while preserving confidentiality across domains (e.g., when applicable to ASes).
Some applications may incorporate unidirectional data flows configured to transfer time-sensitive traffic from a source (sender) in a computer network to a destination (receiver) in the network in accordance with a certain “quality of service” (QoS). Here, network resources may be reserved for the unidirectional flow to ensure that the QoS associated with the data flow is maintained. The Resource ReSerVation Protocol (RSVP) is a network-control protocol that enables applications to reserve resources in order to obtain special QoS for their data flows. RSVP works in conjunction with routing protocols to, e.g., reserve resources for a data flow in a computer network in order to establish a level of QoS required by the data flow. RSVP is defined in R. Braden, et al., Resource ReSerVation Protocol (RSVP), RFC 2205. In the case of traffic engineering applications, RSVP signaling is used to establish a TE-LSP and to convey various TE-LSP attributes to routers, such as border routers, along the TE-LSP obeying the set of required constraints whose path may have been computed by various means.
Layer-3 network-based VPN services that utilize MPLS technology are often deployed by network service providers for one or more customer sites. These networks are typically said to provide “MPLS/VPN” services. As used herein, a customer site is broadly defined as a routing domain containing at least one customer edge device (CE) coupled to a provider edge device (PE) in the service provider's network (“provider network”). The customer site may be multi-homed to the provider network, i.e., wherein one or more of the customer's CEs is coupled to a plurality of PEs, thus providing a redundant connection. The PEs and CEs are generally intermediate network nodes, such as routers or switches, located at the edges of their respective networks. PE-CE links may be established over various physical media, such as conventional wire links, optical links, wireless links, etc., and may communicate data formatted using various network communication protocols including ATM, Frame Relay, Ethernet, Fibre Distributed Data Interface (FDDI), etc. In addition, the PEs and CEs may be configured to exchange routing information over their respective PE-CE links in accordance with various interior and exterior gateway protocols, such as BGP, OSPF, IS-IS, RIP, etc.
In the traditional MPLS/VPN network architecture, each customer site may participate in one or more different VPNs. Most often, each customer site is associated with a single VPN, and hereinafter the illustrative embodiments will assume a one-to-one correspondence between customer sites and VPNs. For example, customer sites owned or managed by a common administrative entity, such as a corporate enterprise, may be statically assigned to the enterprise's VPN. As such, network nodes situated in the enterprise's various customer sites participate in the same VPN and are therefore permitted to securely communicate with one another via the provider network. In other words, the provider network establishes the necessary LSPs to interconnect the customer sites participating in the enterprise's VPN. Likewise, the provider network also may establish LSPs that interconnect customer sites participating in other VPNs. This widely-deployed MPLS/VPN architecture is generally described in more detail in Chapters 8-9 of the reference book entitled MPLS and VPN Architecture, Volume 1, by I. Pepelnjak et al., published 2001 and in the IETF publication RFC 2547, entitled BGP/MPLS VPNs, by E. Rosen et al., published March 1999, each of which is hereby incorporated by reference as though fully set forth herein.
One problem associated with MPLS/VPN networks is their current inability to distribute TE information regarding PE-CE links across the provider network to other PEs. Traffic Engineering (TE), generally, refers to utilizing TE information to engineer (compute, determine, detect, etc.) traffic, such as for computing paths, creating TE-LSPs (e.g., MPLS TE-LSPs), load-balancing traffic, etc., as will be understood by those skilled in the art. Examples of TE information, e.g., as described in above-referenced RFCs 3784 and 3630, comprise, inter alia, the dynamically measured IP bandwidth, reservable MPLS bandwidth, unreserved bandwidth, administrative group (color), TE metric, etc.
One solution for distributing static link bandwidth of PE-CE links (or, more generally, an AS exit link) has been described in the document entitled BGP Link Bandwidth, published by Cisco Systems, Inc., March 2005, which is hereby incorporated by reference as though fully set forth herein. Here, the static link bandwidth of the PE-CE link (i.e., the maximum link capacity) may be advertised to BGP neighbors (e.g., other PEs). However, this solution does not provide TE information of the PE-CE links, such as, e.g., the dynamically measured IP bandwidth, reserved MPLS bandwidth, color, etc. of the PE-CE links.
Another solution to distribute TE information of PE-CE links is to leak the information into the provider network (the “core”), such as through IGP messages. This solution suffers numerous problems, however, such as VPN private addressing constraints, i.e., where CEs of different VPNs may share the same address, which may cause route confusion at receiving devices. Also, a lack of scalability may exist considering the possible number of PE-CE links (e.g., hundreds of thousands), which may surpass the limitations of internal route leaking (e.g., of IGP messages), thus possibly causing fragmented messages, error messages, etc., as will be understood by those skilled in the art. This lack of scalability may also apply to attempts to manually configure TE information, which would be overly cumbersome given the dynamic nature of TE information.
As a result of the inability to efficiently distribute dynamic TE information, various TE techniques may not be applied to the PE-CE links from other PEs not attached to the PE-CE links. In particular, TE techniques may not be applied to paths from one CE to another CE across the provider network (“CE-CE paths”). For example, current MPLS/VPN networks do not have the ability to efficiently create TE tunnels (e.g., MPLS TE-LSPs) along CE-CE paths, such as for reserved bandwidth, fast convergence, fast re-route (FRR), diverse paths, etc. Service providers and customers often desire to have these and other TE techniques/benefits applied to PE-CE links and CE-CE paths in their provider/customer network (e.g., MPLS/VPN networks), such as for backup data centers, voice over IP (VoIP) traffic (e.g., C4 switches to carry legacy voice traffic), etc.
There remains a need, therefore, for a technique that expands the TE topology of a provider/customer network (e.g., an MPLS/VPN network) to include the TE information of PE-CE links, such that various TE techniques may be applied to the network. There also remains a need for a technique for creating TE-LSPs along CE-CE paths, and for applying PCE techniques to MPLS/VPNs, generally.