A computer network is a geographically distributed collection of interconnected subnetworks, such as local area networks (LAN), that transport data between network nodes. As used herein, a network node is any device adapted to send and/or receive data in the computer network. Thus, in the context of this disclosure, the terms “node” and “device” may be used interchangeably. The network topology is defined by an arrangement of network nodes that communicate with one another, typically through one or more intermediate network nodes, such as routers and switches. In addition to intra-network communications between network nodes located in the same network, data also may be exchanged between nodes located in different networks. To that end, an “edge device” located at the logical outer-bound of a first computer network may be adapted to send and receive data with an edge device situated in a neighboring (i.e., adjacent) network. Inter-network and intra-network communications are typically effected by exchanging discrete packets of data according to predefined protocols. In this context, a protocol consists of a set of rules defining how network nodes interact with each other.
Each data packet typically comprises “payload” data prepended (“encapsulated”) by at least one network header formatted in accordance with a network communication protocol. The network headers include information that enables network nodes to efficiently route the packet through the computer network. Often, a packet's network headers include a data-link (layer 2) header, an internetwork (layer 3) header and a transport (layer 4) header as defined by the Transmission Control Protocol/Internet Protocol (TCP/IP) Reference Model. The TCP/IP Reference Model is generally described in more detail in Section 1.4.2 of the reference book entitled Computer Networks, Fourth Edition, by Andrew Tanenbaum, published 2003, which is hereby incorporated by reference as though fully set forth herein.
A data packet may originate at a source node and subsequently “hop” from node to node along a logical data path until it reaches its destination. The network addresses defining the logical data path of a data flow are most often stored as Internet Protocol (IP) addresses in the packet's internetwork header. IP addresses are typically formatted in accordance with the IP Version 4 (IPv4) protocol, in which network nodes are addressed using 32 bit (four byte) values. Although IPv4 is prevalent in most networks today, IP Version 6 (IPv6) has been introduced to increase the length of an IP address to 128 bits (16 bytes), thereby increasing the number of available IP addresses. Typically, a network or subnetwork is allocated a predetermined set of IP addresses which may be assigned to network nodes situated within that network or subnetwork. Here, a subnetwork is a subset of a larger computer network, and thus network nodes in the subnetwork may be configured to communicate with nodes located in other subnetworks.
A subnet mask may be used to select a set of contiguous high-order bits from IP addresses within a subnetwork's allotted address space. A subnet mask length indicates the number of contiguous high-order bits selected by the subnet mask, and a subnet mask length of N bits is hereinafter represented as /N. The subnet mask length for a given subnetwork is typically selected based on the number of bits required to distinctly address nodes in that subnetwork. As used herein, an “address prefix” is defined as the result of applying a subnet mask to a network address, such as an IP address. An address prefix therefore specifies a range of network addresses in a subnetwork, and in IPv4 a /32 address prefix corresponds to a particular network address. A “route” is defined herein as an address prefix and its associated path attributes. The path attributes generally include any information that characterizes the address prefix, and may include various protocol-specific attributes, such as conventional Border Gateway Protocol attributes.
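By way of illustration only, the relationship between a network address, a subnet mask length and the resulting address prefix may be sketched with Python's standard ipaddress module. The helper name address_prefix and the sample addresses (taken from a documentation address range) are assumptions introduced for this sketch and do not appear in the disclosure.

```python
import ipaddress


def address_prefix(address: str, mask_length: int) -> ipaddress.IPv4Network:
    """Apply a subnet mask of mask_length bits to an IPv4 address."""
    # strict=False permits non-zero low-order bits; they are masked off,
    # leaving only the contiguous high-order bits selected by the mask.
    return ipaddress.ip_network(f"{address}/{mask_length}", strict=False)


prefix = address_prefix("192.0.2.77", 24)
print(prefix)                # 192.0.2.0/24
print(prefix.netmask)        # 255.255.255.0
print(prefix.num_addresses)  # 256 addresses fall within this prefix

host = address_prefix("192.0.2.77", 32)
print(host.num_addresses)    # 1: a 32-bit mask identifies a single address
```

In this sketch a mask length of 24 bits yields a prefix covering 256 addresses, whereas a mask length of 32 bits identifies exactly one address.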
Interior Gateway Protocols (IGP)
A computer network may contain smaller groups of one or more subnetworks which may be managed as separate autonomous systems. As used herein, an autonomous system (AS) is broadly construed as a collection of interconnected network nodes under a common administration. Often, the AS is managed by a single administrative entity, such as a company, an academic institution or a branch of government. For instance, the AS may operate as an enterprise network, a service provider or any other type of network or subnetwork. Each AS is typically assigned a unique identifier, such as a unique AS number, that identifies the AS among a plurality of ASes in a computer network.
An AS may contain one or more edge devices (or “autonomous system border routers” (ASBR)), having peer connections to other edge devices located in adjacent networks or subnetworks. Thus, packets enter or exit the AS through an appropriate ASBR. The AS may be logically partitioned into a plurality of different “routing areas.” Each routing area includes a designated set of network nodes that are configured to share routing and topology information. As such, the network nodes in a routing area share a consistent “view” of the network topology. Since consistent sets of intra-area, inter-area and inter-AS routing information are usually distributed among network nodes in an AS, the nodes can calculate consistent sets of “best paths” through the AS, e.g., using conventional shortest path first (SPF) calculations or other routing computations. A calculated best path corresponds to a preferred data path for transporting data between a pair of source and destination nodes. The best path may be an intra-area, inter-area or inter-AS data path, depending on the locations of the source and destination nodes.
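For illustration, a shortest path first calculation of the general kind mentioned above may be sketched using Dijkstra's algorithm. The four-router topology, the link costs and the function names below are hypothetical and are chosen only to show how nodes holding the same topology information arrive at the same best path.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm; graph maps node -> {neighbor: link cost}."""
    dist = {source: 0}
    prev = {}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return dist, prev


def best_path(prev, source, destination):
    """Walk the predecessor map back from destination to source."""
    path = [destination]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical topology: every node holds this same view of the network.
topology = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R4": 1},
    "R3": {"R1": 1, "R4": 5},
    "R4": {"R2": 1, "R3": 5},
}
dist, prev = shortest_paths(topology, "R1")
print(dist["R2"], best_path(prev, "R1", "R2"))  # 7 ['R1', 'R3', 'R4', 'R2']
```

Here the direct R1-R2 link (cost 10) is not preferred, because the path through R3 and R4 has a lower total cost of 7.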
Area border devices, such as area border routers (ABR), are located at the logical border of two or more routing areas. Accordingly, each ABR device participates in multiple routing areas and typically maintains a separate set of routing and topology information for each adjacent routing area in which it participates. Each network node in a routing area typically maintains its own link-state database (LSDB). The LSDB is configured to store topology information advertised within the node's routing area. Because an ABR (by definition) participates in multiple routing areas, each ABR maintains a separate LSDB for each of its routing areas.
Network nodes located in the same routing area generally exchange routing information and network-topology information using an “interior gateway” routing protocol (IGP), such as a link-state protocol. An example of a conventional link-state protocol is the Open Shortest Path First (OSPF) protocol, which is described in more detail in Request for Comments (RFC) 2328, entitled OSPF Version 2, dated April 1998, which is publicly available through the Internet Engineering Task Force (IETF) and is hereby incorporated by reference in its entirety.
OSPF employs conventional link-state advertisements (LSA) for exchanging routing and topology information between a set of interconnected intermediate network nodes, i.e., routers and switches. In fact, different types of LSAs may be used to communicate the routing and topology information. For example, the OSPF version 2 specification (RFC 2328) defines the following types of LSAs: Router, Network, Summary and AS-External LSAs. Router and Network LSAs are used to propagate link information within a routing area. Specifically, Router LSAs advertise router-interface links (i.e., links connected to routers) and their associated cost values, whereas Network LSAs advertise network-interface links (i.e., links connected to subnetworks) and their associated cost values within the routing area.
Summary and AS-External LSAs are used to disseminate routing information between routing areas. The Summary LSA is typically generated by an ABR and is used to advertise intra-AS (“internal”) routes between routing areas. First, the ABR receives various LSAs that are advertised in a first routing area. The ABR “summarizes” the advertised routes by aggregating routes where possible. Next, the ABR stores the summarized routes in a Summary LSA, which it then advertises in a second routing area. In this way, nodes in the second area are made aware of routes in the first routing area that can be reached through the ABR. An AS-External LSA stores a list of reachable inter-AS (“external”) routes, i.e., located outside of the AS. The AS-External LSA is typically generated by an ASBR and is propagated throughout the AS to identify which external routes can be reached through the advertising ASBR. Unlike Summary LSAs, routes stored in an AS-External LSA are generally not aggregated.
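The aggregation step may be illustrated, purely as a sketch, with the standard ipaddress module. The function collapse_addresses merges adjacent prefixes automatically; an actual ABR would typically aggregate according to its configured address ranges, so the automatic merging shown here is an assumption made for brevity, as are the function name summarize and the sample prefixes.

```python
import ipaddress


def summarize(routes):
    """Aggregate contiguous prefixes into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(route) for route in routes]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


# Hypothetical routes learned by an ABR in a first routing area.
area1_routes = [
    "10.1.0.0/24",
    "10.1.1.0/24",
    "10.1.2.0/24",
    "10.1.3.0/24",
    "10.2.0.0/24",
]
print(summarize(area1_routes))  # ['10.1.0.0/22', '10.2.0.0/24']
```

The four contiguous /24 prefixes are advertised into the second routing area as a single /22 prefix, while the non-adjacent prefix is advertised unchanged.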
Opaque LSAs provide an extensible LSA format for use with the OSPF protocol and are generally described in more detail in the IETF publication RFC 2370, entitled The OSPF Opaque LSA Option, published July 1998, by R. Coltun, which publication is hereby incorporated by reference as though fully set forth herein. As described in RFC 2370, opaque LSAs may be advertised (“flooded”) between network nodes (link-scope), within a routing area (area-scope) or throughout an AS (AS-scope). While the conventional Router, Network, Summary and AS-External LSAs are constrained by their respective formats set forth in the OSPF protocol specification (RFC 2328), opaque LSAs are generally more flexible in what information they can transport. For instance, an opaque LSA may be configured to store one or more type-length-value (TLV) tuples containing selected OSPF attributes associated with routes advertised in the opaque LSA.
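A type-length-value tuple of the kind described above may be sketched as follows. The byte layout used here (a 16-bit type, a 16-bit length and a value padded to a four-byte boundary) is an assumption made for illustration, and the type codes are hypothetical; the governing specification for a particular opaque LSA should be consulted for the actual encoding.

```python
import struct


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV: 16-bit type, 16-bit length, value, zero padding."""
    padding = (-len(value)) % 4  # pad the value to a 4-byte boundary
    header = struct.pack("!HH", tlv_type, len(value))
    return header + value + b"\x00" * padding


def decode_tlvs(data: bytes):
    """Decode a byte string holding zero or more consecutive TLVs."""
    tlvs = []
    offset = 0
    while offset + 4 <= len(data):
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        start = offset + 4
        tlvs.append((tlv_type, data[start:start + length]))
        offset = start + length + ((-length) % 4)  # skip value and padding
    return tlvs


# Hypothetical type codes 1 and 2, used only for this example.
payload = encode_tlv(1, b"\x00\x00\x00\x2a") + encode_tlv(2, b"abc")
print(len(payload))          # 16
print(decode_tlvs(payload))  # [(1, b'\x00\x00\x00*'), (2, b'abc')]
```

Because each tuple carries its own length, a receiving node in this sketch can step over a TLV whose type it does not recognize, which is what makes the format extensible.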
The Internet Draft publication <draft-mirtorabi-ospf-tag-01.txt>, entitled Extensions to OSPFv2 for Advertising Optional Route/Link Attributes, published August 2005 by S. Mirtorabi et al., which publication is publicly available through the IETF and is hereby incorporated by reference in its entirety, describes an OSPF Router Attributes (RA) Opaque LSA that may be used to transport at least one Inter-Area/External Route Attribute TLV (RA-TLV). The RA-TLV may contain one or more route attributes that are encoded as sub-TLVs within the RA-TLV. Currently, the RA-TLV is only used to transport sub-TLVs containing OSPF tags, extended tags and multi-topology identifiers associated with OSPF routes advertised in the RA-Opaque LSA.
PE-CE Network Topology
A virtual private network (VPN) is a collection of network nodes that establish private communications over a shared backbone network. Previously, VPNs were implemented by embedding private leased lines in the shared network. The leased lines (i.e., communication links) were reserved only for network traffic among those network nodes participating in the VPN. Today, the above-described VPN implementation has been mostly replaced by private “virtual circuits” deployed in public networks. Specifically, each virtual circuit defines a logical end-to-end data path between a pair of network nodes participating in the VPN.
Network nodes belonging to the same VPN may be situated in different subnetworks, or “customer sites.” Each customer site may participate in one or more different VPNs, although most often each customer site is associated with a single VPN, and hereinafter the illustrative embodiments will assume a one-to-one correspondence between customer sites and VPNs. For example, customer sites owned or managed by a common administrative entity, such as a corporate enterprise, may be statically assigned to the enterprise's VPN. As such, network nodes situated in the enterprise's various customer sites participate in the same VPN and are therefore permitted to securely communicate with one another.
The customer sites typically communicate with one another through a service provider network (“provider network”). The provider network is an AS that functions as a backbone network through which VPN information may be exchanged between customer sites. The provider network may include both provider edge (PE) devices which function as ASBRs at the logical outer edge of the provider network, as well as provider (P) devices situated within the interior (“core”) of the provider network. Accordingly, each customer site contains at least one customer edge (CE) device coupled to a PE device in the provider network. The customer site may be multi-homed to the provider network, i.e., wherein one or more of the customer's CE devices is coupled to a plurality of PE devices. The PE-CE data links may be established over various physical media, such as conventional wire links, optical links, wireless links, etc., and may communicate data formatted using various network communication protocols including ATM, Frame Relay, Ethernet, Fiber Distributed Data Interface (FDDI), etc.
In a popular VPN deployment, provider networks often provide the customer sites with layer-3 network-based VPN services that utilize IP and/or Multi-Protocol Label Switching (MPLS) technologies. These networks are typically said to provide “MPLS/VPN” services. This widely-deployed MPLS/VPN architecture is generally described in more detail in the IETF publication RFC 2547, entitled BGP/MPLS VPNs, by E. Rosen et al., published March 1999, which is hereby incorporated by reference as though fully set forth herein.
Most typically, PE and CE devices are configured to exchange routing information over their respective PE-CE data links in accordance with the Border Gateway Protocol (BGP). The BGP protocol is well known and described in detail in RFC 1771 by Y. Rekhter and T. Li, entitled A Border Gateway Protocol 4 (BGP-4), dated March 1995, which publication is hereby incorporated by reference as though fully set forth herein. A variation of the BGP protocol, known as internal BGP (iBGP), is often used to distribute routing and reachability information between PE devices in the provider network. To implement iBGP, the PE devices must be “fully meshed,” such that each PE device is coupled to every other PE device, e.g., by way of a Transmission Control Protocol (TCP) connection. Those skilled in the art will understand that the fully-meshed PE devices may be directly connected or may be otherwise coupled, e.g., by one or more conventional BGP route reflectors.
BGP-enabled PE and CE devices perform various routing functions, including transmitting and receiving BGP messages and rendering routing decisions based on BGP routing policies. Each BGP-enabled device maintains a local BGP routing table that lists feasible routes to reachable (i.e., accessible) network nodes and subnetworks. The BGP table also may associate one or more BGP attributes with each route that it stores. For example, a conventional BGP AS-path attribute may be associated with a BGP route so as to identify a particular AS path that may be used for reaching that route. Typically, the AS path is represented as an ordered sequence of AS numbers corresponding to which ASes must be traversed in order to reach the route's associated node or subnetwork.
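The following sketch shows one possible in-memory representation of such a BGP table. The class names, the next-hop labels, the documentation-range prefix and the private-use AS numbers are all hypothetical, and selecting the route with the shortest AS path is only one step of a full BGP decision process, used here as a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class BgpRoute:
    prefix: str
    next_hop: str
    as_path: tuple  # ordered AS numbers that must be traversed


@dataclass
class BgpTable:
    routes: dict = field(default_factory=dict)  # prefix -> feasible routes

    def add(self, route: BgpRoute) -> None:
        self.routes.setdefault(route.prefix, []).append(route)

    def best(self, prefix: str):
        """Simplified selection: prefer the shortest AS path."""
        candidates = self.routes.get(prefix, [])
        return min(candidates, key=lambda r: len(r.as_path), default=None)


table = BgpTable()
table.add(BgpRoute("203.0.113.0/24", "PE2", (64500, 64501, 64502)))
table.add(BgpRoute("203.0.113.0/24", "PE3", (64510,)))
print(table.best("203.0.113.0/24").next_hop)  # PE3
```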
Although BGP is most often executed over PE-CE data links, other protocols also may be used to exchange routing and topology information between a customer-site CE device and a provider-network PE device. For instance, the Internet Draft publication <draft-ietf-l3vpn-ospf-2547-05.txt>, entitled OSPF as the Provider/Customer Edge Protocol for BGP/MPLS IP VPNs, published November 2005 by Rosen et al., which publication is publicly available through the IETF and is hereby incorporated by reference in its entirety, describes an implementation in which OSPF is executed over a PE-CE link. In this case, the PE device functions as an ABR for the customer site containing the CE device, and thus the PE device maintains both an OSPF LSDB containing the customer site's IGP topology information and a BGP table containing BGP routes that have been distributed, e.g., via iBGP, within the provider network.
Routing Loops
Routing protocols, such as OSPF and BGP, typically perform “best path” computations for selecting a preferred data path for transporting data to a destination node or subnetwork. Therefore, it is possible that two or more networks or subnetworks may select each other as the best path to reach a certain destination. In such a scenario, a “routing loop” can develop where data addressed to that destination is circulated among the two or more networks or subnetworks and may never actually reach its intended recipient. An example of a conventional routing loop is illustrated in FIG. 1.
FIG. 1 illustrates an exemplary network 100 including a provider network AS1 110 coupled to two customer sites 120 and 130 (labeled “A” and “B,” respectively). Here, the customer sites A and B participate in the same VPN, e.g., VPN1, and therefore communicate with one another through the provider network 110. As shown, the customer site 120 includes CE devices 125a and 125b (CE1 and CE2) which are coupled to respective PE devices 115a and 115b (PE1 and PE2) in the provider network. In addition, the customer site 130 includes a CE device 135c (CE3) which is coupled to a PE device 115c (PE3) in the provider network.
Suppose that CE3 advertises a message over the PE3-CE3 data link indicating that CE3 can reach the destination prefix “X.” The advertised prefix is received by PE3, which in turn distributes the advertised prefix, e.g., in an iBGP update message, to the devices PE1 and PE2 in the provider network. After PE1 receives the iBGP advertisement, PE1 advertises the prefix X over the PE1-CE1 data link, thereby signaling to nodes in the customer site 120 that the prefix X can be reached via PE1. The prefix X is then distributed within the customer site 120 using an appropriate IGP protocol. CE2 may advertise over the PE2-CE2 data link that it can reach the prefix X. In response to receiving CE2's advertisement, PE2 may distribute this reachability information to the provider-edge devices PE1 and PE3. Although FIG. 1 illustrates the prefix X being advertised along the sequential data flow CE3-PE3-PE1-CE1-CE2-PE2-PE1, the prefix also may be advertised along a similar loop (not shown) CE3-PE3-PE2-CE2-CE1-PE1-PE2.
As a result of the above-noted advertisements, network nodes in the customer site 120 become aware that the prefix X is reachable through PE1, and PE1 becomes aware that the prefix X can be reached via PE2 or PE3. In this case, a routing loop may develop if the best-path calculations performed at PE1 determine that data addressed to the destination prefix X should be routed to the customer site 120 via PE2, instead of correctly routing the data through PE3 to the customer site 130. Thus, the data addressed to the prefix X may be passed back and forth between AS1 and the customer site 120, e.g., around the routing loop CE1-PE1-PE2-CE2-CE1.
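The routing loop described above may be modeled, as a rough sketch, by recording each node's chosen next hop toward the prefix X and following those next hops until either the destination or a previously visited node is reached. The function name trace_path and the per-node next-hop maps are assumptions made for this example and do not represent any particular routing implementation.

```python
def trace_path(next_hops, start, destination):
    """Follow next hops toward destination; report whether a loop occurs."""
    path = [start]
    while path[-1] != destination:
        nxt = next_hops[path[-1]]
        if nxt in path:
            return path + [nxt], True  # revisited a node: routing loop
        path.append(nxt)
    return path, False


# PE1's best-path calculation selects PE2 for the prefix X.
looping = {"CE1": "PE1", "PE1": "PE2", "PE2": "CE2", "CE2": "CE1", "PE3": "CE3"}
print(trace_path(looping, "CE1", "CE3"))
# (['CE1', 'PE1', 'PE2', 'CE2', 'CE1'], True)

# PE1 instead selects PE3, which leads to the customer site 130.
correct = dict(looping, PE1="PE3")
print(trace_path(correct, "CE1", "CE3"))
# (['CE1', 'PE1', 'PE3', 'CE3'], False)
```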
One solution for preventing routing loops where OSPF is executed over the PE-CE data links is described in the above-incorporated Internet Draft publication <draft-ietf-l3vpn-ospf-2547-05.txt>, entitled OSPF as the Provider/Customer Edge Protocol for BGP/MPLS IP VPNs. This proposed solution relies on an OSPF route tag for identifying when one or more advertised routes have already been advertised from a PE device to a CE device. Accordingly, when an LSA containing an advertised route and a corresponding OSPF route tag is received at a PE device, that PE device can identify the route tag and determine that the received route was previously advertised by a PE device in the provider network. Based on this determination, the PE device can conclude that the received route should not be propagated again through the provider network.
This known OSPF route-tag solution suffers the disadvantage that it is not applicable when Summary LSAs are exchanged over PE-CE data links. More specifically, Summary LSAs are not formatted in a manner that enables them to transport the OSPF route tags. Instead, the route tags are typically transported in AS-External LSAs which carry external routing information. Thus, this solution is generally undesirable since it precludes the use of Summary LSAs for advertising internal routes over PE-CE links and therefore does not permit conventional route aggregation techniques that are traditionally employed for reducing the number of routes processed in an OSPF routing area. Also, as will be understood by those skilled in the art, the OSPF external route tag solution does not apply to multi-homed networks.
Yet another solution for preventing routing loops where OSPF is executed over the PE-CE data links is described in the Internet Draft publication <draft-ietf-ospf-2547-dnbit-04.txt>, entitled Using an LSA Options Bit to Prevent Looping in BGP/MPLS IP VPNs, published March 2004 by Rosen et al., which publication is publicly available through the IETF and is hereby incorporated by reference as though fully set forth herein. This solution proposes using the most-significant bit, i.e., the “DN” bit, in the conventional LSA-options field to indicate when an OSPF LSA has been advertised from a PE device to a CE device. Because every LSA transports the LSA-options field, this DN-bit solution is not limited to only AS-External LSAs. When a PE device receives an LSA whose DN bit is “set,” the routing information transported in the received LSA is excluded from the PE device's SPF calculation (e.g., the LSAs are not stored in the OSPF LSDB). As such, the LSA's advertised routes are not installed in the PE device's routing table. In this way, the uninstalled routes are not redistributed into the provider network's BGP tables, thereby ensuring that routing loops cannot develop between the provider network and the customer site containing the CE device.
FIG. 2 illustrates the exemplary network 100 in which the DN-bit solution is deployed for preventing routing loops. First, CE3 advertises the prefix X over the PE3-CE3 data link. The advertised prefix X is received by PE3, which in turn advertises the prefix, e.g., in an iBGP update message, to the devices PE1 and PE2. After PE1 receives the iBGP advertisement, PE1 advertises a conventional OSPF LSA containing the prefix X over the PE1-CE1 data link. However, according to this DN-bit solution, PE1 sets the DN-bit in the advertised LSA to indicate that the prefix X is reachable through the provider network. The LSA, with its DN bit set, is distributed throughout the customer site 120. CE2 may forward the LSA back to the provider network 110 over the PE2-CE2 link. However, because the DN-bit is set in the LSA, PE2 can determine that the LSA was generated by another PE device (PE1) in the provider network. Upon making this determination, PE2 does not install the prefix X in its routing table, thereby preventing any potential routing loops from developing between the customer site 120 and the provider network 110. That is, network nodes in customer site 120 are aware that prefix X can be reached via PE1, and PE1 is only aware that prefix X can be reached via PE3.
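The DN-bit handling described above may be sketched as two small operations on the LSA-options field. The sketch assumes an 8-bit options field, so that the most-significant bit has the value 0x80; the function names and the other option bit used in the example are hypothetical.

```python
DN_BIT = 0x80  # most-significant bit of an assumed 8-bit LSA-options field


def pe_advertise_to_ce(options: int) -> int:
    """PE side: set the DN bit before sending an LSA over a PE-CE link."""
    return options | DN_BIT


def pe_accepts_from_ce(options: int) -> bool:
    """PE side: use an LSA received from a CE only if its DN bit is clear."""
    return not (options & DN_BIT)


# PE1 advertises the prefix X to CE1 with the DN bit set.
lsa_options = pe_advertise_to_ce(0x02)  # 0x02: some other, unrelated option bit
print(hex(lsa_options))  # 0x82

# The LSA is flooded unchanged through the customer site and reaches PE2.
pe2_routing_table = set()
if pe_accepts_from_ce(lsa_options):
    pe2_routing_table.add("X")
print(pe2_routing_table)  # set(): PE2 does not install the prefix X
```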
Although this conventional DN-bit solution for preventing routing loops works well in many network topologies, it may suffer various problems in topologies having multiple provider networks (ASes) that are not configured to directly communicate with one another, e.g., because of contractual terms or lack of network connectivity. For instance, consider the exemplary network 300 shown in FIG. 3. Here, the provider network AS1 310 is coupled to the customer sites 330, 340 and 350 (labeled “A,” “B” and “C” respectively) which participate in the same VPN, e.g., VPN1. In addition, the customer sites 340 and 350 are also coupled to a second provider network AS2 320. In this case, the provider network AS1 may function as a primary Internet service provider (ISP), whereas the provider network AS2 functions as a backup ISP through which the customer sites 330-350 may communicate in the event that a PE-CE link to AS1 fails. Notably, AS1 and AS2 are not configured to communicate directly with one another.
As shown, the customer site 330 includes a CE device 335a (CE1) which is coupled to a PE device 315a (PE1) located in AS1. In addition, AS1 also includes a PE device 315b (PE2) coupled to a CE device 345b (CE2) situated in the customer site 340, as well as to a PE device 315c (PE3) coupled to a CE device 355c (CE3) located in the customer site 350. Also, a CE device 345d (CE4) in the customer site 340 is coupled to a PE device 325d (PE4) in AS2, and a CE device 355e (CE5) in the customer site 350 is coupled to a PE device 325e (PE5) in AS2. Further assume that each of the PE-CE data links is configured to execute OSPF.
In this illustrative topology, CE1 may advertise to PE1 that the prefix X can be reached via CE1. In response, PE1 propagates the prefix X, e.g., in iBGP update messages, to the devices PE2 and PE3. Next, PE2 and PE3 each may advertise an LSA containing the prefix X to the customer-edge devices CE2 and CE3. In accordance with the conventional DN-bit technique, the LSAs advertised over the PE2-CE2 and PE3-CE3 data links have their DN bits set to a predetermined value so as to indicate that the prefix X is reachable through a PE device. When the LSAs are forwarded over the PE4-CE4 and PE5-CE5 data links, the provider-edge devices PE4 and PE5 notice that the DN bits are set in the received LSAs and, consequently, exclude the prefix X from their OSPF and BGP routing tables. As a result, the customer sites 340 and 350 do not learn that the prefix X can be reached via the ISP AS2. In other words, the backup connectivity is “broken” for the backup ISP AS2, since the customer sites 340 and 350 are only made aware that the prefix X can be reached through the primary ISP AS1.
For example, in the event that the PE2-CE2 link fails, the customer site 340 is not aware that the prefix X can alternatively be reached through AS2, e.g., via the backup data path CE4-PE4-PE5-CE5-CE3-PE3-PE1-CE1. Similarly, if the PE3-CE3 data link fails, the customer site 350 is not aware that the prefix X can be reached through AS2, e.g., via the backup data path CE5-PE5-PE4-CE4-CE2-PE2-PE1-CE1.
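The limitation illustrated by FIG. 3 may be sketched as follows. Because a single DN bit carries no indication of which provider network set it, a PE device in AS2 cannot distinguish an LSA originated by AS1 from one originated by its own provider network. The classes and method names below are hypothetical, and the 8-bit options field is again an assumption.

```python
from dataclasses import dataclass

DN_BIT = 0x80  # most-significant bit of an assumed 8-bit LSA-options field


@dataclass(frozen=True)
class Lsa:
    prefix: str
    options: int = 0


class ProviderEdge:
    def __init__(self, name: str, provider: str):
        self.name = name
        self.provider = provider  # provider network containing this PE
        self.routes = set()

    def advertise_to_ce(self, prefix: str) -> Lsa:
        """Advertise a prefix over a PE-CE link with the DN bit set."""
        return Lsa(prefix, DN_BIT)

    def receive_from_ce(self, lsa: Lsa) -> bool:
        """Ignore any LSA whose DN bit is set, whichever provider set it."""
        if lsa.options & DN_BIT:
            return False
        self.routes.add(lsa.prefix)
        return True


pe2 = ProviderEdge("PE2", "AS1")
pe4 = ProviderEdge("PE4", "AS2")

lsa = pe2.advertise_to_ce("X")       # PE2 -> CE2, flooded within site 340
accepted = pe4.receive_from_ce(lsa)  # CE4 -> PE4 in the backup provider
print(accepted, pe4.routes)          # False set()
```

In this sketch PE4 discards the prefix X even though the LSA did not originate in AS2, so AS2 never learns a route that it could otherwise offer as a backup path.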
In networks having multiple provider networks that are not configured to communicate with one another, as shown in FIG. 3, it is generally desirable to implement a routing-loop prevention technique that does not break the backup connectivity of the topology. The technique should not be limited to AS-External LSAs sent over PE-CE links and instead should be operable with any type of OSPF LSA sent over a PE-CE data link.