Data communication in a computer network involves the exchange of data between two or more entities interconnected by communication links and subnetworks (subnets). These entities are typically software programs executing on hardware computer platforms, such as end nodes and intermediate network nodes. The intermediate network nodes interconnect the communication links and subnets to enable transmission of data between the end nodes, such as personal computers or workstations. A local area network (LAN) is an example of a subnet that provides relatively short distance communication among the interconnected nodes, whereas a wide area network (WAN) enables long distance communication over links provided by public or private telecommunications facilities. The Internet is an example of a WAN that connects disparate computer networks throughout the world, providing global communication between nodes on various networks.
Communication software executing on the nodes correlates and manages data communication with other nodes. The nodes typically communicate by exchanging discrete messages or packets of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the nodes interact with each other. In addition, network routing software executing on the intermediate nodes allows expansion of communication to other nodes. Collectively, these hardware and software components comprise a collection of computer networks.
Since management of computer networks can prove burdensome, smaller groups of one or more computer networks can be maintained as separate routing domains or autonomous systems (ASes). In this context, a routing domain is broadly construed as a collection of interconnected nodes within a common address space (e.g., a level, area or AS), and an AS is a routing domain managed by a single administrative entity, such as a company, an academic institution or a branch of government. To interconnect dispersed networks and/or provide Internet connectivity, many organizations rely on the infrastructure and facilities of Internet Service Providers (ISPs). An ISP is an example of an AS that typically owns one or more “backbone” networks configured to provide high-speed connection to the Internet. To interconnect private routing domains that are geographically diverse, an organization (customer) may subscribe to one or more ISPs and couple its private domain networks to the ISP's equipment. Here, an intermediate network node, such as a switch or router, may be utilized to interconnect a plurality of private networks to an IP backbone network.
ISP backbone networks generally require fast convergence in order to provide a reliable service to their customers. Convergence, in this context, denotes the ability of a router or network to react to failures or, more generally, to network events and to recover from those failures with minimal disruption time. Examples of such failures include link or node failures. Fast convergence thus involves the ability of the ISP backbone networks to react very quickly to such link and node failures to thereby reroute traffic over alternate paths and, thus, minimize service disruption.
A main component of fast convergence in a router is a routing information base (RIB). The RIB is a process that manages a routing table that holds many (e.g., thousands of) routes computed by different protocols, including both interior gateway protocols (IGPs) and exterior gateway protocols (EGPs). IGPs, such as conventional link-state protocols, are intra-domain routing protocols that define the manner in which routing information and network-topology information are exchanged and processed in a routing domain, such as an ISP backbone network. Examples of conventional link-state protocols include, but are not limited to, the Open Shortest Path First (OSPF) protocol and the Intermediate-System-to-Intermediate-System (ISIS) protocol. The OSPF protocol is described in more detail in Request for Comments (RFC) 2328, entitled OSPF Version 2, dated April 1998, which is incorporated herein by reference in its entirety. The ISIS protocol is described in more detail in RFC 1195, entitled Use of OSI IS-IS for Routing in TCP/IP and Dual Environments, dated December 1990, which is incorporated herein by reference in its entirety.
Each router running a link-state protocol (i.e., IGP) maintains an identical link-state database (LSDB) describing the topology of the routing domain. Each piece of the LSDB is a particular router's local state, e.g., the router's usable interfaces and reachable neighbors or adjacencies. As used herein, neighboring routers (or “neighbors”) are two routers that have interfaces to a common network, wherein an interface is a connection between a router and one of its attached networks. Moreover, an adjacency is a relationship formed between selected neighbors for the purpose of exchanging routing information and abstracting the network topology. One or more router adjacencies may be established over an interface. Each router distributes its local state throughout the domain in accordance with an initial LSDB synchronization process and a conventional flooding algorithm.
In order to guarantee convergence of a link-state protocol, link-state protocol data units (PDUs) that originate after an initial LSDB synchronization between neighbors is completed are delivered to all routers within the flooding scope limits. The PDUs are used to exchange routing information between interconnected routers. The flooding scope limits may comprise an area, a level or the entire AS, depending on the protocol and the type of link-state PDU. An area or level is a collection or group of contiguous networks and nodes (hosts), together with routers having interfaces to any of the included networks. Each area/level runs a separate copy of the link-state routing algorithm and, thus, has its own LSDB. In the case of OSPF, the PDU is a link state advertisement (LSA) comprising a unit of data describing the local state of a router or network, whereas in the case of ISIS, the PDU is a link state packet (LSP). As used herein, an LSA generally describes any message used by an IGP process to communicate routing information among the nodes, such that the collected LSAs of all routers and networks form the LSDB for the particular link-state protocol.
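The delivery of a link-state PDU to every router within a flooding scope can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the flooding procedure of OSPF or ISIS: the function name `flood`, the adjacency map `area`, and the router and LSA identifiers are hypothetical, and acknowledgments, aging and sequence-number comparison are omitted.

```python
from collections import deque

def flood(adjacencies, origin, lsa_id):
    """Deliver one LSA to every router reachable from `origin`.

    adjacencies: dict mapping router -> list of adjacent routers
                 (a hypothetical model of one flooding scope).
    Returns the set of routers that installed the LSA.
    """
    installed = {origin}
    queue = deque([origin])
    while queue:
        router = queue.popleft()
        for neighbor in adjacencies.get(router, ()):
            # A router forwards the LSA only to neighbors that have not
            # installed it yet, which keeps the flood from looping.
            if neighbor not in installed:
                installed.add(neighbor)
                queue.append(neighbor)
    return installed

# Router E has no adjacency in this scope, so it never receives the LSA.
area = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"], "E": []}
received = flood(area, "A", lsa_id="A-router-lsa")
```

In this model every router with at least one adjacency path to the originator ends up holding the same LSA, which is the property the flooding scope is meant to guarantee.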
Broadly stated, the IGP process executing in a sending router typically generates and disseminates an LSA whose routing information includes a list of the router's neighbors and one or more “cost” values associated with each neighbor. A cost value associated with a neighbor is an arbitrary metric used to determine the relative ease/burden of communicating with that neighbor. For instance, the cost value may be measured in terms of the number of hops required to reach the neighbor, the average time for a packet to reach the neighbor, and/or the amount of network traffic or available bandwidth over a communication link coupled to the neighbor.
LSAs are typically transmitted (“advertised”) among the routers until each router can construct the same “view” of the network topology by aggregating the received lists of neighbors and cost values. The IGP process advertises routes internal to the routing domain (“internal routes”) via LSAs that typically comprise the routers' loopback addresses as well as interface/link addresses. A loopback address is a type of “virtual” interface identifier of the router that is stable and always available (does not fail) and, as such, is advertised instead of a physical interface address to ensure that the router can always reach its neighbor. Each router may input this received routing information to a “shortest path first” (SPF) calculation that determines the lowest-cost network paths that couple the router with each of the other network nodes. The well-known Dijkstra algorithm is a conventional technique for performing such an SPF calculation, as described in more detail in Section 12.2.4 of the textbook Interconnections Second Edition, by Radia Perlman, published September 1999.
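An SPF calculation of this kind can be sketched with a textbook Dijkstra implementation. The sketch below is illustrative only; the function name `spf`, the dictionary layout of `lsdb`, and the router names and costs are assumptions chosen for the example rather than details taken from any particular router implementation.

```python
import heapq

def spf(lsdb, root):
    """Compute lowest-cost paths from `root` with Dijkstra's algorithm.

    lsdb: dict mapping node -> {neighbor: cost} (hypothetical layout).
    Returns (cost, parent): the total cost to each reachable node and the
    previous node on its lowest-cost path.
    """
    cost = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    done = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry; a cheaper path was already settled
        done.add(node)
        for neighbor, link_cost in lsdb.get(node, {}).items():
            candidate = dist + link_cost
            if neighbor not in cost or candidate < cost[neighbor]:
                cost[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return cost, parent

lsdb = {
    "R1": {"R2": 10, "R3": 5},
    "R2": {"R1": 10, "R3": 3, "R4": 1},
    "R3": {"R1": 5, "R2": 3, "R4": 9},
    "R4": {"R2": 1, "R3": 9},
}
cost, parent = spf(lsdb, "R1")
# cost   == {"R1": 0, "R2": 8, "R3": 5, "R4": 9}
# parent == {"R1": None, "R2": "R3", "R3": "R1", "R4": "R2"}
```

Here the direct R1-R2 link (cost 10) loses to the two-hop path through R3 (cost 5 + 3 = 8), which shows why the calculation works on accumulated cost rather than hop count.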
The routers typically have a topology table that contains all destinations advertised by neighbors. Each entry in the topology table includes the destination address and a list of neighbors that have advertised the destination. For each neighbor, the entry records the advertised metric, which the neighbor stores in its routing table. The metric that the router uses to reach the destination is also associated with the destination. The metric that the router uses in the routing table, and to advertise to other routers, is the sum of the best-advertised metric from all neighbors and the link cost to the best neighbor. An example of a topology table is the LSDB having a map of every router, its links and the states of those links in the routing domain. The LSDB also has a map of every network and every path to each network in the routing domain.
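The metric selection for a single topology-table entry can be sketched as follows. The sketch takes "best neighbor" to mean the neighbor that minimizes the sum of its advertised metric and the local link cost; that reading, along with the function name `best_metric` and the sample values, is an assumption made for illustration.

```python
def best_metric(advertised, link_cost):
    """Choose the neighbor giving the lowest total metric to one destination.

    advertised: dict neighbor -> metric that neighbor advertised.
    link_cost:  dict neighbor -> cost of the local link to that neighbor.
    Returns (best_neighbor, total_metric).
    """
    totals = {n: advertised[n] + link_cost[n] for n in advertised}
    best = min(totals, key=totals.get)
    return best, totals[best]

# One topology-table entry: three neighbors advertise the same destination.
entry = {"N1": 20, "N2": 15, "N3": 30}
links = {"N1": 5, "N2": 15, "N3": 1}
neighbor, metric = best_metric(entry, links)
# neighbor == "N1", metric == 25
```

Note that N2 advertises the lowest metric (15) yet is not selected, because its link cost raises the total to 30; the router would install and advertise the total of 25 through N1.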
Specifically, the LSA is processed by the IGP process of a receiving router and provided to the RIB so that it can process the advertisement (along with other routing information) to determine best paths for purposes of populating a forwarding table of a forwarding information base (FIB). In a link state protocol, such as ISIS and OSPF, the router that is directly affected by a failure (i.e., closest to the failure) advertises such failure via the LSA to the rest of the network. In response, each router in the network computes a new network topology and, thus, a new path around the failure. To achieve fast convergence, the IGP process of each router re-computes its topology table and updates the routing table to reflect the topology change. More specifically, the SPF calculation is applied to the contents of the LSDB to compute a shortest path to each destination network. To that end, the algorithm prunes the database of alternate paths and creates a loop-free shortest path tree (SPT) of the topological routing domain. The routing table is then updated to correlate destination nodes with network interfaces associated with the lowest-cost paths to reach those nodes, as determined by the SPF calculation.
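The final step, correlating each destination with the interface on its lowest-cost path, can be sketched as a walk up the shortest path tree. The representation of the SPT as a child-to-parent dictionary, the function name `build_routing_table`, and the interface names are hypothetical choices for this example.

```python
def build_routing_table(spt_parent, root, interface_of):
    """Map each destination to the outgoing interface on its shortest path.

    spt_parent:   dict node -> previous node on the path from `root`
                  (the root itself maps to None).
    interface_of: dict directly attached neighbor -> local interface name.
    """
    table = {}
    for dest in spt_parent:
        if dest == root:
            continue
        hop = dest
        # Climb toward the root until reaching the node attached to it;
        # that node is the first hop for this destination.
        while spt_parent[hop] != root:
            hop = spt_parent[hop]
        table[dest] = interface_of[hop]
    return table

spt_parent = {"R1": None, "R2": "R3", "R3": "R1", "R4": "R2", "R5": "R1"}
interface_of = {"R3": "eth0", "R5": "eth1"}
routing_table = build_routing_table(spt_parent, "R1", interface_of)
# routing_table == {"R2": "eth0", "R3": "eth0", "R4": "eth0", "R5": "eth1"}
```

After a failure is advertised, a router would rerun the SPF calculation on the updated LSDB and rebuild this table, so the time spent in these two steps contributes directly to convergence time.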
A plurality of interconnected ASes may be configured to exchange messages in accordance with an EGP, such as the Border Gateway Protocol version 4 (BGP). To implement the BGP protocol, each routing domain (e.g., AS) includes at least one “border” router through which it communicates with other, interconnected ASes. Before transmitting such messages, however, the routers cooperate to establish a logical “peer” connection (session). BGP is an inter-domain routing protocol that generally operates over a reliable transport protocol, such as TCP, to establish a TCP connection/session; any two border routers that have opened a TCP connection (session) to each other for the purpose of exchanging routing information are known as peers or neighbors. BGP performs routing between ASes by exchanging routing (reachability) information among neighbors of the systems.
The routing information exchanged by BGP neighbors typically includes destination address prefixes, i.e., the portions of destination addresses used by the routing protocol to render routing (“next hop”) decisions, and associated path attributes. Examples of such destination addresses include Internet Protocol (IP) version 4 (IPv4) and version 6 (IPv6) addresses, while an example of a path attribute is a next-hop address. Note that the combination of a set of path attributes and a prefix is referred to as a “route”; the terms “route” and “path” may be used interchangeably herein. The BGP routing protocol is well known and described in detail in Request For Comments (RFC) 1771, by Y. Rekhter and T. Li (1995), Internet Draft <draft-ietf-idr-bgp4-20.txt> titled, A Border Gateway Protocol 4 (BGP-4) by Y. Rekhter and T. Li (April 2003) and Interconnections, Bridges and Routers, by R. Perlman, published by Addison Wesley Publishing Company, at pages 323-329 (1992), all disclosures of which are hereby incorporated by reference.
Two BGP-enabled routers (i.e., BGP speakers) that are not in the same AS use external BGP (eBGP) to exchange routes. Internal BGP (iBGP) is a form of BGP that exchanges routes among iBGP neighbors within an AS. BGP speakers within an AS are typically connected via a fully meshed iBGP session arrangement to ensure that all BGP speakers receive route updates from the other BGP speakers in the AS. When a BGP speaker receives updates from multiple ASes that describe different paths to the same destination, the speaker chooses a single best path for reaching that destination (prefix). Once chosen, the speaker uses BGP to propagate that best path to its neighbors. The decision is based on the value of attributes, such as next-hop, contained in a BGP update message and other BGP-configurable factors. In this context, the BGP next-hop attribute is the network (IP) address of the next hop (neighbor) used to reach the destination prefix.
More specifically, each route advertised by BGP must have a next hop address that is reachable through IGP in order for that route to be considered valid. That is, a valid BGP route must contain an attribute (such as a BGP next-hop address) that, in turn, must exist in the routing table of the router through IGP. Both BGP and IGP (OSPF, ISIS) provide routes (best paths per prefix) to the RIB; however, among the prefixes provided by IGP that the RIB installs into the routing table are those prefixes that are used as BGP next hop addresses. These BGP next hop addresses are illustratively loopback addresses of the BGP next-hop routers.
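The validity rule can be sketched as a check of each BGP next hop against the IGP-learned prefixes. This is a minimal illustration using Python's standard `ipaddress` module; the function name `valid_bgp_routes`, the tuple layout of a route, and the documentation-range addresses are assumptions, and a real RIB would perform a longest-prefix-match lookup rather than a linear scan.

```python
import ipaddress

def valid_bgp_routes(bgp_routes, igp_prefixes):
    """Keep only BGP routes whose next hop is covered by an IGP prefix.

    bgp_routes:   list of (destination_prefix, next_hop_address) strings.
    igp_prefixes: list of prefixes installed in the routing table by the IGP.
    """
    igp_nets = [ipaddress.ip_network(p) for p in igp_prefixes]
    valid = []
    for prefix, next_hop in bgp_routes:
        addr = ipaddress.ip_address(next_hop)
        if any(addr in net for net in igp_nets):
            valid.append((prefix, next_hop))
    return valid

# IGP has installed the loopbacks of two BGP next-hop routers.
igp = ["192.0.2.1/32", "192.0.2.2/32"]
bgp = [
    ("198.51.100.0/24", "192.0.2.1"),  # next hop known to the IGP
    ("203.0.113.0/24", "192.0.2.9"),   # next hop unknown to the IGP
]
usable = valid_bgp_routes(bgp, igp)
# usable == [("198.51.100.0/24", "192.0.2.1")]
```

If the IGP route to 192.0.2.1/32 were withdrawn, the first BGP route would become invalid as well, which is why convergence of these particular IGP prefixes affects every BGP route that depends on them.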
As noted, ISP backbone networks require fast convergence in order to provide a reliable service to their customers. Convergence occurs when all of the routers have a consistent perspective (“view”) of the network topology. After a topology change, e.g., one or more link and/or node failures, the routers re-compute their best paths; this typically disrupts the service provided by the ISP. The ISP backbone networks must therefore be able to react quickly to such failures in order to re-route traffic over alternate paths and, thus, minimize service disruption. However, not all routes require fast convergence.
Typically, the routes (addresses) used as BGP next-hop attributes within BGP update messages are considered the most important addresses because they enable connectivity inside and outside of the routing domain. For example, these next-hop addresses are typically addresses of subnets used to connect servers/gateways; BGP relies on them for external activity, i.e., activity external to the routing domain. Yet, the addresses of subnets used to connect servers and gateways could also be part of an internal routing domain. Here, the routers may connect voice over IP (VoIP) servers, such that all IP telephony of the routing domain relies on those servers. Therefore, it is desirable to prioritize these next hop addresses to enable fast convergence.
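One simple way to picture such prioritization is to order the IGP prefixes awaiting a routing-table update so that those serving as BGP next hops are handled first. The sketch below is purely illustrative and is not presented as the technique of this disclosure; the function name `order_for_update` and the sample prefixes are hypothetical.

```python
def order_for_update(igp_prefixes, bgp_next_hops):
    """Return the prefixes with BGP next-hop prefixes moved to the front.

    The sort is stable, so prefixes keep their original relative order
    within the important group and within the remaining group.
    """
    important = set(bgp_next_hops)
    # The key is False (sorts first) for important prefixes, True otherwise.
    return sorted(igp_prefixes, key=lambda prefix: prefix not in important)

pending = ["10.0.0.0/24", "192.0.2.1/32", "10.0.1.0/24", "192.0.2.2/32"]
next_hops = ["192.0.2.2/32", "192.0.2.1/32"]
ordered = order_for_update(pending, next_hops)
# ordered == ["192.0.2.1/32", "192.0.2.2/32", "10.0.0.0/24", "10.0.1.0/24"]
```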