A computer network is a collection of interconnected computing devices that can exchange data and share resources. In a packet-based network, such as the Internet, the computing devices communicate data by dividing the data into small blocks called packets, which are individually routed across the network from a source device to a destination device. The destination device extracts the data from the packets and assembles the data into its original form. Dividing the data into packets enables the source device to resend only those individual packets that may be lost during transmission.
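As a rough illustration of the packetization described above, the following Python sketch splits data into numbered blocks, reports which blocks are missing at the destination, and reassembles the data once every block has arrived. The `Packet` fields and the function names are hypothetical and chosen only for this example; real protocols carry considerably more header information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int        # position of this block within the original data
    payload: bytes  # the block of data carried by this packet


def packetize(data: bytes, size: int) -> list:
    """Divide data into packets carrying at most `size` bytes each."""
    return [Packet(i // size, data[i:i + size]) for i in range(0, len(data), size)]


def missing(received: list, total: int) -> list:
    """Return the sequence numbers the source would need to resend."""
    return sorted(set(range(total)) - {p.seq for p in received})


def reassemble(received: list) -> bytes:
    """Restore the original data; packets may have arrived in any order."""
    return b"".join(p.payload for p in sorted(received, key=lambda p: p.seq))


data = b"hello, packet world"
packets = packetize(data, 5)
arrived = [packets[3], packets[0], packets[2]]   # packet 1 was lost in transit
lost = missing(arrived, len(packets))            # only these need resending
restored = reassemble(arrived + [packets[s] for s in lost])
```

Only the packets named in `lost` are retransmitted, which is the benefit the paragraph above attributes to packet-based transmission.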
Routers are network devices whose primary function is to route packets through the network from the source device to the destination device. Routers are distributed throughout the network, and each router typically maintains routing information that describes available routes through the network. A “route” can be generally defined as a path between two locations on the network. Upon receiving an incoming packet, the router examines information within the packet that indicates the packet's destination, and forwards the packet toward its destination in accordance with the routing information maintained in the router. Available routes within the network may change, however, as network events occur, such as the failure of an existing router or the addition of a new router in the network.
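One common way a router may select a route for a packet's destination is longest-prefix matching. The text above does not name a lookup method, so the following Python sketch is an assumption offered for illustration only; the `lookup` function and the route table contents are hypothetical.

```python
import ipaddress


def lookup(routes: dict, destination: str):
    """Return the next hop of the most specific route covering destination.

    `routes` maps prefix strings to next-hop names. A linear scan is used
    for clarity; production routers use specialized data structures.
    """
    address = ipaddress.ip_address(destination)
    best_network, best_next_hop = None, None
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_next_hop = network, next_hop
    return best_next_hop


routes = {
    "0.0.0.0/0": "router-C",     # default route
    "10.0.0.0/8": "router-A",
    "10.1.0.0/16": "router-B",   # more specific than 10.0.0.0/8
}
```

With this table, a packet addressed to 10.1.2.3 matches all three prefixes and is forwarded toward `router-B`, the most specific match.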
In order to maintain an accurate representation of the network, routers periodically exchange routing information in accordance with a defined protocol, such as the Border Gateway Protocol (BGP). Large computer networks, such as the Internet, often include many routers grouped into administrative domains called “autonomous systems.” When routers of different autonomous systems use BGP to exchange information, the protocol is referred to as External BGP (EBGP). When routers within an autonomous system use BGP to exchange routing information, the protocol is referred to as Internal BGP (IBGP). Another exemplary protocol for exchanging routing information is the Intermediate System to Intermediate System (IS-IS) protocol, which is an interior gateway routing protocol used by Internet Protocol (IP)-based networks for communicating link-state information within an autonomous system. Other examples of interior gateway routing protocols include the Open Shortest Path First (OSPF) protocol and the Routing Information Protocol (RIP).
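The EBGP and IBGP distinction drawn above reduces to comparing the autonomous system numbers of the two routers. A minimal Python sketch follows; the function name is hypothetical, and the AS numbers in the usage lines are taken from the private-use range purely as examples.

```python
def bgp_session_type(local_as: int, peer_as: int) -> str:
    """Classify a BGP session by comparing autonomous system numbers."""
    # Same autonomous system: Internal BGP. Different: External BGP.
    return "IBGP" if local_as == peer_as else "EBGP"


internal = bgp_session_type(64512, 64512)   # both routers in one AS
external = bgp_session_type(64512, 64513)   # routers in different ASes
```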
Routers periodically communicate with other routers in order to exchange information indicative of the ever-evolving network topology. When routers initially establish communication, the routers may exchange all of their routing information and update their respective stored information based on the routing information received from other routers. Also, after establishing communication, the routers may send control messages to incrementally update the routing information when the network topology changes. For example, the routers may send update messages to advertise newly available routes, or to withdraw routes that are no longer available.
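The incremental updates described above can be sketched as a message that carries withdrawn and newly advertised routes, applied to a table that maps destinations to next hops. The `Update` structure below is a hypothetical simplification for illustration and does not reproduce the message format of any particular routing protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    advertised: dict = field(default_factory=dict)  # prefix -> next hop
    withdrawn: list = field(default_factory=list)   # prefixes no longer reachable


def apply_update(table: dict, update: Update) -> None:
    """Apply one incremental update to a routing table in place."""
    for prefix in update.withdrawn:
        table.pop(prefix, None)        # ignore prefixes that are already absent
    table.update(update.advertised)    # add new routes or replace existing ones


# Initial exchange: the peer sends all of its routing information.
table = {}
apply_update(table, Update(advertised={"10.0.0.0/8": "R1", "192.168.0.0/16": "R2"}))

# Later: a topology change withdraws one route and advertises another.
apply_update(table, Update(advertised={"172.16.0.0/12": "R3"},
                           withdrawn=["192.168.0.0/16"]))
```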
The routing information is typically maintained in the routers in the form of one or more routing tables or other data structures. The form and contents of the routing tables often depend on the routing algorithm implemented by the router. Furthermore, some routers generate and maintain forwarding information in accordance with the routing information. The forwarding information associates network routes with specific forwarding next hops and corresponding interface ports of the router. The forwarding information, therefore, is based on the information contained within the routing information. By maintaining forwarding information, forwarding next hops can be quickly identified, resulting in improved throughput by a given router.
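The relationship between routing information and forwarding information can be sketched as follows. The example assumes that the routing information holds several candidate routes per destination, each with a numeric metric, and that the lowest metric wins; that selection rule, the `Route` type, and the interface names are assumptions made for illustration.

```python
from typing import NamedTuple


class Route(NamedTuple):
    prefix: str
    next_hop: str
    metric: int   # lower is preferred (an assumption of this sketch)


def build_forwarding_table(routes: list, interface_for: dict) -> dict:
    """Reduce candidate routes to one (next hop, interface) entry per prefix."""
    best = {}
    for route in routes:
        current = best.get(route.prefix)
        if current is None or route.metric < current.metric:
            best[route.prefix] = route
    return {
        prefix: (route.next_hop, interface_for[route.next_hop])
        for prefix, route in best.items()
    }


routing_info = [
    Route("10.0.0.0/8", "R1", 20),
    Route("10.0.0.0/8", "R2", 10),   # preferred: lower metric
    Route("172.16.0.0/12", "R1", 5),
]
interfaces = {"R1": "port-0", "R2": "port-1"}
forwarding = build_forwarding_table(routing_info, interfaces)
```

Because the selection is done once when the table is built, forwarding a packet requires only a lookup in `forwarding`, which is the throughput benefit the paragraph above describes.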
The connection between two devices on a network is generally referred to as a link. Connections between devices of different autonomous systems are referred to as external links while connections between devices within the same autonomous system are referred to as internal links. Many conventional computer networks, including the Internet, are designed to dynamically reroute data packets when an individual link fails. Upon failure of a link, the routers transmit new connectivity information to neighboring devices, allowing each device to update its local routing table. Links can fail for any number of reasons, such as failure of the physical infrastructure between the devices, or failure of the devices interfacing with the link.
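Rerouting around a failed link can be illustrated with a small topology. The sketch below uses a breadth-first search to find a path with the fewest hops before and after a link is removed; hop count as the selection criterion, the device names, and the `shortest_path` function are assumptions of this example rather than the behavior of any specific routing protocol.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Return a fewest-hop path from src to dst over bidirectional links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:      # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for peer in sorted(adjacency.get(node, ())):
            if peer not in previous:
                previous[peer] = node
                queue.append(peer)
    return None                          # destination unreachable


links = {("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")}
before = shortest_path(links, "A", "D")                  # via B
after = shortest_path(links - {("B", "D")}, "A", "D")    # link B-D has failed
```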
According to many routing protocols, when a router detects a link failure, the router broadcasts one or more update messages to inform neighboring routers that certain routes are no longer available and should be removed from local routing tables. The receiving routers update their routing tables based on this information and send update messages to their neighbors. This process repeats itself and the update information propagates from router to router. The form of the update message depends on the type of routing algorithm used.
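The router-to-router propagation described above can be sketched as a simple simulation in which each router that removes a route notifies its neighbors in turn. The topology, the per-router sets of known prefixes, and the `propagate_withdrawal` function are hypothetical; actual update messages and their handling depend on the routing algorithm in use.

```python
from collections import deque


def propagate_withdrawal(neighbors: dict, tables: dict, origin: str, prefix: str) -> list:
    """Remove prefix at origin, then let the withdrawal spread hop by hop.

    Returns the routers in the order they processed the withdrawal.
    """
    tables[origin].discard(prefix)
    order = [origin]
    queue = deque([origin])
    while queue:
        router = queue.popleft()
        for peer in neighbors[router]:
            # A peer acts (and re-advertises) only if it still holds the route.
            if prefix in tables[peer]:
                tables[peer].discard(prefix)
                order.append(peer)
                queue.append(peer)
    return order


neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
tables = {name: {"10.0.0.0/8"} for name in neighbors}
order = propagate_withdrawal(neighbors, tables, "A", "10.0.0.0/8")
```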
Due to the size and complexity of the routing information maintained by routers within a large network, such as the Internet, a single change in network topology may require updating tens of thousands, if not hundreds of thousands, of individual routes. For example, it is not uncommon for thousands of routes through the system to depend on a single router. Accordingly, a single network event, such as the failure of a router, can force routers within the system to update hundreds of thousands of routes, which can consume considerable computing resources and substantially delay rerouting packets.
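To give a sense of the scale involved, the following sketch builds a synthetic routing table and counts how many entries must be updated when one next hop fails. The table contents and the `routes_affected` function are invented for illustration and do not represent measurements from any real network.

```python
def routes_affected(table: dict, failed_next_hop: str) -> list:
    """Return every destination whose route uses the failed next hop."""
    return [dest for dest, next_hop in table.items() if next_hop == failed_next_hop]


# Synthetic table: 200,000 destinations split evenly across two next hops.
table = {
    f"destination-{i}": ("R1" if i % 2 == 0 else "R2")
    for i in range(200_000)
}
affected = routes_affected(table, "R1")   # each of these must be updated
```

In this synthetic case a single failure touches half of the table, and every affected entry must be individually rewritten or removed.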
Routing tables in large networks may take a long time to converge to stable routing information after a network fault. One recognized cause of the delay is temporary oscillations, i.e., changes that occur within the routing tables until they converge to reflect the current network topology. These oscillations in routing information, often referred to as “flaps” or “route flaps,” can cause significant problems, including intermittent loss of network connectivity as well as increased packet loss and latency.
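One simple way to quantify such oscillation is to count how many times the state of a single route changes before it settles. The sketch below assumes a recorded history of next-hop values for one destination, with `None` meaning the route was withdrawn; the history and the `count_flaps` function are hypothetical.

```python
def count_flaps(history: list) -> int:
    """Count the state changes between consecutive observations of a route."""
    return sum(1 for earlier, later in zip(history, history[1:]) if earlier != later)


# Observed next hop for one destination while the network converges.
history = ["R1", None, "R2", None, "R2", "R2"]
flaps = count_flaps(history)
```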