A computer network is a collection of interconnected computing devices that exchange data and share resources. In a packet-based network, such as the Internet, computing devices communicate data by dividing the data into small blocks called packets, which are individually routed across the network from a source device to a destination device. The destination device extracts the data from the packets and assembles the data into its original form. Dividing the data into packets enables the source device to resend only those individual packets that may be lost during transmission.
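As an illustration of the packetization described above, the following minimal sketch splits data into sequence-numbered packets, simulates the loss of one packet, resends only that packet, and reassembles the original data. The function names, the 8-byte packet size, and the use of a sequence number as the packet key are assumptions made for this example and do not represent any particular protocol.

```python
from typing import Dict, List


def packetize(data: bytes, size: int) -> Dict[int, bytes]:
    """Split data into packets of at most `size` bytes, keyed by sequence number."""
    return {seq: data[off:off + size]
            for seq, off in enumerate(range(0, len(data), size))}


def missing(received: Dict[int, bytes], total: int) -> List[int]:
    """Return the sequence numbers that have not arrived."""
    return [seq for seq in range(total) if seq not in received]


def reassemble(received: Dict[int, bytes], total: int) -> bytes:
    """Concatenate the packets in sequence order to recover the original data."""
    return b"".join(received[seq] for seq in range(total))


message = b"routers forward packets hop by hop"
sent = packetize(message, 8)

# Simulate the loss of packet 2 during transmission.
received = {seq: payload for seq, payload in sent.items() if seq != 2}

# The source resends only the packets reported as lost.
lost = missing(received, len(sent))
for seq in lost:
    received[seq] = sent[seq]

result = reassemble(received, len(sent))
```

Only the single lost packet is retransmitted; the remaining packets are used as originally received.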
Certain devices within the network, such as routers, maintain routing information that describes routes through the network. Each route defines a path between two locations on the network. From the routing information, the routers may generate forwarding information, which is used by the routers to relay packet flows through the network and, more particularly, to relay the packet flows to a next hop. In reference to forwarding a packet, the “next hop” from a network router typically refers to a downstream neighboring device along a given route. Upon receiving an incoming packet, the router examines information within the packet to identify the destination for the packet. Based on the destination, the router forwards the packet in accordance with the forwarding information.
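The forwarding step described above may be sketched as a table lookup keyed on the destination address. The sketch below assumes a longest-prefix-match selection rule, which the text above does not specify, and the table contents and router names are hypothetical.

```python
import ipaddress
from typing import Optional

# Hypothetical forwarding information: destination prefix -> next hop.
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "router-a",    # default route
    ipaddress.ip_network("10.0.0.0/8"): "router-b",
    ipaddress.ip_network("10.1.0.0/16"): "router-c",
}


def next_hop(destination: str) -> Optional[str]:
    """Return the next hop for a destination address, or None if no route matches."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in FORWARDING_TABLE if addr in net]
    if not matches:
        return None
    # Assumed rule: prefer the most specific (longest) matching prefix.
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]
```

For example, under these assumptions `next_hop("10.1.2.3")` selects the entry for `10.1.0.0/16` rather than the less specific `10.0.0.0/8` entry or the default route.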
Some computer networks, such as the Internet or an administrative domain, often include many routers that exchange routing information according to a defined routing protocol. Examples of the defined routing protocol may include, among others, the Border Gateway Protocol (BGP), the Intermediate System to Intermediate System (IS-IS) Protocol, and the Open Shortest Path First (OSPF) Protocol. When two routers initially connect, the routers exchange routing information and generate forwarding information from the exchanged routing information. Particularly, the two routers initiate a routing communication “session” via which they exchange routing information according to the defined routing protocol. The routers continue to communicate via the routing protocol to incrementally update the routing information and, in turn, update their forwarding information in accordance with changes to a topology of the network indicated in the updated routing information. For example, the routers may send update messages to advertise newly available routes or to inform peers that some routes are no longer available.
A computer network utilizing BGP directs data packets between network nodes based on addressing information within the data packets. A BGP network may include one or more routers, nodes, and endpoint devices (e.g., servers, printers, and computers). Some of the routers within the BGP network may be grouped together into redundant clusters. Each router within the BGP network typically forwards packets according to routes stored at the router and the destination address of the data packets.
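The incremental update behavior described above may be illustrated with a toy routing table that applies advertise and withdraw messages from peers and then regenerates forwarding information. The `RoutingTable` class, its method names, and the selection rule (alphabetically first peer) are hypothetical simplifications and do not correspond to the message formats or decision process of BGP, IS-IS, or OSPF.

```python
from typing import Dict, Iterable, Set


class RoutingTable:
    """Toy routing information: which peers currently advertise each prefix."""

    def __init__(self) -> None:
        self.routes: Dict[str, Set[str]] = {}

    def apply_update(self, peer: str,
                     advertised: Iterable[str] = (),
                     withdrawn: Iterable[str] = ()) -> None:
        """Apply one update message received from `peer`."""
        for prefix in advertised:
            self.routes.setdefault(prefix, set()).add(peer)
        for prefix in withdrawn:
            peers = self.routes.get(prefix, set())
            peers.discard(peer)
            if not peers:
                # No peer advertises this prefix any longer.
                self.routes.pop(prefix, None)

    def forwarding_info(self) -> Dict[str, str]:
        """Derive forwarding information (prefix -> next hop) from the routes.

        Assumed toy rule: pick the alphabetically first advertising peer.
        """
        return {prefix: min(peers) for prefix, peers in self.routes.items()}


table = RoutingTable()
table.apply_update("peer-a", advertised=["10.0.0.0/8", "192.0.2.0/24"])
table.apply_update("peer-b", advertised=["10.0.0.0/8"])
before = table.forwarding_info()

# peer-a later informs that both routes are no longer available through it.
table.apply_update("peer-a", withdrawn=["10.0.0.0/8", "192.0.2.0/24"])
after = table.forwarding_info()
```

After the withdrawal, the prefix that is still advertised by `peer-b` remains reachable through that peer, while the prefix that only `peer-a` advertised is removed.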
In the event of a failure of a routing communication session with a failed router, i.e., the session faults or “goes down,” a surviving router may select one or more alternative routes through the computer network to avoid the failed router and continue forwarding packet flows. In particular, the surviving router may update internal routing information to reflect the failure, perform route resolution based on the updated routing information to select one or more alternative routes, update its forwarding information based on the selected routes, and send one or more update messages to inform peer routers of the routes that are no longer available. In turn, the receiving routers update their routing and forwarding information, and send update messages to their peers. This process continues, and the update information may propagate outward until it reaches all of the routers within the network. Routing information in large networks may take a long period of time to converge to a stable state after a network fault due to temporary oscillations, i.e., changes that occur within the routing information until it converges to reflect the current network topology. These oscillations within the routing information are often referred to as “flaps,” and can cause significant problems, including intermittent loss of network connectivity, increased packet loss, and increased latency.
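The sequence of steps performed by the surviving router may be sketched as follows. This is a simplified illustration under stated assumptions: routes are modeled as (next hop, cost) pairs, route resolution is assumed to pick the lowest-cost remaining route, and the function name `handle_session_failure` and the sample prefixes are hypothetical.

```python
from typing import Dict, List, Tuple

Route = Tuple[str, int]  # (next-hop peer, cost); the cost metric is an assumption


def handle_session_failure(routing_info: Dict[str, List[Route]],
                           failed_peer: str) -> Tuple[Dict[str, str], List[str]]:
    """Return (new forwarding information, prefixes to withdraw from peers)."""
    forwarding: Dict[str, str] = {}
    withdrawn: List[str] = []
    for prefix in list(routing_info):
        # Step 1: update routing information to reflect the failure.
        remaining = [r for r in routing_info[prefix] if r[0] != failed_peer]
        if remaining:
            routing_info[prefix] = remaining
            # Steps 2 and 3: resolve an alternative route (assumed: lowest cost)
            # and record it in the forwarding information.
            best = min(remaining, key=lambda r: r[1])
            forwarding[prefix] = best[0]
        else:
            # Step 4: no alternative exists, so peers must be told the
            # route is no longer available.
            del routing_info[prefix]
            withdrawn.append(prefix)
    return forwarding, withdrawn


routing_info = {
    "10.0.0.0/8": [("router-b", 10), ("router-c", 20)],
    "192.0.2.0/24": [("router-b", 5)],
    "198.51.100.0/24": [("router-c", 7)],
}
forwarding, withdrawn = handle_session_failure(routing_info, "router-b")
```

In this example, traffic for `10.0.0.0/8` shifts to `router-c`, while `192.0.2.0/24` has no remaining route and would be announced to peers as unavailable. Each peer receiving such an announcement would repeat the same steps, which is how the change propagates outward.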
As one technique for reducing the impact of failures, some routers support “graceful restart,” which refers to the capability of preserving forwarding information while restarting a routing communication session with a peer router that may have failed. When establishing a routing communication session, a router that supports graceful restart may advertise the capability to the peer router and may specify a restart time. The restart time is the estimated time for the router to reestablish the routing communication session after failure of the previous session. Upon failure of the routing communication session, the surviving router preserves any forwarding information currently in its forwarding plane that was learned from the failed router, based on the expectation that the failed router will shortly reestablish the routing communication session. In other words, the surviving router will maintain the failed router within a forwarding path of the surviving router for a “grace period” in the event of a failure of the routing communication session. During the grace period, the failed router preserves forwarding information in a state that existed prior to the failure and may relearn the network topology and recalculate its routing information and forwarding information. Consequently, the surviving router does not need to find alternative routes unless the failed router does not reestablish the routing communication session within the advertised restart time. Moreover, the surviving router does not propagate a change in the state of the failed router to the network during a graceful restart interval. As a result, the routing instability caused by routing flaps within the network may be reduced.
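The grace-period decision made by the surviving router may be sketched as a simple timer check. The `PeerState` class, the function names, and the use of explicit timestamps in seconds are assumptions for illustration; treating the exact expiry instant as still within the grace period is also an assumption.

```python
from typing import Optional


class PeerState:
    """State a surviving router keeps about one peer (hypothetical structure)."""

    def __init__(self, name: str, restart_time: float) -> None:
        self.name = name
        self.restart_time = restart_time  # seconds, advertised at session setup
        self.session_up = True
        self.failed_at: Optional[float] = None


def on_session_failure(peer: PeerState, now: float) -> None:
    """Record the failure time; forwarding information is left untouched."""
    peer.session_up = False
    peer.failed_at = now


def on_session_reestablished(peer: PeerState) -> None:
    """The peer came back, so the grace period ends without any rerouting."""
    peer.session_up = True
    peer.failed_at = None


def keep_forwarding_via(peer: PeerState, now: float) -> bool:
    """True while routes through `peer` should remain in the forwarding plane."""
    if peer.session_up:
        return True
    # Preserve forwarding state only within the advertised restart time.
    return (now - peer.failed_at) <= peer.restart_time


peer = PeerState("router-b", restart_time=120.0)
on_session_failure(peer, now=1000.0)
within_grace = keep_forwarding_via(peer, now=1060.0)   # 60 s after the failure
after_grace = keep_forwarding_via(peer, now=1121.0)    # 121 s after the failure
```

Only when `keep_forwarding_via` returns false would the surviving router fall back to selecting alternative routes and informing its peers, which is why a restart completed within the grace period produces no update messages.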