A computer network is a geographically distributed collection of interconnected communication links for transporting data between nodes, such as computers. Many types of computer networks are available, with the types ranging from local area networks (LANs) to wide area networks (WANs). The nodes typically communicate by exchanging discrete frames or packets of data according to predefined protocols. In this context, a protocol consists of a set of rules defining how the nodes interact with each other.
Computer networks may be further interconnected by an intermediate node, called a router, to extend the effective “size” of each network. Since management of a large system of interconnected computer networks can prove burdensome, smaller groups of computer networks may be maintained as routing domains or autonomous systems. The networks within an autonomous system are typically coupled together by conventional intradomain routers. These routers manage communication among local networks within their domains and communicate with each other using an intradomain routing (or an interior gateway) protocol. An example of such a protocol is the Open Shortest Path First (OSPF) routing protocol. The OSPF protocol is based on link-state technology and, therefore, is hereinafter referred to as a link state routing protocol.
Each router running the link state routing protocol maintains an identical link state database (LSDB) describing the topology of the autonomous system (AS). Each individual piece of the LSDB is a particular router's local state, e.g., the router's usable interfaces and reachable neighbors or adjacencies. As used herein, neighboring routers (or “neighbors”) are two routers that have interfaces to a common network, wherein an interface is a connection between a router and one of its attached networks. Moreover, an adjacency is a relationship formed between selected neighboring routers for the purpose of exchanging routing information and abstracting the network topology. One or more router adjacencies may be established over an interface. Each router distributes its local state throughout the domain in accordance with an initial LSDB synchronization process and a conventional flooding algorithm.
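By way of illustration only, the LSDB described above may be sketched as a mapping from router identifier to that router's local state. The class names, field names and router identifiers below are hypothetical and are not taken from any particular OSPF implementation; the sketch merely shows that two routers that install the same local states end up with identical databases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalState:
    """One piece of the LSDB: a single router's local state (hypothetical layout)."""
    router_id: str
    interfaces: frozenset   # usable interfaces of the router
    adjacencies: frozenset  # router IDs of adjacent neighbors


class LinkStateDatabase:
    """Illustrative LSDB: one LocalState entry per router in the AS."""

    def __init__(self):
        self.entries = {}

    def install(self, state):
        # A newer copy of a router's local state replaces the older one.
        self.entries[state.router_id] = state

    def topology(self):
        # Abstract the stored local states into an adjacency map.
        return {rid: set(s.adjacencies) for rid, s in self.entries.items()}


# Two routers, each distributing its local state to both databases.
r1 = LocalState("1.1.1.1", frozenset({"eth0"}), frozenset({"2.2.2.2"}))
r2 = LocalState("2.2.2.2", frozenset({"eth0"}), frozenset({"1.1.1.1"}))
db_a, db_b = LinkStateDatabase(), LinkStateDatabase()
for db in (db_a, db_b):
    for state in (r1, r2):
        db.install(state)
```

Once both local states have been distributed, `db_a.entries` and `db_b.entries` compare equal, which corresponds to the requirement that each router maintain an identical LSDB.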
In order to guarantee convergence of a link state routing protocol, it should be ensured that link state protocol data units (PDUs) that originate after an initial LSDB synchronization between neighbors is completed are delivered to all routers within the flooding scope limits. These limits may comprise an area or the entire AS, depending on the protocol and the type of link-state PDU. An area is a collection or group of contiguous networks and nodes (hosts), together with routers having interfaces to any of the included networks. Each area runs a separate copy of the link state routing algorithm and, thus, has its own LSDB. In the case of OSPF, the PDU is a link state advertisement (LSA) comprising a unit of data describing the local state of a router or network. The collected PDUs of all routers and networks form the LSDB for the particular link state routing protocol.
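The notion of a flooding scope may be illustrated with a simplified sketch in which a PDU is propagated hop by hop and is not forwarded to routers outside the scope. The function name `flood`, the `links` and `areas_of` mappings, and the example topology are assumptions made for illustration; a conventional flooding algorithm also involves acknowledgments and retransmissions, which are omitted here.

```python
from collections import deque


def flood(origin, links, areas_of, scope_area=None):
    """Return the set of routers reached by a PDU originated at `origin`.

    links:      router -> iterable of directly connected routers
    areas_of:   router -> set of areas the router has interfaces in
    scope_area: area limiting the flood, or None for AS-wide scope
    """
    reached = {origin}
    queue = deque([origin])
    while queue:
        router = queue.popleft()
        for neighbor in links.get(router, ()):
            if neighbor in reached:
                continue
            # Area-scoped PDUs are not delivered to routers outside the area.
            if scope_area is not None and scope_area not in areas_of[neighbor]:
                continue
            reached.add(neighbor)
            queue.append(neighbor)
    return reached


# Hypothetical topology: B has interfaces in both area 0 and area 1.
links = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
areas_of = {"A": {0}, "B": {0, 1}, "C": {1}}
area_scope = flood("A", links, areas_of, scope_area=0)
as_scope = flood("A", links, areas_of)
```

In this example an area-scoped PDU originated by router A reaches only A and B, whereas a PDU with AS-wide scope reaches all three routers.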
The infrastructure of a typical router comprises functional components organized as a control plane and a data plane. The control plane includes the functional components needed to manage the traffic forwarding features of the router. These features include routing protocols, configuration information and other similar functions that determine the destinations of data packets based on information other than that contained within the packets. The data plane, on the other hand, includes functional components needed to perform forwarding operations for the packets.
For a single processor router, the control and data planes are typically implemented within the single processor. However, for some high performance routers, these planes are implemented within separate devices of the intermediate node. For example, the control plane may be implemented in a supervisor processor, such as a route processor, whereas the data plane may be implemented within a hardware-assist device, such as a co-processor or a forwarding processor. In other words, the data plane is typically implemented in a specialized piece of hardware that is separate from the hardware that implements the control plane.
The control plane generally tends to be more complex than the data plane in terms of the quality and quantity of software operating on the supervisor processor. Therefore, failures are more likely to occur in the supervisor processor when executing such complicated code. In order to ensure high availability in an intermediate network node, it is desirable to configure the node such that if a failure arises with the control plane that requires restarting and reloading of software executing on the supervisor processor, the data plane continues to operate correctly. Restarting and reloading of control plane software may be necessary because of a failure with the routing protocol process, e.g., an OSPF module, or a software upgrade to the OSPF module. A router that is configured to enable its data plane to continue packet forwarding operations during restart and reload of the control plane software is referred to as a non-stop forwarding (NSF) capable router.
As noted, the OSPF routing protocol creates adjacencies between neighboring routers for the purpose of exchanging routing information. These adjacencies are established and maintained through the use of a conventional Hello protocol. The Hello protocol is described in Request for Comments (RFC) 2328, OSPF Version 2, by J. Moy (1998). Broadly stated, the Hello protocol ensures that communication between neighbors is bi-directional by periodically sending Hello packets out all router interfaces. Bi-directional communication is indicated when the router “sees” itself listed in the neighbor's Hello packet. On broadcast and non-broadcast multi-access (NBMA) networks, the Hello protocol elects a designated router (DR) for the network.
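The bi-directional check described above may be sketched as follows. The `Hello` class and its fields are a hypothetical reduction of the actual Hello packet, which carries additional fields not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hello:
    """Hypothetical, reduced view of a Hello packet."""
    sender_id: str
    neighbors_seen: tuple  # router IDs the sender has recently heard from


def is_bidirectional(my_router_id, hello):
    # Communication is bi-directional when the receiving router
    # "sees" itself listed in the neighbor's Hello packet.
    return my_router_id in hello.neighbors_seen


heard_us = Hello(sender_id="2.2.2.2", neighbors_seen=("1.1.1.1", "3.3.3.3"))
not_heard_us = Hello(sender_id="2.2.2.2", neighbors_seen=())
```

Here `is_bidirectional("1.1.1.1", heard_us)` is true, whereas the same check against `not_heard_us` is false because the sender has not yet listed the receiving router.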
For conventional link-state routing protocols, the periodically sent Hello packets are used to ensure reachability of the neighboring routers. If no Hello packet has been received from a router within a predefined period of time, e.g., a RouterDeadInterval in OSPF (40 seconds by default), its neighbors declare the router to be unreachable and stop forwarding traffic to it. The RouterDeadInterval (“inactivity timer”) is used for maintaining adjacencies with neighboring routers, whereas the Hello interval is the period of time between successive Hello packets. In addition, the neighbors inform other routers about the unreachable router and, thus, the change to the network topology. Therefore, a control plane restart is “visible” to neighboring routers as a topology change in the network that requires those neighbors having “knowledge” of the network to recompute their routing databases and route around the unreachable router.
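The operation of the inactivity timer may be sketched as a table that records when each neighbor was last heard from. The class and method names are hypothetical, timestamps are passed in explicitly rather than read from a clock, and the 40 second value is the default noted above.

```python
class NeighborTable:
    """Illustrative inactivity-timer bookkeeping for one router."""

    def __init__(self, dead_interval=40.0):
        self.dead_interval = dead_interval  # RouterDeadInterval, in seconds
        self.last_heard = {}                # router ID -> time of last Hello

    def hello_received(self, router_id, now):
        # Each received Hello packet restarts the neighbor's inactivity timer.
        self.last_heard[router_id] = now

    def expire(self, now):
        # Neighbors silent for the whole dead interval are declared unreachable.
        dead = [rid for rid, heard in self.last_heard.items()
                if now - heard >= self.dead_interval]
        for rid in dead:
            del self.last_heard[rid]
        return dead


table = NeighborTable()
table.hello_received("2.2.2.2", now=0.0)
still_up = table.expire(now=39.0)   # timer has not yet expired
declared_down = table.expire(now=40.0)
```

In this sketch the neighbor remains in the table at 39 seconds and is declared unreachable at 40 seconds, at which point a real router would also inform other routers of the topology change.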
Assume the OSPF routing protocol module is reloaded in an NSF-capable router. During the reload process, the router stops sending Hello packets out of its interfaces, which results in “dropping” of the existing adjacencies with its neighboring routers. As a result, the NSF router does not receive any data traffic from the routers on the network even though it may be able to forward traffic to the network. Consequently, for an NSF-capable router to still forward transit (packet) traffic, it must send its first Hello packet after the routing software reload within the inactivity timer interval. This is typically not an issue since most modern routers can reload or activate a secondary control board (or new OSPF module) within the inactivity timer interval. If the reload process takes more than 40 seconds, the Hello and inactivity timer intervals may be configured with greater values.
A problem with the Hello protocol is that when a router reloads, it does not know its neighbors until it receives at least one Hello packet from each of them. All Hello packets include a list of neighbors that the sending router has “heard from”. If the router sends a Hello packet not listing its neighbors, the neighbors drop the adjacencies because a 2-way connectivity condition has not been met. In other words, the reloading router cannot send Hello packets over its interfaces until those preexisting neighbor adjacencies are known to it. Otherwise, the Hello packets are sent without the router ID of the neighboring router, causing the neighbor to reset its adjacency with the router and eventually stop forwarding traffic to it. A goal of an NSF-capable router is to continue forwarding traffic during restart/reload of control plane software, such as the OSPF routing protocol software, so that the reload is transparent to the router's neighbors. Accordingly, the present invention provides a backward-compatible technique that allows the router to identify its neighbors after reload of routing protocol software to thereby maintain (i.e., “keep up”) its adjacencies with its neighboring routers.
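The problem may be illustrated from the neighbor's point of view with a simplified sketch. The function name and router identifiers are hypothetical, and the neighbor state is reduced to a set of adjacent router IDs rather than the full neighbor state machine of the protocol. The sketch does not represent the technique of the present invention; it shows only the conventional behavior that the technique is intended to avoid.

```python
def neighbor_processes_hello(neighbor_id, adjacencies, sender_id, listed_ids):
    """Update a neighbor's adjacency set on receipt of a Hello packet.

    Returns True if the adjacency with the sender is kept, or False if it
    is reset because the 2-way connectivity condition has not been met.
    """
    if neighbor_id in listed_ids:
        adjacencies.add(sender_id)
        return True
    adjacencies.discard(sender_id)
    return False


# R2 holds an adjacency with R1 from before R1's reload.
r2_adjacencies = {"R1"}

# After reload, R1 does not yet know its neighbors and sends an empty list.
kept_when_unlisted = neighbor_processes_hello("R2", r2_adjacencies, "R1", [])

# Had R1 known of R2 and listed it, the adjacency would have been kept.
r2_adjacencies_listed = {"R1"}
kept_when_listed = neighbor_processes_hello("R2", r2_adjacencies_listed, "R1", ["R2"])
```

In the first case the Hello packet omits R2, so R2 resets its adjacency with R1; in the second case R2 is listed and the adjacency is kept up.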