The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art merely by inclusion in this section.
In some communication networks, routers transfer data packets between edges of a network. A router receives a data packet that indicates a destination for the data packet and forwards the data packet to an adjacent router associated with the destination. Each router maintains a routing database, sometimes called a “routing table” or “routing information base” (RIB). The routing database associates each destination with one or more adjacent routers. Some routing databases also include a measure of the cost of using a particular adjacent router to reach a particular destination. The router selects an adjacent router based on the information in the routing database and forwards the data packet to the selected router. The data in the routing database thus controls the transfer of data packets through the router.
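As a minimal illustrative sketch of the lookup described above, the following assumes a routing database held as a mapping from each destination to a list of (adjacent router, cost) pairs, with the lowest-cost adjacent router selected. The names `RoutingDatabase`, `select_next_hop`, the router labels, and the address prefixes are hypothetical and are not drawn from any particular routing protocol.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: destination -> list of (adjacent router, cost) pairs.
RoutingDatabase = Dict[str, List[Tuple[str, int]]]


def select_next_hop(rib: RoutingDatabase, destination: str) -> Optional[str]:
    """Return the lowest-cost adjacent router for a destination, or None."""
    candidates = rib.get(destination)
    if not candidates:
        return None  # no known route to this destination
    next_hop, _cost = min(candidates, key=lambda entry: entry[1])
    return next_hop


# Illustrative data only.
rib: RoutingDatabase = {
    "10.0.1.0/24": [("router_b", 20), ("router_c", 5)],
    "10.0.2.0/24": [("router_b", 10)],
}

print(select_next_hop(rib, "10.0.1.0/24"))
```

Selecting by minimum cost is only one possible policy; a real router may weigh other attributes when choosing among adjacent routers.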
As routers join or leave the communication network, the data in the routing database at affected routers is updated. Various protocols are available for maintaining and updating the information in the routing database. For example, known protocols for maintaining and updating routing databases include the Open Shortest Path First (OSPF) protocol, the Interior Gateway Routing Protocol (IGRP), and the Enhanced Interior Gateway Routing Protocol (EIGRP), among others. The process of joining or leaving a network involves a large number of communications among the routers to determine which routers are used to forward data packets headed for different destinations on the edge of the network, to determine the cost of using each such router for each destination, and to update the routing database at each affected router.
Control messages sent among the routers according to one or more of the routing protocols are processed in a control plane processor in the router, and switching of data packets between two interfaces on an individual router is performed in a data plane processor. Changes to the routing database are determined in the control plane and stored in the routing database, which is used to configure the data plane.
A failure can occur in the control plane even when there is no failure in the data plane. The failure in the control plane can be caused by a variety of circumstances. For example, a failure in the control plane might be caused by receiving a protocol message that causes the control plane to shut down or erase some or all of the routing database, by a hardware failure in the control plane processor, by a failure in memory storing the routing database, or by a software failure in the instructions executed in the control plane processor, among other causes. There is a need in such circumstances for the data plane to continue forwarding data packets while the control plane is restarted, repaired, or replaced.
In one approach, a second, standby control plane processor is included in each router, so that if an equipment failure occurs in one control plane, the standby control plane can assume control plane duties automatically. According to an aspect of this approach, which is termed a “stateful switchover” (SSO) or “non-stop forwarding” (NSF) approach, during the switch of control planes at a router, the data plane of the router continues to forward packets according to the old routing database. During this time, the router is incapable of responding to changes in the network topology, such as changes caused by the addition or removal of a node in the network. After a control plane is restored for the router, control messages are used to update the routing database at the router and to apply to the data plane any changes reflected in the restored routing database.
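The behavior described above can be sketched, under simplifying assumptions, as a router whose data plane forwards from its own copy of the routes while the control plane is unavailable. The class `Router` and its method names are hypothetical, and the model assumes that route updates arriving during the outage are simply not processed; it is not a description of any vendor's implementation.

```python
from typing import Dict, Optional


class Router:
    """Toy model: the data plane keeps forwarding while the control plane is down."""

    def __init__(self, routes: Dict[str, str]) -> None:
        self.routing_database = dict(routes)  # maintained by the control plane
        self.forwarding_table = dict(routes)  # copy that configures the data plane
        self.control_plane_up = True

    def forward(self, destination: str) -> Optional[str]:
        # The data plane consults only its own table, so it works during an outage.
        return self.forwarding_table.get(destination)

    def handle_route_update(self, destination: str, next_hop: str) -> bool:
        # Assumption: updates cannot be processed while the control plane is down.
        if not self.control_plane_up:
            return False
        self.routing_database[destination] = next_hop
        self.forwarding_table[destination] = next_hop
        return True

    def fail_control_plane(self) -> None:
        self.control_plane_up = False

    def restore_control_plane(self) -> None:
        self.control_plane_up = True


router = Router({"net_a": "router_b"})
router.fail_control_plane()
print(router.forward("net_a"))                          # still uses the old route
print(router.handle_route_update("net_a", "router_c"))  # topology change is missed
```

In this sketch the topology change is lost until the control plane is restored and the update is received again, which mirrors the window during which the router cannot respond to changes in the network.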
Whether a second control plane processor replaces a failed control plane processor, or whether a control plane processor that temporarily stopped functioning begins to function again, the routing database available to the control plane must then be restored. It is desirable to restore the routing database on the router without consuming resources at routers all over the network to logically rediscover the routes and costs that go through the restored router, as would occur if the router were logically removed from the network and then logically added back.
In one approach, as used by the OSPF protocol, each router maintains a copy of a complete routing database in the control plane. When a router switches or restarts a control plane processor, one or more of the neighbors to that router in the network sends the complete routing database to the control plane. In this manner, the routers on the network avoid consuming resources to logically rediscover the routes going through that router.
While useful for many purposes, the approach of storing a routing database for the whole network at every node and sending the whole routing database to the router with the new or restarting control plane suffers some disadvantages. One disadvantage is that each router consumes considerable resources to store and update excess routing database information for routers that make no difference to the data packet forwarding that occurs in its own data plane processor. Another disadvantage is that network bandwidth is consumed to send excess information to the router that switches or restarts a control plane processor.
In one approach, the amount of excess information communicated over the bandwidth available to the nodes is reduced by updating the routing databases intermittently, rather than after each change to any piece of the database. This approach is called “checkpointing.” While checkpointing can reduce the amount of bandwidth consumed, it has some disadvantages. One disadvantage is that if a control plane restarts in the time window after a change is made to a database and before the change is communicated at a scheduled checkpointing event, then the restarting control plane may receive incorrect information that may affect the routes it is using.
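The timing window described above can be illustrated with a small sketch. It assumes a live database that changes immediately and a checkpointed copy that is refreshed only at scheduled events; a restart restores from the checkpointed copy. The class `CheckpointedDatabase` and its method names are hypothetical and chosen only for illustration.

```python
from typing import Dict


class CheckpointedDatabase:
    """Toy model of a routing database that is copied only at checkpoints."""

    def __init__(self) -> None:
        self.live: Dict[str, str] = {}        # current routes
        self.checkpoint: Dict[str, str] = {}  # last communicated copy

    def apply_change(self, destination: str, next_hop: str) -> None:
        # Changes take effect in the live database immediately.
        self.live[destination] = next_hop

    def run_checkpoint(self) -> None:
        # Scheduled event: communicate the whole current state.
        self.checkpoint = dict(self.live)

    def restart(self) -> Dict[str, str]:
        # A restarting control plane receives only the checkpointed copy.
        self.live = dict(self.checkpoint)
        return self.live


db = CheckpointedDatabase()
db.apply_change("net_a", "router_b")
db.run_checkpoint()
db.apply_change("net_a", "router_c")  # change made after the last checkpoint
restored = db.restart()               # restart occurs before the next checkpoint
print(restored["net_a"])              # the stale route, not the latest change
```

The change made between the checkpoint and the restart is absent from the restored database, which is the source of the incorrect information noted above.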
According to EIGRP, each router stores a different routing database that includes only routing information used by the data plane on that router. This protocol does not force routers to consume resources for excess information. However, early versions of EIGRP that provide NSF for a router that temporarily loses its control plane processor do not provide techniques to avoid having the control plane instigate a process that consumes resources at nodes across the network to rediscover the routes through the restarting router.
Based on the foregoing, there is a clear need for a version of EIGRP that restores the routing database for a router with a new or restarted control plane without consuming excess resources.
More generally, there is a need for synchronizing portions of a database relevant for a particular node in a network with different databases on different nodes that does not suffer the disadvantages of the approaches described above.