Many service providers have started to provide real-time network services, such as Voice over Internet Protocol (VoIP) and Internet Protocol (IP) television, to their customers through the providers' communications systems. Due to the real-time nature of these services, it is critical for the service providers to deliver them continuously and at a satisfactory level. Hence, the service providers need to keep their network services always available to their customers.
A router is a major building block in a communications system that provides real-time network services to customers. A number of routing protocols, such as Open Shortest Path First (OSPF) and Border Gateway Protocol (BGP), run in a router. These protocols maintain the latest network topology and calculate the optimal routes to each destination in the communications system. A Routing Table Manager (RTM) in the router maintains all the routes calculated by each routing protocol in a routing table and, for each destination, selects the routes with higher preference. All of the selected routes are stored by a Forwarding Table Manager (FTM) in a Forwarding Information Base (FIB), which is used for forwarding packets.
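The route selection described above can be illustrated with a minimal Python sketch. It is not taken from any particular router implementation: the names `Route`, `RoutingTableManager`, and `build_fib`, the sample preference values, and the convention that a numerically lower `preference` value means a more preferred route are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # destination prefix, e.g. "10.0.0.0/8"
    next_hop: str    # address of the next router on the path
    protocol: str    # protocol that calculated the route, e.g. "OSPF"
    preference: int  # assumption: a lower value means a more preferred route


class RoutingTableManager:
    """Hypothetical RTM: keeps every route from every protocol."""

    def __init__(self):
        # prefix -> {protocol name -> Route}
        self.routes = {}

    def add_route(self, route):
        # Each protocol contributes at most one route per prefix in this sketch.
        self.routes.setdefault(route.prefix, {})[route.protocol] = route

    def select_best(self):
        # For each destination, pick the most preferred candidate route.
        return {
            prefix: min(candidates.values(), key=lambda r: r.preference)
            for prefix, candidates in self.routes.items()
        }


def build_fib(rtm):
    """Hypothetical FTM step: store only the selected routes' next hops."""
    return {prefix: route.next_hop for prefix, route in rtm.select_best().items()}
```

For example, if both OSPF and BGP report a route to the same prefix, only the more preferred one reaches the FIB:

```python
rtm = RoutingTableManager()
rtm.add_route(Route("10.0.0.0/8", "192.0.2.1", "OSPF", 10))
rtm.add_route(Route("10.0.0.0/8", "192.0.2.2", "BGP", 170))
fib = build_fib(rtm)  # {"10.0.0.0/8": "192.0.2.1"}
```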
Generally, each router in a communications system executes routing protocols to exchange information about the communications system with its adjacent routers, and thereby maintains a view of the topology that is consistent with that of every other router in the system. As a result, each router's FIB is consistent with those of the other routers, and packets are properly forwarded toward their destinations. If there is any inconsistency in the FIB or in the view of the topology of the communications system, one or more routing loops may occur and real-time network services may be interrupted.
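To make the effect of an inconsistent FIB concrete, the following Python sketch traces a packet hop by hop through a set of per-router FIBs. The function name `trace_packet`, the router names, and the dictionary layout (router name mapped to a table of destination to next hop) are hypothetical and chosen only for illustration.

```python
def trace_packet(fibs, source, destination):
    """Follow next hops from source; return (path, status).

    fibs maps a router name to its FIB, which maps a destination
    to the next-hop router. status is "delivered", "dropped", or "loop".
    """
    path = [source]
    visited = {source}
    current = source
    while current != destination:
        next_hop = fibs.get(current, {}).get(destination)
        if next_hop is None:
            # No forwarding entry for this destination.
            return path, "dropped"
        path.append(next_hop)
        if next_hop in visited:
            # The packet has returned to a router it already passed through.
            return path, "loop"
        visited.add(next_hop)
        current = next_hop
    return path, "delivered"
```

With consistent FIBs the packet reaches its destination, whereas a single stale entry on router B that points back to router A produces a loop:

```python
consistent = {"A": {"C": "B"}, "B": {"C": "C"}}
stale = {"A": {"C": "B"}, "B": {"C": "A"}}
trace_packet(consistent, "A", "C")  # (["A", "B", "C"], "delivered")
trace_packet(stale, "A", "C")       # (["A", "B", "A"], "loop")
```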
A common high-availability router includes an Active Main Board (AMB) and a Standby Main Board (SMB). An RTM and routing protocols, such as OSPF and BGP, run on both the AMB and the SMB. In some routers, different software components, such as the RTM and BGP, may run as separate processes. A critical problem that may arise in the communications system is that a router may take a long time to recover from an internal failure through techniques such as failed component switchover, because synchronizing a large amount of data between processes (for example, millions of BGP routes in the routing table between the RTM and BGP) may take a long time. During the recovery, the router's FIB or its view of the topology of the communications system may be inconsistent with those of other routers in the network. The inconsistency may cause routing loops, as well as degradation or interruption of real-time network services.
Therefore, there is a need for a system and method for quick recovery from a failure in a router, such that, after performing a failed component switchover, the router with the failure instantly (or very shortly) has a FIB and a view of the communications system that are consistent with those of other routers in the network. Real-time network services provided by the communications system are then not significantly affected by the failure within the router.