1. Field of Invention
The present invention relates generally to network systems. More particularly, the present invention relates to enabling routers which do not support graceful restart to have their associated states recovered substantially transparently on a router which supports graceful restart and is being restarted.
2. Description of the Related Art
The demand for data communication services is growing at an explosive rate. Much of the increased demand is due to the fact that more residential and business computer users are becoming connected to the Internet. Furthermore, the types of traffic being carried by the Internet are shifting from lower bandwidth applications towards high bandwidth applications which include voice traffic and video traffic.
As the demand for data communication services grows, the use of high availability networks is increasing. To this end, many networks are being built such that the components of the networks, e.g., routers, may continue to provide service even when the components must be reset or restarted. A component may generally be restarted when it has suffered a failure, e.g., a control module failure, or when a software upgrade is in process. In some cases, a graceful restart may be used to restart a component while enabling the component to substantially continue to function.
Networks generally include a plurality of components or peers which are in communication. FIG. 1 is a diagrammatic representation of a network. A network 100 includes peers 104, e.g., routers or hosts, which are in communication across connections 108 using a Border Gateway Protocol (BGP) over a Transmission Control Protocol (TCP). Typically, sessions may be established using connections 108 such that one peer 104 may exchange routing information packets with another peer 104, e.g., peer 104a may establish a session with peer 104b. When peer 104a suffers a failure, connections 108a, 108c, 108d effectively go down, and peers 104b, 104d, 104e may no longer completely trust information that was received from peer 104a. 
When all peers 104 support a BGP Graceful Restart, a graceful restart may be accomplished to enable traffic to continue to be routed through peer 104a even when peer 104a has suffered a BGP failure and is in the process of restarting. As will be appreciated by those skilled in the art, a graceful restart enables data-forwarding to continue such that packets may be processed and forwarded through peer 104a when BGP on peer 104a is being restarted, i.e., even when a portion of peer 104a which is responsible for identifying best paths has failed. By way of example, when peer 104a fails, graceful restart enables peers 104b, 104d, 104e to wait for peer 104a to come back online, since peer 104a, although it has gone down, is expected to be back online after a certain amount of time. Peer 104a effectively requests that peers 104b, 104d, 104e not remove any information received from peer 104a.
During a graceful restart, a restarting peer, e.g., peer 104a, may set a restart bit to indicate that it has restarted, and may set a forwarding state bit to indicate that it has preserved or otherwise maintained its forwarding state. The preservation of the forwarding state allows peer 104a to restart while peers 104b, 104d, 104e maintain their routes through peer 104a. In other words, a graceful restart is a substantially transparent process that allows peers 104b, 104d, 104e to effectively hide the restart of peer 104a from the rest of network 100 in terms of packet forwarding only.
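By way of illustration only, the restart bit and the forwarding state bit described above may be sketched as fields of a graceful restart capability. The sketch below assumes the field layout standardized for BGP in RFC 4724, which the present description does not itself cite: a 4-bit restart flags field whose most significant bit is the restart bit, a 12-bit restart time in seconds, and one entry per address family whose flags octet carries the forwarding state bit in its most significant position. The function names are hypothetical.

```python
import struct

# Assumed layout (RFC 4724 style); all names here are illustrative only.
RESTART_BIT = 0x8            # most significant of the 4 restart flag bits
FORWARDING_STATE_BIT = 0x80  # most significant bit of the per-family flags


def encode_gr_capability(restarted, restart_time, families):
    """Build the capability value.

    families is a list of (afi, safi, forwarding_preserved) tuples.
    """
    if not 0 <= restart_time < 4096:
        raise ValueError("restart time must fit in 12 bits")
    flags = RESTART_BIT if restarted else 0
    out = struct.pack("!H", (flags << 12) | restart_time)
    for afi, safi, preserved in families:
        family_flags = FORWARDING_STATE_BIT if preserved else 0
        out += struct.pack("!HBB", afi, safi, family_flags)
    return out


def decode_gr_capability(data):
    """Return (restarted, restart_time, families) from a capability value."""
    (header,) = struct.unpack_from("!H", data, 0)
    restarted = bool((header >> 12) & RESTART_BIT)
    restart_time = header & 0x0FFF
    families = []
    for offset in range(2, len(data), 4):
        afi, safi, family_flags = struct.unpack_from("!HBB", data, offset)
        families.append((afi, safi, bool(family_flags & FORWARDING_STATE_BIT)))
    return restarted, restart_time, families
```

For example, a restarting peer which has preserved its IPv4 unicast forwarding state (AFI 1, SAFI 1) and which requests a 120 second wait would call `encode_gr_capability(True, 120, [(1, 1, True)])`.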
As previously mentioned, peers 104 may be routers. With reference to FIG. 2, the configuration of routers will be described. A first router 202 may be in communication with a second router 206 over an interface 210, e.g., a connection. Router 202 has an active route processor 214, and a standby route processor 218. Active route processor 214, or an active route switch processor, controls and runs routing protocols. Standby route processor 218, or a standby route switch processor, is arranged to take over the functions of active route processor 214 when active route processor 214 experiences downtime.
Both active route processor 214 and standby route processor 218 include a BGP speaker 222 and a TCP speaker 226, as will be appreciated by those skilled in the art. Active route processor 214 and standby route processor 218 are substantially connected to linecards 232 through a bus 230. Linecards 232 are arranged to allow interfaces such as interface 210 to enable communication between router 202 and other routers, as for example router 206, which may include the same internal components as router 202, as shown.
As will be understood by those skilled in the art, while the above description of a typical router 202 mentions a separate active route processor and a standby route processor, router 202 may instead include an active stack of BGP and TCP, and a standby stack of TCP and BGP on the same physical route processor.
If router 202 needs to be restarted and both routers 202, 206 have the capability to support a graceful restart, then a graceful restart may occur such that packet forwarding between router 202 and router 206 may essentially remain unaffected while the graceful restart occurs. During a graceful restart of router 202, router 202 will inform router 206 to wait for a certain period of time before removing the routes it has associated with router 202, and to allow packet forwarding to continue during that period. If router 202 comes back online within the certain period of time and if a route associated with router 206 is received again, then forwarding for the associated prefix is effectively unaffected.
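The waiting behavior described above may be sketched from the point of view of the peer which is not restarting, e.g., router 206. The following is a minimal sketch under stated assumptions: the class name, the method names, and the explicit `now` argument are hypothetical, and the marking of retained routes as stale is a simplification used here for illustration rather than a description of any particular implementation.

```python
class HelperRouteTable:
    """Routes learned from one peer, retained while that peer restarts."""

    def __init__(self):
        self.routes = {}      # prefix -> {"next_hop": ..., "stale": bool}
        self.deadline = None  # time at which stale routes are removed

    def learn(self, prefix, next_hop):
        # A newly received or re-received route is never stale.
        self.routes[prefix] = {"next_hop": next_hop, "stale": False}

    def peer_restarting(self, now, restart_time):
        # Keep forwarding over existing routes, but mark them stale
        # and start the wait period requested by the restarting peer.
        for route in self.routes.values():
            route["stale"] = True
        self.deadline = now + restart_time

    def expire(self, now):
        # After the wait period, remove routes that were not received again.
        if self.deadline is not None and now >= self.deadline:
            self.routes = {
                prefix: route
                for prefix, route in self.routes.items()
                if not route["stale"]
            }
            self.deadline = None

    def prefixes(self):
        return sorted(self.routes)
```

In this sketch, a prefix which is received again before the deadline remains usable throughout, while a prefix which is not received again is removed only once the wait period has elapsed.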
In order for both router 202 and router 206 to support a graceful restart, both router 202 and router 206 must support the protocol extensions required by a graceful restart. That is, router 202 and router 206 must both be upgraded to have the software that supports a graceful restart. When router 202 and router 206 are both owned by a common service provider, then ensuring that the relevant software is of the same version on both routers 202, 206 may be relatively easy. Even in such a case, the upgrades may occur at different times. However, when router 202 is owned by a service provider and router 206 is owned by a customer, for instance, it may be difficult to ensure that both routers 202, 206 have the relevant version of the software. In some situations, the relevant software in router 206 may not be upgradeable in the same time frame as the relevant software in router 202. Further, in other situations, it may not be possible to upgrade the relevant software in router 206. There may also be situations where a service provider does not wish to provide any information about internal failures at all.
Some networks use a full stateful switchover solution in which all TCP and BGP states are substantially synchronized on an active route processor and a standby route processor of a router at all times. Stateful switchover generally allows for a standby route processor to take over from a failed active route processor while maintaining connections which were established by the active route processor, and is one example of a failover method. A failover is generally an operational mode in which the functions of a component are assumed by standby subcomponents when active subcomponents become unavailable. Typically, maintaining the connections established by an active route processor is achieved at least in part by checkpointing data needed to maintain connections and functionality from the active route processor to the standby route processor. Although the use of a stateful switchover solution with all peers or routers associated with a router which has suffered a failure may be useful when graceful restart is not supported by all peers associated with the failed router, such a solution is not very scalable, and the cost and performance characteristics associated with the solution are often unacceptable.
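The checkpointing described above may be illustrated with a minimal sketch in which every change to a session on the active route processor is copied to the standby route processor. The class names, the session fields, and the checkpoint counter are hypothetical and are used only to show why the amount of checkpointing grows with the number of peers and the rate of state changes.

```python
class RouteProcessor:
    """Holds per-peer session state (a stand-in for TCP and BGP state)."""

    def __init__(self):
        self.sessions = {}  # peer -> {"tcp_seq": int, "bgp_state": str}


class Router:
    def __init__(self):
        self.active = RouteProcessor()
        self.standby = RouteProcessor()
        self.checkpoints = 0  # number of state copies sent to the standby

    def update_session(self, peer, tcp_seq, bgp_state):
        state = {"tcp_seq": tcp_seq, "bgp_state": bgp_state}
        self.active.sessions[peer] = state
        # Full stateful switchover: every change is checkpointed.
        self.standby.sessions[peer] = dict(state)
        self.checkpoints += 1

    def failover(self):
        # The standby assumes the active role with the sessions intact,
        # and a fresh standby takes its place.
        self.active, self.standby = self.standby, RouteProcessor()
        return self.active.sessions
```

In this sketch, one checkpoint is made for every state change of every peer, regardless of whether a given peer supports graceful restart.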
Therefore, what is needed is a method and an apparatus which allows a failed router to efficiently maintain its functionality during a restart within a network when not all peers associated with the failed router support graceful restart. That is, what is desired is a system which allows for a failover to occur with a relatively high level of performance within a network that includes peers which do not necessarily support graceful restart.