The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Border Gateway Protocol (BGP) is a protocol for exchanging routing information between gateway hosts (each with its own router) in a network of autonomous systems. Routers employing BGP interact with peers by establishing TCP sessions. A router may be peered with a router in another domain using External Border Gateway Protocol (EBGP) or with a router within the same domain using Internal Border Gateway Protocol (IBGP). In either case, current implementations of BGP (including implementations that run on a network operating system, or IOS) enable a TCP property called RETRANSMIT_FOREVER, which prevents TCP from tearing down the session even when data remains in the TCP retransmit queue and retransmissions are failing.
One problem with the use of RETRANSMIT_FOREVER is that sessions whose retransmission queue has become empty, referred to as “idle” sessions, are not torn down. These idle sessions continue to exist and consume the resources needed to track and maintain them.
One approach to addressing this issue is to provide an application-level “keepalive” mechanism to detect session-related problems that require the session to be terminated. This “keepalive” mechanism terminates a session when a specified number of successive keepalive messages are lost. In other words, if no keepalive message is received within a specified period of time, called the “holdtime,” the session is terminated. The keepalive time and the holdtime are both configurable. The defaults are 60 seconds for the keepalive time and 180 seconds for the holdtime.
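By way of illustration only, the holdtime check described above might be sketched as follows. The class name HoldTimer and its methods are hypothetical and are not taken from any particular BGP implementation; the sketch assumes that timestamps are supplied by the caller in seconds and uses the default values of 60 seconds and 180 seconds stated above.

```python
from dataclasses import dataclass


@dataclass
class HoldTimer:
    """Hypothetical per-session keepalive bookkeeping (illustrative only)."""

    keepalive_time: float = 60.0   # seconds between keepalives (default above)
    holdtime: float = 180.0        # seconds of silence before teardown (default above)
    last_received: float = 0.0     # timestamp of the last keepalive from the peer

    def on_keepalive(self, now: float) -> None:
        # A keepalive arrived from the peer: restart the hold period.
        self.last_received = now

    def missed_keepalives(self, now: float) -> int:
        # Number of whole keepalive intervals that have passed in silence.
        return int((now - self.last_received) // self.keepalive_time)

    def expired(self, now: float) -> bool:
        # The session is to be terminated once the holdtime elapses
        # without any keepalive having been received.
        return now - self.last_received >= self.holdtime


timer = HoldTimer()
timer.on_keepalive(now=10.0)
still_up = not timer.expired(now=100.0)   # 90 s of silence, below the holdtime
torn_down = timer.expired(now=190.0)      # 180 s of silence, holdtime reached
```

With the default values, the holdtime corresponds to three successive keepalive intervals, so in this sketch a peer failure may go undetected for up to 180 seconds.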
Unfortunately, this approach has disadvantages. In order to quickly detect peer BGP application failures, many customers set the holdtime and the keepalive time to values on the order of a few seconds. In today's high-speed networks, however, both the defaults and retuned values on the order of seconds are very long times. Thus, even when these values are retuned to the order of seconds, the idle sessions continue to place a large burden on BGP implementations in terms of processing power and in terms of the number of BGP sessions that a router can support.
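The trade-off can be illustrated with a rough calculation. The sketch below assumes, for illustration only, that each session sends one keepalive message per keepalive interval and that keepalive traffic is the only cost considered; the function name and the session count of 1,000 are hypothetical.

```python
def keepalives_per_second(sessions: int, keepalive_time: float) -> float:
    """Aggregate keepalive send rate, assuming one message per session per interval."""
    if keepalive_time <= 0:
        raise ValueError("keepalive_time must be positive")
    return sessions / keepalive_time


# Hypothetical router with 1,000 BGP sessions.
default_rate = keepalives_per_second(1000, 60.0)  # default 60-second keepalive time
retuned_rate = keepalives_per_second(1000, 1.0)   # keepalive time retuned to 1 second
increase = retuned_rate / default_rate            # how much the load grows
```

Under these assumptions, retuning the keepalive time from 60 seconds to 1 second multiplies the keepalive message rate by a factor of 60, while the detection time still remains on the order of seconds.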
Based on the foregoing, there is a clear need for a mechanism that will enable detection of session failures with improved speed relative to conventional techniques. Further, it is desirable that the failure detection mechanism will not adversely affect BGP scalability.