The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Border Gateway Protocol (BGP) is a path vector routing protocol for inter-Autonomous System routing. The function of a BGP-enabled network element (a BGP host or peer) is to exchange network reachability information with other BGP-enabled network elements. The most commonly implemented version of BGP is BGP-4, which is defined in RFC1771 (published by the Internet Engineering Task Force (IETF) in March 1995).
To exchange routing information, two BGP hosts first establish a peering session by exchanging BGP OPEN messages. The BGP hosts then exchange their full routing tables. After this initial exchange, each BGP host sends to its BGP peer or peers only incremental updates for new, modified, and unavailable or withdrawn routes in one or more BGP UPDATE messages. A route is defined as a unit of information that pairs a network destination with the attributes of a network path to that destination. The attributes of the network path include, among other things, the network addresses (also referred to as address prefixes or just prefixes) of the computer systems along the path. In a BGP host, the routes are stored in a Routing Information Base (RIB). Depending on the particular software implementation of BGP, a RIB may be represented by one or more routing tables. When more than one routing table represents a RIB, the routing tables may be logical subsets of information stored in the same physical storage space, or the routing tables may be stored in physically separate storage spaces.
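As an illustration only, the incremental update behavior described above can be sketched as follows. The names `Route`, `Rib`, and `apply_update` are hypothetical and do not come from RFC1771 or from any particular BGP implementation. The sketch also assumes one route per prefix, whereas a real RIB may hold several candidate paths for the same destination.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    """Pairs a network destination with attributes of a path to it (hypothetical type)."""
    prefix: str        # network destination, e.g. "192.0.2.0/24"
    as_path: tuple     # path attribute: Autonomous Systems traversed
    next_hop: str      # path attribute: next-hop address


@dataclass
class Rib:
    """Simplified Routing Information Base: a single table keyed by prefix."""
    routes: dict = field(default_factory=dict)

    def apply_update(self, announced=(), withdrawn=()):
        """Apply one incremental update: drop withdrawn prefixes, then
        add or overwrite the announced routes."""
        for prefix in withdrawn:
            self.routes.pop(prefix, None)   # ignore prefixes we never held
        for route in announced:
            self.routes[route.prefix] = route
```

After the initial full-table exchange, each later call to `apply_update` would carry only the new, modified, or withdrawn routes, so the table is never rebuilt from scratch during normal operation.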
As defined in RFC1771, the structure of a BGP UPDATE message accommodates updates only to Internet Protocol version 4 (IPv4) unicast routes. The Multiprotocol Extension for BGP defined in RFC2858 (published by the IETF in June 2000) accommodates updates to routing information for multiple Network Layer protocols, such as, for example, Internet Protocol version 6 (IPv6), Internetwork Packet eXchange (IPX), AppleTalk, Banyan VINES, Asynchronous Transfer Mode (ATM), X.25, and Frame Relay. RFC2858 introduced two single-value parameters to accommodate the changes to the BGP UPDATE message structure: the Address Family Identifier (AFI) and the Subsequent Address Family Identifier (SAFI).
The AFI parameter carries the identity of the network layer protocol associated with the network address that follows next in the path to the destination. The SAFI parameter provides additional information about the type of the Network Layer Reachability Information that is included in a BGP UPDATE message, and the values defined for this parameter usually indicate a type of communication forwarding mechanism, such as, for example, unicast or multicast. While some of the AFI and SAFI values are reserved for private use, the AFI and SAFI values that can be commonly used by the public must be assigned through the Internet Assigned Numbers Authority (IANA). The AFI/SAFI combination is used by the software implementations of BGP to indicate the type of the BGP prefix updates, what format the prefix updates have, and how to interpret the routes included in the BGP UPDATE messages.
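By way of a hedged example, the AFI/SAFI pair can be modeled as a two-octet AFI followed by a one-octet SAFI, which is the order in which the multiprotocol attributes of RFC2858 carry them. The values shown (AFI 1 for IPv4, AFI 2 for IPv6, SAFI 1 for unicast, SAFI 2 for multicast) are IANA-assigned. The class and function names are hypothetical, and only a small subset of the assigned values is listed.

```python
import struct
from enum import IntEnum


class Afi(IntEnum):
    """Subset of IANA-assigned Address Family Identifiers."""
    IPV4 = 1
    IPV6 = 2


class Safi(IntEnum):
    """Subset of IANA-assigned Subsequent Address Family Identifiers."""
    UNICAST = 1
    MULTICAST = 2


def encode_family(afi, safi):
    """Pack the pair in network byte order: 2-octet AFI, then 1-octet SAFI."""
    return struct.pack("!HB", afi, safi)


def decode_family(data):
    """Read the AFI/SAFI pair from the first three octets of `data`.
    Raises ValueError for values outside the subset listed above."""
    afi, safi = struct.unpack_from("!HB", data)
    return Afi(afi), Safi(safi)
```

A receiver could use the decoded pair as a key for selecting the routine that parses the prefixes that follow, which is one way to realize the interpretation role that the AFI/SAFI combination plays.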
As networks grow more complex and the number of BGP routes maintained by a particular network element increases, the consequences of the failure of a BGP host device, or the BGP process that it hosts, become more severe. For example, in some scenarios a BGP failure may require retransmission of a large amount of route information and re-computation of a large amount of network reachability information. Therefore, vendors of network gear and their customers wish to deploy BGP in a fault-tolerant manner.
One term sometimes applied to fault-tolerant information transfer techniques is “stateful switchover” or SSO. SSO is typically implemented with network elements that have dual route processors, each of which can host separate but duplicate instances of various software applications. One route processor is deemed Active and the other is deemed Standby. Implementing SSO for BGP hosts, processes or “speakers” typically requires periodically transferring duplicate copies (“checkpointing”) of large amounts of data among pairs of hosts each respectively acting as an Active BGP speaker and a Standby BGP speaker. Further, when a failure occurs, the Standby BGP speaker almost always restarts operation asynchronously in relation to the Active BGP speaker. Consequently, all data accumulated by the Active BGP speaker must be transferred to the Standby BGP speaker before the Standby BGP speaker can start processing BGP UPDATE messages or perform other substantive functions.
However, such a bulk data transfer approach is inefficient, may not be sustainable as the volume of data grows, and is not extensible. For example, in the bulk data transfer approach, the data structures that are transferred must be converted to messages for purposes of inter-process communications. Therefore, all data structures have to be flattened; that is, pointers present in the data structures cannot be sent in the form of pointers. Further, as data structures change between versions of software, new messages and converter functions are necessary to provide SSO support between the different versions. Also, large amounts of code need to be written and maintained for providing the checkpointing support during asymmetric startup.
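The flattening burden described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that several routes share one path-attribute object by reference and that checkpoint messages are encoded as JSON. The functions `flatten` and `unflatten` are hypothetical and do not represent any vendor's checkpointing code.

```python
import json


def flatten(routes):
    """Convert {prefix: attrs} into a message with no object references.

    Routes that share one attrs object in memory are rewritten to share
    an integer index into a table of attribute sets, since a pointer has
    no meaning in the receiving process.
    """
    attr_index = {}   # id() of an attrs object -> its position in attr_table
    attr_table = []
    entries = []
    for prefix, attrs in routes.items():
        key = id(attrs)
        if key not in attr_index:
            attr_index[key] = len(attr_table)
            attr_table.append(attrs)
        entries.append([prefix, attr_index[key]])
    return json.dumps({"attrs": attr_table, "routes": entries})


def unflatten(message):
    """Rebuild {prefix: attrs} on the receiving side, restoring the sharing."""
    data = json.loads(message)
    table = data["attrs"]
    return {prefix: table[index] for prefix, index in data["routes"]}
```

Even in this small form, the pair of converter functions is tied to the exact layout of the data structure. Any change to that layout between software versions would require a new message format and new converters, which is the maintenance cost noted above.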
All of the above drawbacks have seriously limited the success of any BGP SSO design and implementation. Thus, there is a clear need for an improved technique for transferring state information among BGP speakers that implement SSO.