The present invention relates generally to network communications, and more particularly, to the architecture and methods for Transmission Control Protocol (TCP) High Availability (HA).
TCP is a lower-level, connection-oriented protocol. In a router, higher-level routing protocols such as Border Gateway Protocol (BGP) use it to set up connections with peer routers and to exchange route information with them. In a router with an Active Main Board (AMB) and a Standby Main Board (SMB), TCP and other protocols such as BGP and Label Distribution Protocol (LDP) run on both AMB and SMB. TCP HA supports the high availability of those protocols that use TCP. When AMB fails, SMB smoothly takes over the role of the new AMB, provided that TCP and the other protocols have the high availability capability.
The existing architecture and methods for Transmission Control Protocol (TCP) High Availability (HA) use a message flow-through or mirror-based mechanism. A high availability system using the message flow-through mechanism delivers incoming TCP data to SMB first and then transfers the data to AMB through an internal Inter-Process Communication (IPC) channel. Outgoing TCP data originating from an application in AMB is transferred to SMB first through the internal IPC channel and then delivered to its destination through line cards.
During the normal operation of the conventional TCP HA, an incoming TCP data stream containing application or routing protocol messages from a peer router is transmitted through both AMB and SMB. It flows through SMB first and then flows through AMB. Thus SMB can read the incoming data stream, decode the messages in the stream and obtain the state changes before the stream reaches AMB. Similarly, an outgoing TCP data stream containing application or routing protocol messages is also transmitted through both AMB and SMB. It flows through AMB first and then flows through SMB to a peer router. SMB can read the outgoing data stream, decode the messages in the stream and infer the state changes before the stream reaches the peer router. In this way, AMB and SMB are synchronized. FIG. 1 shows the architecture of flow-through based TCP HA.
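The flow-through paths described above can be illustrated with a minimal Python sketch. It is an illustration only, not an implementation from any router: the names `Board`, `relay`, `deliver_incoming` and `deliver_outgoing` are hypothetical, and the IPC transfer and line cards are reduced to in-process function calls.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    """Hypothetical model of a main board that TCP data flows through."""
    name: str
    alive: bool = True
    seen: list = field(default_factory=list)

    def relay(self, data: bytes) -> bytes:
        # A failed board breaks every stream that is routed through it.
        if not self.alive:
            raise ConnectionError(f"{self.name} is down; stream broken")
        # A real board would decode protocol messages here to track state.
        self.seen.append(data)
        return data


def deliver_incoming(data: bytes, smb: Board, amb: Board) -> list:
    """Incoming path: line card -> SMB -> (IPC) -> AMB."""
    path = []
    for board in (smb, amb):
        board.relay(data)
        path.append(board.name)
    return path


def deliver_outgoing(data: bytes, amb: Board, smb: Board) -> list:
    """Outgoing path: AMB -> (IPC) -> SMB -> line card -> peer router."""
    path = []
    for board in (amb, smb):
        board.relay(data)
        path.append(board.name)
    return path
```

In this sketch both boards end up with the same record of observed data, which models the synchronization of AMB and SMB. Setting `alive = False` on the SMB object makes `deliver_incoming` raise before the data reaches AMB, which models the dependency of AMB on SMB in a flow-through design.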
In addition, both AMB and SMB maintain replicated sets of output buffers for the outgoing TCP data stream. These buffers store the fragments of data that are transmitted to the peer router. If an acknowledgement for a transmitted fragment is received from the destination peer router, the fragment is deleted from the buffers in both AMB and SMB. If no acknowledgement for a transmitted fragment is received within a predetermined time period, the fragment is retransmitted.
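The buffer handling described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class names `OutputBuffer` and `ReplicatedBuffers`, the use of a sequence number as the fragment key, and the explicit `now` timestamp argument are all hypothetical and are not taken from any existing implementation.

```python
class OutputBuffer:
    """Hypothetical per-board buffer of unacknowledged outgoing fragments."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.pending = {}  # sequence number -> (fragment, time last sent)

    def record_send(self, seq: int, fragment: bytes, now: float) -> None:
        self.pending[seq] = (fragment, now)

    def acknowledge(self, seq: int) -> None:
        # An acknowledged fragment no longer needs to be kept.
        self.pending.pop(seq, None)

    def due_for_retransmit(self, now: float) -> list:
        # Fragments whose acknowledgement did not arrive within the timeout.
        return sorted(
            seq for seq, (_, sent) in self.pending.items()
            if now - sent >= self.timeout
        )


class ReplicatedBuffers:
    """Keeps the AMB and SMB output buffers in step (assumed model)."""

    def __init__(self, timeout: float):
        self.amb = OutputBuffer(timeout)
        self.smb = OutputBuffer(timeout)

    def send(self, seq: int, fragment: bytes, now: float) -> None:
        for buf in (self.amb, self.smb):
            buf.record_send(seq, fragment, now)

    def on_ack(self, seq: int) -> None:
        # The fragment is deleted from the buffers on both boards.
        for buf in (self.amb, self.smb):
            buf.acknowledge(seq)

    def retransmit_due(self, now: float) -> list:
        due = self.amb.due_for_retransmit(now)
        for seq in due:
            fragment, _ = self.amb.pending[seq]
            self.send(seq, fragment, now)  # resend and restart the timer
        return due
```

Passing the current time explicitly keeps the sketch deterministic; a real system would read a clock and drive `retransmit_due` from a timer.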
There are a few issues with the existing prior art solutions. First, the existing solutions are not reliable in the sense that problems in SMB may have an impact on AMB. For example, a crash of SMB may affect AMB because the TCP data streams flowing through SMB are broken. In addition, the existing solutions consume a large amount of internal IPC bandwidth. This may lead to congestion inside the router. Moreover, every incoming and outgoing TCP data stream takes an extra hop to its destination. This extra hop is from AMB to SMB for an outgoing data stream and from SMB to AMB for an incoming data stream.
Therefore, there is a need for a system that provides a reliable, efficient and simple solution for TCP High Availability. There is a further need for a system for TCP High Availability that reduces consumption of IPC bandwidth.