The TCP/IP offload solution offloads the TCP/IP stack functionalities from one or more Host systems (running various applications) to a cluster of offload systems. This frees CPU bandwidth on the Host, because the TCP/IP processing is performed off the Host, on the systems that host the TCP/IP stack. The Host(s) and the TCP/IP offload engines (TOEs) to which the processing is offloaded are connected by a high-bandwidth, low-latency interconnect technology. The TOEs are computing systems that run their own operating systems, which in this discussion are each assumed to be the Linux OS with a Linux TCP/IP stack.
While the network applications run on the Hosts, the TOEs host the TCP/IP stack and provide external connectivity to Internet (IP) networks through standard Ethernet connections. This solution is transparent to the network applications on the Hosts and to remote peer entities.
It is possible to have an offload architecture that allows multiple Hosts to share multiple TOEs. This leads to a practical architecture in which there is a non-passive TOE and in which a single IP image is obtained for multiple Hosts. However, for the sake of simplicity, the following discussion assumes an architecture of a single Host using two TOEs, one TOE acting as primary TOE and the other as alternate TOE. Furthermore, in the following discussion the alternate TOE is assumed to be passive, with no data traffic flowing on it, while the primary TOE is active.
The implementation of the typical TCP offload solution of the background art involves retaining the socket layer on the Host and forwarding any further TCP stack processing to the TOE over high-speed connectivity. As a part of the solution, specific socket applications on the TOE translate these requests from the Host and make the appropriate socket calls to the TOE TCP/IP stack. The Host connects to and uses only one of the TOEs (primary or alternate), and has the ability to detect failures in a TOE and, in response, switch to the other TOE. As discussed above, it is quite possible for the Host to communicate with multiple TOEs at the same time (each presenting its own TCP/IP stack) and, indeed, a single TOE may act as both primary and alternate TOE (assuming the role of multiple TOEs in a single system). However, reliable fail-over of TCP connections from one TOE to another TOE, while the Host applications continue to use the TCP connections transparently, generally requires the arrangement described above of a single Host, a primary TOE and a (distinct) passive, alternate TOE.
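By way of illustration only, the Host-side behaviour described above (forwarding requests to the active TOE, detecting a TOE failure, and switching to the other TOE) may be sketched as follows. This is a minimal sketch under stated assumptions and not an implementation from the background art: the names SimulatedToe, HostSocketLayer, ToeUnavailable, handle and forward are hypothetical, the TOE is simulated in memory rather than reached over an interconnect, and failure detection is reduced to an exception. The sketch shows only the switch-over of the forwarding path; it does not preserve TCP connection state across the switch.

```python
from dataclasses import dataclass, field


class ToeUnavailable(Exception):
    """Hypothetical error: the (simulated) TOE did not answer a request."""


@dataclass
class SimulatedToe:
    """Stand-in for a TOE; a real TOE would make socket calls to its own stack."""
    name: str
    alive: bool = True
    handled: list = field(default_factory=list)

    def handle(self, request: str) -> str:
        # Assumption: a failed TOE is detected as an error on the request path.
        if not self.alive:
            raise ToeUnavailable(self.name)
        self.handled.append(request)
        return f"{self.name}:{request}"


class HostSocketLayer:
    """Host-side socket layer that uses exactly one TOE at a time."""

    def __init__(self, primary: SimulatedToe, alternate: SimulatedToe):
        self.active = primary      # carries all traffic
        self.standby = alternate   # passive until a fail-over occurs

    def forward(self, request: str) -> str:
        try:
            return self.active.handle(request)
        except ToeUnavailable:
            # Failure detected: swap roles and retry once on the other TOE.
            # If the other TOE has also failed, the error propagates.
            self.active, self.standby = self.standby, self.active
            return self.active.handle(request)
```

In this sketch the application calling forward is unaware of which TOE served the request, which mirrors the transparency requirement; any per-connection state held on the failed TOE is simply lost here.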
Hosts that offload the TCP/IP stack expect continuous availability of the TCP/IP stack functionality. The TCP/IP stack hosts the TCP connections and other socket parameters, so the failure of a TOE system requires fail-over to an alternate TOE system without dropping the TCP connections. Because the applications and the TCP/IP stack do not run on the same CPU, applications generally should not be affected if one of the TOEs fails. Existing technologies provide fault tolerance for TCP connections at the Ethernet level, using Ethernet bonding or Ethernet link aggregation. Several TCP connection migration solutions have been proposed; these proposed solutions assume application migration and hence provide complex or partial solutions. Moreover, migration solutions assume that the original system is still alive and accessible during the migration; this assumption is invalid if the original system has failed.