1. Field of the Invention
This invention relates to the Transmission Control Protocol (TCP), and more specifically to fault-tolerant TCP handshaking.
2. Background Information
Many networks consist of a number of client machines and server machines interconnected through the network. Generally, clients make requests to servers for information across the network. The server responds by supplying the information to the client. The Transmission Control Protocol (TCP) is a networking protocol that provides communication across interconnected networks between computing devices such as client machines and server machines. An initial exchange between a client and a server, known as handshaking, occurs prior to data transmission to ensure proper data transmission.
Moreover, a group of interconnected network nodes, called a cluster, may reside on the network. The cluster may process traffic between a client and a server. Clustering increases the effectiveness and efficiency of security, administration and performance, and helps servers become fault resilient. All nodes in a cluster have their external interfaces on one wire and their internal interfaces on another wire. The entire cluster behaves as if it were a single piece of hardware. Each cluster member listens on the external cluster address and therefore sees every packet addressed to the cluster.
FIG. 1 shows a block diagram of a network containing a first network node connected to a cluster of network nodes. Network node 10, i.e., a client machine, may be connected to a cluster 20 via network 12. Cluster 20 may consist of two or more network nodes 22-28. Network nodes 22-28 have internal connections 14 as well as external connections 16 to the network 12. Cluster 20 may transfer traffic between the client network node 10 and a server node (not shown).
A cluster member may be operating in one of two states, a working resources state or a mirror state. If a member is handling a given TCP connection, the member may have various working resources allocated to it, e.g., sockets, memory buffers, etc. However, since any other member must be prepared to take over for that member at any given time, the other members must possess a mirrored state. A mirrored state is a passive state sufficient to recreate the working resources in order to handle the workload or TCP connection of another member. Only the minimal amount of state needed to allow other members to reproduce the original state upon failover may be sent to each member in the cluster. Further, each individual member may not need to be configured with all state information. Once a member is configured as part of the cluster, state information and other configuration information may be propagated automatically.
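The distinction between working resources and a mirrored state can be illustrated with a minimal sketch. The class names, field names, and addresses below are hypothetical and chosen only for illustration; the source does not specify what the mirrored state contains.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MirrorState:
    """Passive, minimal record held by members NOT handling the connection.

    The fields are assumptions: just enough to rebuild the connection.
    """
    client_addr: tuple   # (ip, port) of the client
    cluster_addr: tuple  # (ip, port) of the external cluster address
    snd_nxt: int         # next sequence number the cluster would send
    rcv_nxt: int         # next sequence number expected from the client


@dataclass
class WorkingResources:
    """Active resources held only by the member handling the connection."""
    mirror: MirrorState
    recv_buffer: bytearray = field(default_factory=bytearray)
    send_buffer: bytearray = field(default_factory=bytearray)


def take_over(mirror: MirrorState) -> WorkingResources:
    """Recreate working resources from the mirrored state upon failover."""
    return WorkingResources(mirror=mirror)


# A surviving member holds only the mirror record until it must take over.
mirror = MirrorState(("192.0.2.10", 40000), ("198.51.100.1", 80), 1001, 5001)
resources = take_over(mirror)
```

In this sketch the buffers exist only in the working resources, so the record propagated to the other members stays small.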
In current systems, a problem occurs if the start of a TCP handshake is clustered, i.e., if its state is propagated to the other cluster members. When this occurs, the entire cluster is made highly vulnerable to a SYN denial of service attack. In such an attack, an attacker floods a targeted machine in a cluster with many bogus SYNs whose source addresses are randomly generated. If the fact that the targeted machine has just received a SYN is clustered, too much memory is taken up for each individual SYN, and an attacker could cripple the cluster with a SYN flood.
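The memory cost of clustering SYN state can be illustrated with a small simulation. This is a sketch under stated assumptions: the function name, the table layout, and the flood size are hypothetical, and each bogus SYN is modeled as one table entry per member that learns of it.

```python
import random


def flood_entries(num_syns, members, cluster_syn_state, seed=0):
    """Count state entries held across the cluster after a SYN flood.

    Hypothetical model: member 0 is the targeted machine. If
    cluster_syn_state is True, every received SYN is also recorded
    by each of the other members.
    """
    rng = random.Random(seed)
    tables = [set() for _ in range(members)]
    for _ in range(num_syns):
        # Bogus SYN with a randomly generated source address and port.
        source = (rng.getrandbits(32), rng.randrange(1024, 65536))
        tables[0].add(source)
        if cluster_syn_state:
            for table in tables[1:]:
                table.add(source)
    return sum(len(table) for table in tables)


unclustered = flood_entries(1000, 4, cluster_syn_state=False)
clustered = flood_entries(1000, 4, cluster_syn_state=True)
```

With four members in this model, clustering the SYN state multiplies the number of entries created by the same flood by four, since every member stores a record for every bogus SYN.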
During a typical TCP handshake, a client chooses a sequence number randomly and sends a SYN message with the sequence number to the destination (i.e., the cluster). The cluster receives the SYN, picks a random sequence number of its own, and sends a SYN/ACK with this number back to the client. The client then acknowledges (ACKs) the cluster's sequence number.
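The three messages of the handshake described above can be sketched as follows. The dictionary layout and function name are assumptions made for illustration. The acknowledgment number is the peer's sequence number plus one, modulo 2^32, as in standard TCP.

```python
import random

SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit values


def three_way_handshake(rng):
    """Return the SYN, SYN/ACK and ACK of one handshake (illustrative only)."""
    # 1. The client picks a random sequence number and sends a SYN.
    client_isn = rng.getrandbits(32)
    syn = {"flags": "SYN", "seq": client_isn}

    # 2. The cluster picks its own random sequence number and replies
    #    with a SYN/ACK that acknowledges the client's number.
    cluster_isn = rng.getrandbits(32)
    syn_ack = {
        "flags": "SYN/ACK",
        "seq": cluster_isn,
        "ack": (syn["seq"] + 1) % SEQ_MOD,
    }

    # 3. The client acknowledges the cluster's sequence number.
    ack = {
        "flags": "ACK",
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % SEQ_MOD,
    }
    return syn, syn_ack, ack


syn, syn_ack, ack = three_way_handshake(random.Random(42))
```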
However, a problem occurs if the node in the cluster handling the connection fails after it has sent the SYN/ACK. The client may then believe that it has an open connection, but none of the members of the cluster knows that the connection exists, since the connection is still starting and has not yet been clustered. As noted previously, the state created when the cluster receives the SYN is not clustered, since an attacker could otherwise flood the cluster with random SYNs and cause the nodes in the cluster to exhaust all memory on bogus connections.
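The failure scenario can be reproduced in a short sketch. The `Member` class, its method names, and the sequence numbers are hypothetical; the sketch assumes that half-open connection state is kept only on the member that received the SYN.

```python
SEQ_MOD = 2 ** 32


class Member:
    """Cluster member that keeps half-open state locally (not clustered)."""

    def __init__(self, name):
        self.name = name
        self.half_open = {}  # client address -> this member's sequence number

    def on_syn(self, client, client_isn, own_isn):
        # Record the half-open connection locally and answer with a SYN/ACK.
        self.half_open[client] = own_isn
        return {"flags": "SYN/ACK", "seq": own_isn,
                "ack": (client_isn + 1) % SEQ_MOD}

    def on_ack(self, client, ack_number):
        own_isn = self.half_open.get(client)
        if own_isn is None:
            return "unknown connection"
        if ack_number == (own_isn + 1) % SEQ_MOD:
            return "established"
        return "bad ack"


client = ("192.0.2.10", 40000)
member_a, member_b = Member("A"), Member("B")

# Member A handles the SYN and sends the SYN/ACK.
syn_ack = member_a.on_syn(client, client_isn=1000, own_isn=7000)
client_ack = (syn_ack["seq"] + 1) % SEQ_MOD

# If A survives, the client's ACK completes the handshake.
result_without_failure = member_a.on_ack(client, client_ack)

# If A fails before the ACK arrives, B receives it but has no record.
result_on_survivor = member_b.on_ack(client, client_ack)
```

Here the client has sent a valid ACK in both cases, yet the surviving member cannot match it to any connection, which is the situation described above.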
Therefore, a need exists for methods and apparatus that handle situations in which a node in a cluster fails during an initial TCP handshake.