One of the most common methods of transferring a data stream within the Internet is TCP/IP, where TCP stands for Transmission Control Protocol and IP for Internet Protocol. Some examples of a data stream are an email message, a file transfer or streaming video. TCP/IP splits a data stream into packets, each of which contains a portion of the data to be transmitted from one computer, or “node”, to another.
IP is unreliable in the sense that packets may be lost or shuffled and there is no positive or negative confirmation of receipt. However, when combined with TCP, TCP/IP reliably transfers data between two nodes in correct sequence. A single TCP/IP session allows bidirectional data streams to be sent between nodes. Once initiated, communication is symmetrical between the two nodes as each node acts in both sender and receiver roles simultaneously.
Thus, TCP/IP is a two-layer program. The higher layer, Transmission Control Protocol, manages the division of a data stream into smaller packets that are transmitted over the Internet and received by a TCP layer that reassembles the packets into the original message. The lower layer, Internet Protocol, handles the address part of each packet so that it gets to the right destination.
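The division and reassembly performed by the higher layer can be illustrated with a minimal sketch. It assumes each packet is tagged with the byte offset of its payload; the function names `segment` and `reassemble` and the packet size are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, Iterable, List, Tuple

Packet = Tuple[int, bytes]  # (byte offset in the stream, payload); hypothetical layout


def segment(stream: bytes, size: int) -> List[Packet]:
    """Split a byte stream into packets carrying at most `size` bytes each."""
    return [(off, stream[off:off + size]) for off in range(0, len(stream), size)]


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Rebuild the stream from packets that may be shuffled or duplicated."""
    by_offset: Dict[int, bytes] = {}
    for off, payload in packets:
        by_offset.setdefault(off, payload)  # ignore duplicate copies
    return b"".join(by_offset[off] for off in sorted(by_offset))


message = b"an email message, a file transfer or streaming video"
packets = segment(message, 8)
arrived = packets[:]
random.shuffle(arrived)        # packets may arrive out of order
arrived.append(packets[0])     # and may be duplicated
assert reassemble(arrived) == message
```

The sketch omits loss recovery; it only shows that an offset carried with each packet is enough to restore the original order.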
Bytes in a TCP/IP data stream are consecutively numbered with a 32-bit number known as a sequence number. This allows the receiver to identify duplicate or missing data. The sender knows that all data prior to a given sequence number has been received when the sender gets an acknowledgement (ACK) for that sequence number. For each connection, the sequence number begins at an arbitrary value chosen by the sender. A synchronization (SYN) packet identifies the initial value.
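The sequence number and acknowledgement mechanism can be sketched as follows. This is a simplified model, not a TCP implementation: the class `Receiver` and its methods are hypothetical, out-of-order segments are simply discarded rather than buffered, and the detail that the SYN itself consumes one sequence number is standard TCP behaviour rather than something stated above.

```python
import secrets

SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit and wrap around


def initial_sequence_number() -> int:
    """Pick an arbitrary starting value, as a sender does for each connection."""
    return secrets.randbelow(SEQ_MOD)


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n bytes with 32-bit wrap-around."""
    return (seq + n) % SEQ_MOD


class Receiver:
    def __init__(self, isn: int) -> None:
        # The SYN occupies one sequence number, so data starts at isn + 1.
        self.next_expected = seq_add(isn, 1)
        self.data = bytearray()

    def receive(self, seq: int, payload: bytes) -> int:
        """Accept an in-order segment and return the cumulative ACK.

        The ACK is the next sequence number expected; everything before
        it has been received. Duplicate or early segments leave it unchanged.
        """
        if seq == self.next_expected:
            self.data += payload
            self.next_expected = seq_add(seq, len(payload))
        return self.next_expected


isn = initial_sequence_number()
rx = Receiver(isn)
first = seq_add(isn, 1)
assert rx.receive(first, b"hello") == seq_add(first, 5)
assert rx.receive(first, b"hello") == seq_add(first, 5)  # duplicate ignored
```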
Often one node in a TCP/IP connection may be on the Internet and a second node may be within a protected network, such as a corporate network or a network managed by an Internet Service Provider (ISP). In order to ensure that unwanted data neither enters nor leaves a protected network, a proxy is typically installed between the Internet and the protected network to examine packets transmitted by both sides. Thus, a proxy acts as a peer to both computers it communicates with. It maintains a connection with each of the two nodes and passes data between the two connections.
A proxy also acts as a gatekeeper. A proxy may utilize many methods to determine if a packet may pass, such as:
a) filtering unwanted packets, i.e. those from undesired sources;
b) translating addresses of the packets, to ensure they are sent to a desired recipient;
c) scanning for correct format;
d) determining if a packet contains unwanted material, such as a virus; and
e) preventing incoming connections to protected nodes.
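The gatekeeping methods (a) through (e) can be sketched as a single decision function. Everything in this sketch is an assumption made for illustration: the `Packet` fields, the block list, the address map, the size limit used as a format check, and the byte-pattern "signature" standing in for virus detection. The addresses come from ranges reserved for documentation and private use.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:          # hypothetical, heavily simplified packet
    src: str
    dst: str
    payload: bytes
    syn: bool = False  # True if the packet opens a new connection


BLOCKED_SOURCES = {"203.0.113.7"}           # (a) undesired sources
ADDRESS_MAP = {"198.51.100.1": "10.0.0.5"}  # (b) public -> internal address
MAX_PAYLOAD = 1460                          # (c) assumed format limit
SIGNATURES = [b"EVIL"]                      # (d) stand-in for a virus scanner
PROTECTED_PREFIX = "10."                    # (e) protected nodes


def gatekeep(pkt: Packet, inbound: bool) -> Optional[Packet]:
    """Return the (possibly rewritten) packet, or None if it must be dropped."""
    if pkt.src in BLOCKED_SOURCES:                          # (a) filter
        return None
    pkt = replace(pkt, dst=ADDRESS_MAP.get(pkt.dst, pkt.dst))  # (b) translate
    if len(pkt.payload) > MAX_PAYLOAD:                      # (c) format check
        return None
    if any(sig in pkt.payload for sig in SIGNATURES):       # (d) content scan
        return None
    if inbound and pkt.syn and pkt.dst.startswith(PROTECTED_PREFIX):
        return None                                         # (e) no inbound connects
    return pkt


ok = gatekeep(Packet("192.0.2.1", "198.51.100.1", b"hello"), inbound=True)
assert ok is not None and ok.dst == "10.0.0.5"
```

The order of the checks is a design choice in this sketch; an actual proxy may apply them in a different order or at different protocol layers.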
The processing of multiple connections that each require a proxy is computationally expensive. Further, all unacknowledged data travelling in each direction must be kept in memory; this may result in the buffering of substantial amounts of data, particularly when receiving data from a fast connection. Finally, timers must be used to trigger retransmission of data that has not been acknowledged by the receiving node. Thus, there is a need for a proxy that can modify data, minimize the buffering of data and limit the requirement of timers for each session. The present invention addresses this need.
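The buffering and timer burden described above can be made concrete with a minimal sketch of the per-direction state a conventional proxy would hold. The class `RetransmitQueue` and its methods are hypothetical, time is passed in explicitly so the behaviour is deterministic, and sequence-number wrap-around is ignored for brevity.

```python
from typing import Dict, List, Tuple


class RetransmitQueue:
    """Holds every sent-but-unacknowledged segment together with its timer."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.pending: Dict[int, Tuple[bytes, float]] = {}  # seq -> (payload, deadline)

    def sent(self, seq: int, payload: bytes, now: float) -> None:
        """Record a transmitted segment and start its retransmission timer."""
        self.pending[seq] = (payload, now + self.timeout)

    def acked(self, ack: int) -> None:
        """Free every segment that lies entirely before the cumulative ACK."""
        done = [s for s, (p, _) in self.pending.items() if s + len(p) <= ack]
        for s in done:
            del self.pending[s]

    def due(self, now: float) -> List[int]:
        """Return segments whose timer has expired and restart those timers."""
        expired = sorted(s for s, (_, d) in self.pending.items() if d <= now)
        for s in expired:
            payload, _ = self.pending[s]
            self.pending[s] = (payload, now + self.timeout)
        return expired

    def buffered_bytes(self) -> int:
        """Memory currently tied up in unacknowledged data."""
        return sum(len(p) for p, _ in self.pending.values())


q = RetransmitQueue(timeout=1.0)
q.sent(0, b"abcd", now=0.0)
q.sent(4, b"efgh", now=0.0)
q.acked(4)                    # first segment acknowledged, second still held
assert q.buffered_bytes() == 4
assert q.due(now=1.0) == [4]  # second segment must be retransmitted
```

A proxy that terminates both connections needs one such queue per direction per session, which is the memory and timer cost the paragraph above refers to.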