1. Field of Invention
Embodiments of this invention relate to the field of Internet Protocol (IP) networks, Transmission Control Protocol (TCP), Peer-to-Peer protocols and more particularly to the traversal of firewalls and network address translators (NATs) for TCP connections.
2. Discussion of Related Art
The Internet was designed around a client-server model, in which many clients initiate connections to communicate with a single server, for example many web browsers accessing a web server. Protocols such as HTTP and FTP have been developed to support this model. However, this model is not well suited to many types of applications in which only two parties need to communicate, for example an Internet voice or chat conversation, where one peer may initiate a connection with another peer. This second model is known as a peer-to-peer model.
Recently, the Internet Engineering Task Force (IETF) and other private organizations have developed peer-to-peer protocols such as Session Initiation Protocol (SIP), H.323, Gnutella, etc. to enable peer-to-peer communication. However, these peer-to-peer protocols and applications break when one or more peers exist behind a firewall and network address translator (NAT).
Firewalls and NATs provide many advantages for clients and the Internet itself, such as improved security and improved Internet Protocol (IP) address management, and they do not pose problems for many client-server applications (such as HTTP), since most servers are publicly available and the client initiates the connection with the server. However, these devices do cause problems for peer-to-peer protocols and applications. Firewalls and NATs conceal the identity of IP clients (i.e., peers) and block Transmission Control Protocol (TCP) call setup requests. They make it impossible for one TCP peer to discover another and establish a connection with it. In effect, NATs and firewalls “blind” TCP peers, preventing the synchronization (i.e., handshaking) needed to set up a connection between two peers.
Transmission Control Protocol
A TCP connection has three phases: connection establishment, data transfer and connection termination. TCP uses a Three Way Handshake to synchronize and establish a connection between two TCP peers; once connected, TCP hands off to the application for data transfer and communication.
While it is possible for a pair of end hosts to initiate a connection between themselves simultaneously, typically one end opens a socket and listens passively for a connection from the other. This is commonly referred to as a passive open, and it designates the server-side of a connection. The client-side of a connection initiates an active open by sending an initial TCP segment with SYN flag set to the server as part of the Three Way Handshake. The server-side should respond to a valid SYN request with a TCP segment with the SYN and ACK flags set. Finally, the client-side should respond to the server with a TCP segment with the ACK flag set, completing the Three Way Handshake and connection establishment phase.
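The exchange described above can be traced with a small model. This is a minimal sketch that assumes segments carry only their flags and that no segment is lost; the function names and the state names (`LISTEN`, `SYN_SENT`, `SYN_RCVD`, `ESTABLISHED`, following common TCP naming) are illustrative and do not come from the text. No real sockets are involved.

```python
def server_reply(state, flags):
    """Passive (server) side: return (new_state, reply_flags or None)."""
    if state == "LISTEN" and flags == {"SYN"}:
        return "SYN_RCVD", {"SYN", "ACK"}   # answer a valid SYN with SYN+ACK
    if state == "SYN_RCVD" and flags == {"ACK"}:
        return "ESTABLISHED", None          # final ACK completes the handshake
    return state, None                      # anything else is ignored here


def client_reply(state, flags):
    """Active (client) side: return (new_state, reply_flags or None)."""
    if state == "SYN_SENT" and flags == {"SYN", "ACK"}:
        return "ESTABLISHED", {"ACK"}       # acknowledge the server's SYN
    return state, None


def handshake():
    """Run one client-initiated handshake and return the final states and trace."""
    trace = []
    client, server = "SYN_SENT", "LISTEN"   # client has just sent its SYN
    segment = {"SYN"}
    trace.append(("client", sorted(segment)))
    server, segment = server_reply(server, segment)
    trace.append(("server", sorted(segment)))
    client, segment = client_reply(client, segment)
    trace.append(("client", sorted(segment)))
    server, _ = server_reply(server, segment)
    return client, server, trace
```

Calling `handshake()` yields three trace entries, one per segment, and leaves both sides in the `ESTABLISHED` state.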
The TCP Three Way Handshake protocol between a client and server is shown in FIG. 3. TCP functions by opening connections to a remote host and is thus connection-oriented. TCP maintains status information regarding the connections it makes and is therefore a reliable protocol that guarantees data delivery, unlike the unreliable, stateless User Datagram Protocol (UDP). A TCP connection is identified by the IP addresses and virtual port numbers used by both ends. During communication, sequence numbers are used to keep track of the order in which the segments of data should be reassembled. Finally, the amount of data that may be sent before an acknowledgement is required is continually adjusted via a flow-control mechanism called windowing. The combination of port numbers, sequence numbers and window sizes constitutes a connection.
In the Internet protocol suite, TCP 703 is the intermediate layer between IP 705 below it, and an application 702 above it as shown in FIG. 7. Applications most often need reliable pipe-like connections to each other to transfer information. Applications send streams of 8-bit bytes to TCP for delivery through the network, and TCP divides the byte stream into appropriately sized segments (usually delineated by the maximum transmission unit (MTU) size of the data link layer of the network the computer is attached to). TCP then passes the resulting packets to IP, for delivery through an Internet to the TCP module of the entity at the other end. TCP checks to make sure that no packets are lost by giving each byte a sequence number, which is also used to make sure that the data is delivered to the entity at the other end in the correct order. The TCP module at the far end sends back an acknowledgement for bytes which have been successfully received; a timer at the sending TCP will cause a timeout if an acknowledgement is not received within a reasonable round trip time, and the (presumably lost) data will then be re-transmitted. The TCP checks that no bytes are damaged by using a checksum; one is computed at the sender for each block of data before it is sent, and checked at the receiver.
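The segmentation, ordering and checksum steps described above can be sketched as follows. This is a simplified illustration: the function names and the `max_payload` parameter are hypothetical, sequence-number wraparound is ignored, and the checksum is computed over the payload only, whereas real TCP also covers header fields.

```python
def segment_stream(data: bytes, max_payload: int, initial_seq: int = 0):
    """Split a byte stream into (sequence_number, payload) pairs.

    The sequence number of a segment is the number of its first byte,
    so every byte in the stream is individually numbered.
    """
    return [
        (initial_seq + offset, data[offset:offset + max_payload])
        for offset in range(0, len(data), max_payload)
    ]


def internet_checksum(payload: bytes) -> int:
    """16-bit one's-complement checksum of the payload."""
    if len(payload) % 2:
        payload += b"\x00"                              # pad to whole 16-bit words
    total = 0
    for i in range(0, len(payload), 2):
        total += (payload[i] << 8) | payload[i + 1]
        total = (total & 0xFFFF) + (total >> 16)        # fold the carry back in
    return ~total & 0xFFFF


def reassemble(segments):
    """Restore the original byte order using the sequence numbers."""
    return b"".join(payload for _, payload in sorted(segments))
```

A receiver that gets the segments out of order can still call `reassemble` on them, and comparing `internet_checksum` values at sender and receiver detects a damaged payload.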
Firewalls and Network Address Translation
Most home and office Internet connections are managed by a router that performs firewalling and network address translation (NAT) functions. These functions have become standard features in routers for homes and small offices, even though, according to specifications, routers should not act as firewalls. Nevertheless, it is a convenient and widely used technique. The routers that perform these functions will be referred to as firewalls.
In a typical configuration, a local network uses one of the designated “private” IP address subnets (such as 192.168.0.x or 10.0.x.x), and a firewall on that network has a private address (such as 192.168.0.1) in that address space. The firewall is also connected to the Internet with a “public” address assigned by an ISP. As traffic passes from the local private network to the Internet, the source addresses on the packets are translated on the fly from the private addresses to a public address. The firewall tracks basic information about each active connection. When a reply returns to the firewall, it uses the connection tracking information it stored during the outbound phase to determine where on the internal network to forward the reply. To the sending host on the Internet, the firewall itself appears to be the source/destination for this traffic.
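The outbound translation and reply forwarding described above amount to a two-way lookup table. The following is a minimal sketch; the class name `TranslationTable`, the starting port 20000 and the addresses used in the usage note are assumptions made for illustration, not details from the text.

```python
class TranslationTable:
    """Tracks private (ip, port) pairs and the public port assigned to each."""

    def __init__(self, public_ip, first_port=20000):
        self.public_ip = public_ip
        self.next_port = first_port      # hypothetical allocation start
        self.by_private = {}             # (private_ip, private_port) -> public_port
        self.by_public = {}              # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the public address."""
        key = (private_ip, private_port)
        if key not in self.by_private:
            self.by_private[key] = self.next_port
            self.by_public[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.by_private[key]

    def inbound(self, public_port):
        """Find the internal host for a reply, or None if nothing is tracked."""
        return self.by_public.get(public_port)
```

For example, after `TranslationTable("203.0.113.5").outbound("192.168.0.10", 5000)` a reply addressed to the returned public port is forwarded to `("192.168.0.10", 5000)`, while a packet to any untracked port has no internal destination.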
Clients behind firewalls do not have true end-to-end connectivity and cannot participate in some Internet protocols. Services that require the initiation of TCP connections from the outside network can be disrupted. Unless the firewall makes a specific effort to support such protocols, incoming packets cannot reach their destination.
There are four types of NAT operations used in firewalls [1]:
1. Full Cone
2. Restricted Cone
3. Port Restricted Cone
4. Symmetric
For a given internal address, the first three types of NAT maintain a mapping of this internal address that is independent of the destination address. The fourth type of NAT will allocate a new mapping for each independent destination address. Unless the NAT has a static mapping table, the mapping that opens when the first packet is sent out from a client through the NAT may only be valid for a certain amount of time unless packets continue to be sent and received on the mapped port of the firewall.
Full Cone
In the case of the full cone, the mapping is well established and anyone from the public Internet that wants to reach a client behind a NAT only needs to know the mapping scheme in order to send packets to it. For example, an internal host behind a NAT with IP 10.0.0.1 sending on port 7400 and receiving on port 7450, is mapped to the external IP address and port on the firewall of 68.14.125.248 and 12867. Anyone on the Internet can send packets to that IP address and port and those packets will be passed on to the client machine listening on port 7450.
Restricted Cone
In the case of a restricted cone NAT, the mapping to the external IP address and port pair is only opened up once the internal client sends out data to a specific destination IP address. Unlike a full cone NAT, an external host can send a packet to an internal host only if the internal host has previously sent a packet to that specific external host. However, since the mapping only depends on the internal host IP address and port, packets from different external hosts will all use the same mapping through the NAT.
Port Restricted Cone
A port restricted cone NAT is almost identical to a restricted cone NAT, but in this case the restriction includes port numbers. Therefore, the firewall will block packets from an external host with source IP address (216.239.14.68) and port (6478) until the internal host has sent a packet to that IP address and port. Again since the mapping only depends on the internal host IP address and port, if the client has sent out packets to multiple IP address port pairs, they can all respond to the client and all of them will respond to the same firewall mapped port.
Symmetric
A symmetric NAT is the most restrictive and differs from the first three in that a specific mapping depends on the internal host (source IP address and port) as well as the external host (destination IP address and port). As in the case of the restricted cone NAT, the external mapping to the firewall IP address and port pair is only opened up once the internal host sends out data to a specific destination.
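The differences among the four NAT types can be summarized as two decisions: which key selects the external mapping, and which external senders may use it. The sketch below is a hedged model of those descriptions; the function names and type labels are hypothetical, and the inbound rule for the symmetric case (only the contacted IP address and port may reply) is an assumption consistent with it being the most restrictive type.

```python
def mapping_key(nat_type, internal, destination):
    """Key that selects the external mapping for an outbound packet.

    internal and destination are (ip, port) tuples.
    """
    if nat_type == "symmetric":
        return (internal, destination)   # a new mapping per destination
    return (internal,)                   # cone types ignore the destination


def accepts_inbound(nat_type, contacted, remote):
    """Decide whether a packet from remote may use an existing mapping.

    contacted is the set of (ip, port) destinations the internal host
    has already sent packets to through this mapping.
    """
    if nat_type == "full_cone":
        return True                                        # anyone may send
    if nat_type == "restricted_cone":
        return remote[0] in {ip for ip, _ in contacted}    # IP must match
    if nat_type in ("port_restricted_cone", "symmetric"):
        return remote in contacted                         # IP and port must match
    raise ValueError(f"unknown NAT type: {nat_type}")
```

With `contacted = {("216.239.14.68", 6478)}`, a packet from port 9999 on the same address passes a restricted cone NAT but is blocked by a port restricted cone NAT.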
NAT with port address translation typically involves source address translation (SNAT), which maps the IP address and port of the internal host which initiated the connection to the firewall IP address and port; and its counterpart, destination address translation (DNAT) which reverses the mapping. In practice, both are usually used together in coordination for two-way communication.
The TCP handshaking protocol between a client in a private network and a public server is shown in FIG. 4. No special setup is required for setting up a connection in this configuration.
Standard NAT Behavior
The function of a NAT is to translate internal addresses and ports to external firewall addresses and ports. A NAT by definition is not a firewall and therefore should do the simplest operation possible to achieve its objective. The default behavior is to alter the connection as little as possible [6]. This means that the NAT will attempt to preserve the original source port and will not change it unless that port on the firewall is currently being used by another connection. If this source port is already allocated, the NAT will attempt to find the next highest value in its group: 0-511, 512-1023 or 1024-65535 [6].
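The port-preservation behavior described above can be sketched as a small allocation routine. This is an illustration only: the function name is hypothetical, "next highest value" is read as the next free port above the original within the same group, and returning `None` when the group has no free port above the original is an assumption, since the text does not say what happens in that case.

```python
PORT_GROUPS = [(0, 511), (512, 1023), (1024, 65535)]


def allocate_port(source_port, in_use):
    """Pick the external port for a new mapping.

    in_use is the set of external ports already allocated on the firewall.
    """
    if not 0 <= source_port <= 65535:
        raise ValueError("port out of range")
    if source_port not in in_use:
        return source_port                       # preserve the original port
    # Find the upper bound of the group that contains the original port.
    high = next(hi for lo, hi in PORT_GROUPS if lo <= source_port <= hi)
    for candidate in range(source_port + 1, high + 1):
        if candidate not in in_use:
            return candidate                     # next free port in the same group
    return None                                  # assumed: no port available
```

For example, `allocate_port(7400, {7400})` moves the mapping to port 7401, while `allocate_port(7400, set())` keeps 7400 unchanged.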
TCP Connection Management in Firewalls
The Transmission Control Protocol (TCP) is connection-oriented and stateful, which makes it much more manageable by a firewall than the connectionless User Datagram Protocol (UDP). For this reason, many firewalls do not allow UDP traffic to enter or leave their private networks.
Firewalls that manage the state of a connection (i.e., stateful firewalls) are inherently more secure than their “stateless” counterparts, which simply perform packet filtering. These firewalls track the state of the connection as seen by the firewall, which should not be confused with the state of the TCP connection itself. Connection tracking refers to the ability to maintain state information about a connection in memory tables, such as source and destination IP address and port number pairs, protocol types, connection state and timeouts.
A typical stateful firewall is shown in FIG. 2. This diagram illustrates the general operation of most firewalls 200 where packets can be generated 205 and received 203 locally, as is the case when the firewall is executed on the local host itself, or received and forwarded to and from remote hosts, as is the case when many hosts share one firewall. All packets that are received from remote hosts whether private internal hosts or public external hosts, travel through the remote interface 201. Each packet that is received on the remote interface is processed by the Pre-Routing stage 202 where it may be destination network address translated (DNAT) if it needs to be routed through the private network or onto the local machine. If the packet is not destined for the local machine, then it is processed by the Forward stage 204 and the Post-Routing stage 206. In the Forward stage, rules may be applied on how to filter each packet, and at the Post-Routing stage source network address translation (SNAT) may be applied if it needs to be routed through the public network. If a packet is destined for the local machine then it is processed by the Input stage 203, and if the local machine generates it then it is processed by the Output stage 205 and next processed by the Post-Routing stage. At any stage, filter rules may be applied that drop and change packet information. It is assumed that the firewall uses stateful inspection by managing a connection-tracking table for all connections through the firewall.
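The routing of a packet through the stages described above can be condensed into one function. This is a minimal sketch of the stage order only; the function and parameter names are hypothetical, and no filtering or address translation is actually performed.

```python
def stages(received_on_remote_interface, destined_for_local_machine=False):
    """Return the ordered firewall stages that process one packet."""
    if received_on_remote_interface:
        if destined_for_local_machine:
            return ["Pre-Routing", "Input"]               # delivered locally
        return ["Pre-Routing", "Forward", "Post-Routing"]  # forwarded onward
    # Packet generated by the local machine itself.
    return ["Output", "Post-Routing"]
```

DNAT would be applied in the Pre-Routing stage and SNAT in the Post-Routing stage, so a forwarded packet passes both translation points while a locally delivered packet passes only the first.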
An entry in the connection-tracking table contains the source IP address and port, the destination IP address and port, and the state of the connection. The possible states of each connection are: INVALID, meaning that the packet is associated with no known connection; ESTABLISHED, meaning that the packet is associated with a connection which has seen packets in both directions; NEW, meaning that the packet has started a new connection or is otherwise associated with a connection which has not seen packets in both directions; and RELATED, meaning that the packet is starting a new connection but is associated with an existing connection, such as an FTP data transfer. This state information may be used in designing filter rules for how TCP connections are managed. Once a connection in the firewall is in the ESTABLISHED state, the connection may remain valid for days even without any data exchange.
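The state definitions above can be illustrated with a toy connection-tracking table. This sketch is an assumption-laden model: the class `ConnTrack` and the `starts_connection` flag (standing in for a segment that is allowed to open a connection, such as a SYN) are hypothetical, the RELATED state is omitted because it requires protocol-specific knowledge, and timeouts are not modeled.

```python
class ConnTrack:
    """Classifies packets as INVALID, NEW or ESTABLISHED."""

    def __init__(self):
        self.table = {}   # flow key -> set of endpoints that have sent packets

    @staticmethod
    def _key(src, dst):
        # The same key is used for both directions of a flow.
        return frozenset((src, dst))

    def classify(self, src, dst, starts_connection=False):
        """src and dst are (ip, port) tuples; returns the state name."""
        key = self._key(src, dst)
        senders = self.table.get(key)
        if senders is None:
            if not starts_connection:
                return "INVALID"          # no known connection for this packet
            self.table[key] = {src}
            return "NEW"                  # first packet of a new connection
        senders.add(src)
        # ESTABLISHED only once packets have been seen in both directions.
        return "ESTABLISHED" if len(senders) == 2 else "NEW"
```

A filter rule could then, for instance, accept only packets classified as ESTABLISHED when they arrive from the public side.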
Filter rules in firewalls are typically designed so that peers (i.e., clients) in the private network cannot receive incoming TCP connection setup requests and therefore cannot set up a connection. Firewalls block connections to peers by dropping special packets that are involved in the handshaking process of establishing a TCP connection. Firewalls effectively “blind” the TCP peers from seeing each other.
Firewall and NAT Traversal for TCP Connections
Firewalls and NAT devices are located at the edge of virtually all business networks, and most residential DSL and cable modems bundle firewalls and NATs. Therefore, the firewall and NAT traversal problem is one that affects both business users and residential users who use Internet applications that employ UDP or TCP for Internet communication. TCP is a connection-oriented transport protocol, which makes it more reliable and secure than UDP. For this reason, many Internet applications require a TCP connection for communication. Furthermore, because UDP lacks connection setup and is therefore difficult to manage in a firewall, many firewalls are configured not to allow UDP traffic to enter or leave the private network. Therefore, firewall and NAT traversal for TCP connections is a challenge that must be solved in order to deliver public IP-based services. Overcoming this traversal problem will lead to widespread deployment of IP services to any subscriber with a broadband connection.
The firewall and NAT traversal problem for TCP connections is much more difficult than for UDP since TCP is connection-oriented and it has a well-defined connection protocol that is effectively managed by the firewall. In addition, the two TCP layers themselves need to be synchronized.
This problem can be broken down into three equally important components:
1. Traverse the NAT
2. Establish a bi-directional TCP connection in each peer firewall
3. Synchronize the peer TCP layers
NAT traversal is a well-known and well-studied problem for UDP traffic [1,2,4,5]. The basic problem, which is common to both UDP and TCP, is that peers behind NATs are not reachable by external hosts. For TCP, however, the problem is compounded because the firewall filters TCP control messages, making it even more difficult to establish a NAT mapping in the firewall.
The role of the firewall is to protect the network from being accessed by unauthorized sources. It does this by making decisions based on the direction of traffic flow. Typically in a private network, incoming traffic is only allowed if the connection was initiated from a device on the internal private network. Therefore, establishing a bi-directional TCP connection in a firewall is not a problem when the connection source (initiator) is behind the firewall and the destination is publicly available. In this case, the firewall will allow TCP segments with the SYN flag set to leave the network but not to enter it. Furthermore, only TCP segments with the SYN flag set can create a connection in a firewall, and the connection does not reach its ESTABLISHED state until the final TCP segment with the ACK flag set is sent out (see FIG. 4). However, when both hosts are behind firewalls, there is a deadlock situation since neither host can initiate the connection. Any approach to solving this problem must allow secure two-way communication without any changes to firewall filtering rules and without reducing the current level of security provided by the firewall.
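The direction-based rule and the resulting deadlock can be expressed in a few lines. This is a hedged sketch of the filtering policy as described above, not of any particular firewall product; the function names and the `tracked` parameter (whether a connection entry already exists) are hypothetical.

```python
def firewall_allows(direction, flags, tracked):
    """Apply the typical private-network rule to one TCP segment.

    direction is "out" (leaving the private network) or "in" (entering it).
    """
    if flags == {"SYN"}:
        return direction == "out"   # a SYN may leave but may not enter
    return tracked                  # other segments need an existing connection


def can_connect(initiator_behind_firewall, responder_behind_firewall):
    """Check whether the initiator's SYN can reach the responder."""
    leaves = firewall_allows("out", {"SYN"}, False) if initiator_behind_firewall else True
    enters = firewall_allows("in", {"SYN"}, False) if responder_behind_firewall else True
    return leaves and enters
```

`can_connect(True, False)` is true, matching the case of a private client and a public server, while `can_connect(True, True)` is false whichever peer takes the initiator role, which is the deadlock described above.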
Two TCP peers need to exchange a series of control segments in order to set up a connection and exchange data, as shown in FIG. 3. When the TCP peers are behind firewalls and NATs, they cannot receive the control messages that are needed to establish a TCP socket connection so that the application can send and receive data. These TCP layers are blinded and need to be synchronized before a socket can be used by the application.