1. Field of the Invention
The present invention relates to enabling clients to traverse firewall and NAT installations. In particular, the present invention uses probe packets between servers and clients to communicate the addresses modified by the NAT.
2. Description of the Related Art
The following is a list of acronyms used in the present document.
ALG: Application Level Gateway
ASSENT: Augmented Session Signaling Enabling NAT Traversal
DMZ: De-Militarized Zone
FW: Firewall
H.323: ITU standard for packet based communications over non-QOS packet networks
ITU: International Telecommunications Union
IP: Internet Protocol
NAT: Network Address Translation (RFC 1631)
NAPT: Network Address Port Translation
SIP: Session Initiation Protocol (RFC 3261)
TCP: Transmission Control Protocol (RFC 793)
UDP: User Datagram Protocol (RFC 768)
The rapidly evolving IP (Internet Protocol) data network is creating new opportunities and challenges for multimedia and voice Communications Service Providers. Unprecedented levels of investment are being made in the data network backbone by incumbent telecommunication operators and next generation carriers and service providers. At the same time, broadband access technologies such as DSL and cable modems are bringing high speed Internet access to a wide community of users. The vision of service providers is to make use of the IP data network to deliver new voice, video and data services right to the desktop, the office and the home alongside high speed Internet access.
The H.323 standard applies to multimedia communications over Packet Based Networks that have no guaranteed quality of service. It has been designed to be independent of the underlying transport network and protocols. Today the IP data network is the default and ubiquitous packet network and the majority (if not all) of implementations of H.323 are over an IP data network. Other protocols for real-time (voice and video) communications, for example, SIP and MGCP also use the IP data network for the transport of call signaling and media. New protocols for new applications associated with the transport of real-time voice and video over IP data networks are also expected to be developed. The present invention relates to them, and other protocols that require multiple traffic flows per single session.
The importance of standards for widespread communications is fundamental if terminals from different manufacturers are to inter-operate. In the multimedia arena, the current standard for real-time communications over packet networks (such as IP data networks) is the ITU standard H.323. H.323 is now a relatively mature standard having support from the multimedia communications industry that includes companies such as Microsoft, Cisco and Intel. For example, it is estimated that 75% of PCs have Microsoft's NetMeeting (trade mark) program installed. NetMeeting is an H.323 compliant software application used for multimedia (voice, video and data) communication. Interoperability between equipment from different manufacturers is also now being achieved. Over 120 companies world-wide attended the last interoperability event hosted by the International Multimedia Telecommunications Consortium (IMTC), an independent organization that exists to promote the interoperability of multimedia communications equipment. The event is a regular one that allows manufacturers to test and resolve inter-working issues.
Hitherto, there had been a number of barriers to the mass uptake of multimedia (particularly video) communications. Ease of use, quality, cost and communications bandwidth had all hampered growth in the market. Technological advances in video encoding, the ubiquity of cheap IP access and the current investment in the data network, coupled with the rollout of DSL together with ISDN and cable modem, now alleviate most of these issues, making multimedia communications readily available.
As H.323 was being defined as a standard, it was assumed that there would be H.323-H.320 gateways that exist at the edge of network domains converting H.323 to H.320 for transport over the wide area between private networks. Therefore, implementations of H.323 over IP concentrated on communications within a single network.
However, IP continues to find favour as the wide area protocol. More and more organizations continue to base their entire data networks on IP. High speed Internet access, managed Intranets, Virtual Private Networks (VPNs) all based on IP are commonplace. The IP trend is causing H.320 as a multimedia protocol to decline. The market demand is to replace H.320 completely with H.323 over IP. But perhaps the main market driver for transporting real-time communications over IP across the WAN (wide area network) is voice. With standards such as H.323 and SIP users had begun to use the Internet for cheap voice calls using their computers. This marked the beginning of a whole new Voice over IP (VoIP) industry that is seeing the development of new VoIP products that include Ethernet telephones, IP PBXs, SoftSwitches and IP/PSTN gateways all geared at seamlessly delivering VoIP between enterprises and users. H.323, SIP and MGCP are expected to be the dominant standards here.
Unfortunately, unforeseen technical barriers to the real-world, wide area deployment of H.323 and SIP still exist. The technical barriers relate to the communications infrastructure at the boundaries of IP data networks.
Consequently, today, successful implementations of multimedia or voice communications over IP are confined to Intranets or private managed IP networks.
The problems arise because of two IP technologies: Network Address Translation (NAT) and Firewalls. Security is also an issue when considering solutions to these problems. Where deployments of real-time communications over the data networks traverse shared networks (for example the public Internet), enterprises need to be assured that no compromise to their data security is being made. Current solutions to these problems require the outside or external IP address(es) of an enterprise to become public to anyone with whom that enterprise wishes to communicate (voice communications usually includes everyone). The invention presented herein does not suffer this shortfall as an enterprise's external IP address(es) need only be known to the ‘trusted’ service provider which is how the public Internet has largely evolved.
NAT has been introduced to solve the ‘shortage of addresses’ problem. Any endpoint or ‘host’ in an IP network has an ‘IP address’ to identify that endpoint so that data packets can be correctly sent or routed to it and packets received from it can be identified from where they originate. At the time of defining the IP address field no-one predicted the massive growth in desktop equipment. After a number of years of global IP deployment, it was realized that the number of endpoints wanting to communicate using the IP protocol would exceed the number of unique IP addresses possible from the address field. To increase the address field and make more addresses available requires the entire IP infrastructure to be upgraded. (The industry is planning to do this with IP Version 6 at some point).
The solution of the day is now referred to as NAT. The first NAT solution, which is referred to as simple NAT in IETF RFC 1631, uses a one-to-one mapping. It came about before the World-Wide Web existed and when only a few hosts (e.g. email server, file transfer server) within an organization needed to communicate externally to that organization. NAT allows an enterprise to create a private IP network where each endpoint within that enterprise has an address that is unique only within the enterprise but is not globally unique. These are private IP addresses. This allows each host within an organization to communicate with (i.e. address) any other host within the organization. For external communication, a public or globally unique IP address is needed. At the edge of the private IP network is a device that is responsible for translating a private IP address to/from a public IP address. This is the NAT function. The enterprise will have one or more public addresses belonging exclusively to the enterprise but in general fewer public addresses than hosts are needed either because only a few hosts need to communicate externally or because the number of simultaneous external communications is smaller. A more sophisticated embodiment of NAT has a pool of public IP addresses that are assigned dynamically on a first come first served basis for hosts needing to communicate externally. Fixed network address rules are required in the case where external equipment needs to send unsolicited packets to specific internal equipment.
Today, most private networks use private IP addresses from the 10.x.x.x address range. External communications are usually via a service provider that offers a service via a managed or shared IP network or via the public Internet. At the boundaries between the public and private networks NAT is applied to change addresses to be unique within the IP network the packets are traversing. Simple NAT changes the complete IP address on a one-to-one mapping that may be permanent or dynamically created for the life of the communication session.
Web Servers, Mail Servers and External servers are examples of hosts that would need a static one-to-one NAT mapping to allow external communications to reach them.
A consequence of NAT is that the private IP address of a host is not visible externally. This adds a level of security.
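By way of illustration only, the simple NAT behaviour described above (fixed one-to-one rules for hosts that must receive unsolicited packets, plus a pool of public addresses assigned on a first come, first served basis) may be sketched as follows. The class name, method names and all addresses are hypothetical and are not taken from RFC 1631 or from any particular product.

```python
class SimpleNat:
    """One-to-one address translation with static rules and a dynamic pool."""

    def __init__(self, public_pool, static_rules=None):
        self.free = list(public_pool)              # unassigned public addresses
        self.out = dict(static_rules or {})        # private -> public
        self.inn = {pub: priv for priv, pub in self.out.items()}  # public -> private

    def outbound(self, private_ip):
        """Return the public address for a private host, assigning one if needed."""
        if private_ip not in self.out:
            if not self.free:
                raise RuntimeError("public address pool exhausted")
            public_ip = self.free.pop(0)           # first come, first served
            self.out[private_ip] = public_ip
            self.inn[public_ip] = private_ip
        return self.out[private_ip]

    def inbound(self, public_ip):
        """Return the private destination, or None when no mapping exists."""
        return self.inn.get(public_ip)


# Hypothetical deployment: one static rule for a mail server, two pooled addresses.
nat = SimpleNat(
    public_pool=["192.3.4.10", "192.3.4.11"],
    static_rules={"10.6.7.8": "192.3.4.5"},
)
```

In this sketch an unsolicited inbound packet reaches only the host with the static rule; any other host becomes reachable only after it has itself communicated outward and been given an address from the pool.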
An extension to simple NAT additionally uses ports for the translation mapping and is often referred to as NAPT (Network Address Port Translation) or PAT (Port Address Translation). A port identifies one end of a point-to-point transport connection between 2 hosts. With mass access to the World-Wide-Web (WWW), the shortage of public IP addresses was again reached because now many desktop machines needed to communicate outside of the private network. The solution, as specified in IETF RFC 1631, allows a many-to-one mapping of private IP addresses to public IP address(es) and instead uses a unique port assignment (theoretically there are 64 k unique ports on each IP address) on the public IP address for each connection made from a private device out into the public or shared network. Because of growth of the Internet, PAT is the common method of address translation.
A peculiarity of PAT is that the private IP address/port mapping to public IP address/port assignments are made dynamically, typically each time a private device makes an outbound connection to the public network. The consequence of PAT is that data cannot travel inbound, that is from the public network to the private network, unless a previous outbound connection has caused such a PAT assignment to exist. Typically, PAT devices do not make the PAT assignments permanent. After a specified ‘silence’ period has expired, that is when no more inbound data has been received for that outbound initiated connection, the PAT assignment for that connection is unassigned and the port is free to be assigned to a new connection.
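The PAT behaviour just described may be illustrated by the following minimal sketch. It assumes a single public address, a small hypothetical port range, and a ‘silence’ timer that is refreshed by traffic in either direction; the timer policy of a real PAT device may differ. All names and values are illustrative only.

```python
from collections import deque


class Pat:
    """Many-to-one translation: each outbound connection gets a public port."""

    def __init__(self, public_ip, idle_timeout=60.0, first_port=40000, port_count=4):
        self.public_ip = public_ip
        self.idle_timeout = idle_timeout           # the 'silence' period, in seconds
        self.free_ports = deque(range(first_port, first_port + port_count))
        self.by_private = {}                       # (ip, port) -> public port
        self.by_public = {}                        # public port -> [(ip, port), last_seen]

    def expire(self, now):
        """Release assignments that have been silent for longer than the timeout."""
        for port, (key, last_seen) in list(self.by_public.items()):
            if now - last_seen > self.idle_timeout:
                del self.by_public[port]
                del self.by_private[key]
                self.free_ports.append(port)       # port may be assigned again

    def outbound(self, private_ip, private_port, now):
        """Translate an outbound packet, creating an assignment if none exists."""
        self.expire(now)
        key = (private_ip, private_port)
        port = self.by_private.get(key)
        if port is None:
            if not self.free_ports:
                raise RuntimeError("no public ports available")
            port = self.free_ports.popleft()
            self.by_private[key] = port
            self.by_public[port] = [key, now]
        else:
            self.by_public[port][1] = now
        return (self.public_ip, port)

    def inbound(self, public_port, now):
        """Translate an inbound packet, or return None if it must be dropped."""
        self.expire(now)
        entry = self.by_public.get(public_port)
        if entry is None:
            return None                            # no prior outbound connection
        entry[1] = now
        return entry[0]
```

For example, an inbound packet to port 40000 is dropped until an inside host has made an outbound connection that was assigned that port, and is dropped again once the assignment has been silent for longer than `idle_timeout`.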
While computers and networks connected via a common IP protocol made communications easier, the common protocol also made breaches in privacy and security much easier too. With relatively little computing skill it became possible to access private or confidential data and files and also to corrupt that business information maliciously. The industry's solution to such attacks is to deploy ‘firewalls’ at the boundaries of private networks.
Firewalls are designed to restrict or ‘filter’ the type of IP traffic that may pass between the private and public IP networks. Firewalls can apply restrictions through rules at several levels. Restrictions may be applied at the IP address, the Port, the IP transport protocol (TCP or UDP for example) or the application. Restrictions are not symmetrical. Typically a firewall will be programmed to allow more communications from the private network (inside the firewall) to the public network (outside the firewall) than in the other direction.
It is difficult to apply firewall rules just to IP addresses. Any inside host (i.e. your PC) may want to connect to any outside host (a web server) dotted around the globe. To allow further control the concept of a ‘well known port’ is applied to the problem. A port identifies one end of a point-to-point transport connection between 2 hosts. A ‘well known port’ is a port that carries one ‘known’ type of traffic. IANA, the Internet Assigned Numbers Authority, specifies the well known ports and the type of traffic carried over them. For example, port 80 has been assigned for web surfing (HTTP protocol) traffic, port 25 for Simple Mail Transfer Protocol (SMTP), etc.
An example of a firewall filtering rule for Web Surfing would be:
Any inside IP address/any port number may connect to any outside IP address/Port 80 using TCP (Transmission Control Protocol) and HTTP (the application protocol for Web Surfing).
The connection is bi-directional so traffic may flow back from the Web Server on the same path. The point is that the connection has to be initiated from the inside.
An example of a firewall filtering rule for email may be:
Any outside IP address/any port number may connect to IP address 192.3.4.5/port 25 using TCP and SMTP.
(Coincidentally, the NAT function may change the destination IP address 192.3.4.5 to 10.6.7.8 which is the inside address of the mail server.)
Filtering rules such as “any inside IP address/any port number may connect to any outside IP address/any port number for TCP or UDP and vice versa” are tantamount to removing the firewall and using a direct connection as it is too broad a filter. Such rules are frowned upon by IT managers.
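The two example filtering rules above may be expressed, purely as an illustrative sketch, in the following form. The rule representation and function name are hypothetical. Only the direction of connection initiation, the transport protocol and the destination address and port are checked; inspection of the application protocol (HTTP, SMTP) and tracking of return traffic on an established connection are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    direction: str             # "out": initiated inside; "in": initiated outside
    protocol: str              # "TCP" or "UDP"
    dst_ip: Optional[str]      # None matches any destination address
    dst_port: Optional[int]    # None matches any destination port


RULES = [
    Rule("out", "TCP", None, 80),           # web surfing to any outside server
    Rule("in", "TCP", "192.3.4.5", 25),     # email to the mail server only
]


def allowed(direction, protocol, dst_ip, dst_port, rules=RULES):
    """Return True if some rule permits this new connection; default is deny."""
    for rule in rules:
        if (rule.direction == direction
                and rule.protocol == protocol
                and rule.dst_ip in (None, dst_ip)
                and rule.dst_port in (None, dst_port)):
            return True
    return False
```

With these rules, `allowed("out", "TCP", "203.0.113.7", 80)` is permitted, whereas a connection from outside to any port other than port 25 on 192.3.4.5 is refused. A rule such as `Rule("out", "UDP", None, None)` paired with its inbound counterpart would correspond to the overly broad filter criticised above.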
H.323 has been designed to be independent of the underlying network and transport protocols. Nevertheless, implementation of H.323 in an IP network is possible with the following mapping of the main concepts:
H.323 address: IP address
H.323 logical channel: TCP/UDP Port connection
In the implementation of H.323 over IP, H.323 protocol messages are sent as the payload in IP packets using either TCP or UDP transport protocols. Many of the H.323 messages contain the H.323 address of the originating endpoint or the destination endpoint or both endpoints. Other signaling protocols such as SIP also embed IP addresses within the signaling protocol payload.
However, a problem arises in that NAT functions will change the apparent IP addresses (and ports) of the source and destination hosts without changing the H.323 addresses in the H.323 payload. As the hosts use the H.323 addresses and ports exchanged in the H.323 payload to associate the various received data packets with the call, this causes the H.323 protocol to break and requires intermediary intelligence to manipulate H.323 payload addresses.
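The mismatch described above may be illustrated with the following simplified sketch. Packets are modelled as plain dictionaries rather than as real H.323 messages, and the field names, call identifier and addresses are all hypothetical. The point shown is only that the NAT function rewrites the header source while the address carried in the payload is left untouched, so the far end can no longer associate the received packets with the call.

```python
def nat_rewrite_source(packet, public_src):
    """Rewrite only the header source; the payload passes through unchanged."""
    return {**packet, "src": public_src}


def associate(expected_media_addrs, packet):
    """Return the call whose advertised address matches the packet source."""
    return expected_media_addrs.get(packet["src"])


PRIVATE_ADDR = ("10.6.7.8", 5004)      # hypothetical inside address and port
PUBLIC_ADDR = ("192.3.4.5", 40000)     # hypothetical address and port after NAT

# Signaling payload in which the caller advertises where its media comes from.
signaling_payload = {"call_id": "call-1", "media_addr": PRIVATE_ADDR}

# The far end records the advertised address exactly as it appears in the payload.
expected = {signaling_payload["media_addr"]: signaling_payload["call_id"]}

media_packet = {"src": PRIVATE_ADDR, "dst": ("203.0.113.7", 5004), "data": b"audio"}
translated = nat_rewrite_source(media_packet, PUBLIC_ADDR)

direct_result = associate(expected, media_packet)   # no NAT in the path
nat_result = associate(expected, translated)        # NAT in the path
```

Here `direct_result` identifies the call, while `nat_result` is `None` because the translated source no longer matches the address advertised in the payload.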
Because of the complexity of multimedia communications, H.323 requires several logical channels to be opened between the endpoints. Logical channels are needed for call control, capabilities exchange, audio, video and data. In a simple point-to-point H.323 multimedia session involving just audio and video, at least 6 logical channels are needed. In the IP implementation of H.323, logical channels are mapped to TCP or UDP port connections, many of which are assigned dynamically.
As the firewall functions filter out traffic on ports that they have no rules for, either the firewall is opened, which defeats the purpose of the firewall, or much of the H.323 traffic will not pass through.
Therefore, both NAT and firewall functions between endpoints prevent H.323 (and other real-time protocols, SIP and MGCP for example) communications working. This will typically be the case when the endpoints are in different private networks, when one endpoint is in a private network and the other endpoint is in the Internet or when the endpoints are in different managed IP networks.
H.323 (and SIP, MGCP etc.) communication is therefore anathema to firewalls. Either a firewall must become H.323 aware or some intermediary intelligence must manipulate the port assignments in a secure manner.
One possible solution to this problem would be a complete IP H.323 infrastructure upgrade. This requires:
H.323 upgrade to the NAT function at each IP network boundary. The NAT function must scan all H.323 payloads and consistently change IP addresses.
H.323 upgrade to the firewall function at each IP network boundary. The firewall must understand and watch all H.323 communication so that it can open up the ports that are dynamically assigned and must filter all non-H.323 traffic on those ports.
Deployment of H.323 intelligence at the boundary or in the shared IP network to resolve and arbitrate addresses. IP addresses are rarely used directly by users. In practice, IP address aliases are used. Intelligence is needed to resolve aliases to an IP address. This H.323 function is contained within H.323 entities called Gatekeepers.
The disadvantages of this possible solution are:
Each organization/private network must have the same level of upgrade for H.323 communication to exist.
The upgrade is costly. New functionality or new equipment must be purchased, planned and deployed. IT managers must learn about H.323.
The scale of such a deployment will likely not be readily adaptable to the demands placed on it as the technology is progressively adopted, requiring a larger and more costly initial deployment than initial (perhaps experimental) demand requires.
The continual parsing of H.323 packets to resolve the simple NAT and firewall function places a latency burden on the signal at each network boundary. The latency tolerance for audio and video is very small.
Because there are a multitude of standards for real-time communication and each of the signaling protocols of those standards are different, an enterprise would need multiple upgrades—one for each protocol it wishes to use.
The media is expected to travel directly between enterprises or between an enterprise and a device in the public network. The consequence of this is that the IP addresses of an enterprise become public knowledge. This is regarded as a security compromise as any potential attacker must first discover the enterprise's IP address as the first step to launching an attack.
As a result of these problems, the H.323 protocol is not being used for multimedia communications when there is a firewall and/or network address translation (NAT). One approach has been to place H.323 systems on the public side of the firewall and NAT functions. This allows them to use H.323 while also allowing them to protect the remainder of their network. The disadvantages of this are:
1. The most ubiquitous device for video communications is the desktop PC. It is nonsensical to place all desktop computers on the public side.
2. The H.323 systems are not protected from attackers on the public side of the firewall.
3. The companies are not able to take advantage of the potentially ubiquitous nature of H.323, since only the special systems will be allowed to conduct H.323 communications.
4. The companies will not be able to take full advantage of the data-sharing facilities in H.323 because the firewall will prevent the H.323 systems from accessing the data. Opening the firewall to allow data-transfer functions from the H.323 system is not an option because it would allow an attacker to use the H.323 system as a relay.
5. In the emerging Voice over IP (VoIP) market there is a market for telephony devices that connect directly to the data network, for example Ethernet telephones and IP PBXes. By virtue of their desktop nature they are typically deployed on the private network behind firewalls and NAT. Without solutions to the problems described above, telephony using these devices is confined to the enterprise's private network or Intranet or must pass through IP-PSTN gateways to reach the outside world.
Thus, what is desired, as recognized by the present inventors, is an ability to allow endpoints (using a real-time protocol, for example H.323, SIP or MGCP) located in different secure and private IP data networks to be able to communicate with each other without compromising the data privacy and data security of the individual private networks.