A. Field of the Invention
The principles of the invention relate generally to packet-based telephony services, and more particularly, to handling of transmission of multimedia related messages across network protection devices, such as firewalls.
B. Description of Related Art
With the increasing ubiquity of the Internet and Internet availability, there has been an increasing desire to leverage its robust and inexpensive architecture for voice telephony services, commonly referred to as voice over IP (Internet Protocol), or VoIP. Toward this end, standards for Internet telephony have been promulgated by both the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), in the form of H.323 rev 5 (2003), “Packet based multimedia communications systems,” and the Internet Engineering Task Force (IETF), in the form of RFC 3261 (2002), “Session Initiation Protocol (SIP),” to enable set-up and teardown of media sessions.
Under each of these standards, a call invitation message is initially routed between a calling party and a proxy server or H.323 gatekeeper (collectively, “proxy server”). The proxy server performs name resolution, call processing, number lookup, routing, and any other required processing of the call invitation message. The call invitation message also typically includes a session description portion that contains information about the media that the caller wishes to use for the session. The proxy server then forwards the call invitation message to the called party (sometimes via redirect servers or other intermediary entities). In response to the received invitation message, a response message having a similar session description portion may be returned to the calling party via the proxy server. When the calling party receives the response message, it forwards an acknowledgement message to the called party. This completes call setup and enables subsequent exchange of real-time media directly between the calling and called parties, or their agents (e.g., firewalls, etc.).
All of the messages exchanged are typically in the form of a packet of data having both header and payload information. While most forms of signaling information are contained in packet headers, information relating to the media being exchanged between the parties is typically contained within the payload portion. In addition, addressing information, such as Internet Protocol (IP) addresses, Uniform Resource Locators (URLs), Uniform Resource Identifiers (URIs), or UDP addresses, for both the calling and called parties may be contained in both the header and the payload. The existence of addressing information in packet payloads has caused difficulties with respect to both firewall and network address translation (NAT) implementation.
In most modern network environments, firewalls constitute the main protection mechanism for keeping unwanted traffic away from a private network. In general, a firewall is positioned between the private network and the public network such that all traffic passing between the two networks first passes through the firewall. The traffic may then be subjected to various filtering policies which identify the types and sources/destinations of traffic permitted to flow based upon information contained within the packet headers. One example filtering policy may be to permit all outgoing traffic (e.g., to any destination address) from IP address 134.138.29.17 (the source address) on port 8080 (the source port). Conversely, incoming traffic to 134.138.29.17 on port 8080 may not be permitted unless initially requested by 134.138.29.17. By enabling the enforcement of these various policies, only known and identifiable types of network traffic may be allowed to enter or exit the private network, thereby providing security to the network.
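By way of illustration only, the example filtering policy described above may be sketched as follows. The `Packet` and `Firewall` names, the session-tracking set, and the remote address 198.51.100.7 are assumptions introduced for this sketch and do not describe any particular firewall product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    """Header fields only; the payload is irrelevant to this policy."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class Firewall:
    # Hypothetical policy: private (ip, port) pairs allowed to send to any destination.
    allowed_outgoing: set = field(default_factory=set)
    # Flows opened by permitted outgoing traffic (assumed bookkeeping).
    sessions: set = field(default_factory=set)

    def permit_outgoing(self, pkt: Packet) -> bool:
        """Permit traffic from an allowed source and remember the flow."""
        if (pkt.src_ip, pkt.src_port) not in self.allowed_outgoing:
            return False
        self.sessions.add((pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port))
        return True

    def permit_incoming(self, pkt: Packet) -> bool:
        """Permit incoming traffic only if it answers a flow opened from inside."""
        return (pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port) in self.sessions
```

In this sketch, a packet arriving for 134.138.29.17 on port 8080 is refused until that address has first sent traffic to the same remote endpoint, which mirrors the "unless initially requested" condition of the example policy.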
Unfortunately, it is the rigorous and strict nature of most existing firewalls that typically prevents successful establishment of VoIP sessions. Addressing information relating to the media exchange between parties is typically contained within the session description portion of a VoIP packet's payload. For example, in a SIP session, the addresses and related port(s) on which media is expected are included within the Session Description Protocol (SDP) information found in the message's payload. This information is dynamically assigned upon generation of each message and cannot be adequately predicted by the firewall. Accordingly, when media from either party is received at the firewall, its passage is denied because no enabling policy is identified. The alternative to blanket denial is to leave a wide range of ports unprotected to facilitate passage of the media. Clearly, this is untenable from a security standpoint. To remedy this issue, intelligent Application Level Gateways (ALGs) may be implemented on the firewall which identify VoIP messages as they are received at the firewall. The VoIP messages are then parsed for information contained within their headers and payloads.
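By way of illustration only, the payload parsing performed by such an ALG may be sketched as follows for a SIP message carrying SDP. The function name, the sample message, and the addresses shown are assumptions made for this sketch. Only the SDP "c=" (connection) and "m=" (media) lines are read, and a real ALG would handle many more cases.

```python
# Hypothetical SIP INVITE with an SDP body; addresses are documentation examples.
EXAMPLE_INVITE = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.0.2.10:5060",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
])


def extract_media_endpoints(sip_message):
    """Return (address, port) pairs on which the sender expects media."""
    # The payload follows the first blank line after the headers.
    _, _, body = sip_message.replace("\r\n", "\n").partition("\n\n")
    session_addr = None   # address from a session-level "c=" line
    media = []            # [address, port] per "m=" line
    for line in body.splitlines():
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            fields = line[2:].split()
            if len(fields) != 3:
                continue
            addr = fields[2].split("/")[0]
            if media:
                media[-1][0] = addr   # media-level line overrides the default
            else:
                session_addr = addr
        elif line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fields = line[2:].split()
            if len(fields) < 2:
                continue
            port_text = fields[1].split("/")[0]
            if port_text.isdigit():
                media.append([session_addr, int(port_text)])
    return [(addr, port) for addr, port in media if addr is not None]
```

The pairs returned are the dynamically assigned endpoints that a header-only policy cannot predict; an ALG could use them to permit the corresponding media flow for the duration of the session.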
In addition to problems posed by the restrictive nature of firewalls alone, many firewalls also implement NAT. Generally speaking, NAT is a technology for enabling multiple devices on a private local area network (LAN) having private IP addresses to share a single public IP address or a pre-defined group of public IP addresses. Because the private IP addresses maintained by the devices are not routable from outside of the LAN, the NAT must perform translation between the private and public IP addresses at the point where the LAN connects to the public network.
In operation, when a device on the LAN wishes to initiate a connection with a device outside of the LAN, the device will send all traffic to the NAT first. The NAT examines the header of each outgoing packet, replaces the source or return address contained therein, which is the device's private address, with its own public address, and records the translation in a NAT table before passing the traffic to its destination on the Internet. When a response is received, the NAT queries the NAT table, identifies the proper recipient, and passes the response to that device.
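By way of illustration only, the translation and table lookup described above may be sketched as follows. The `Nat` class, its method names, and the use of a distinct public port per private endpoint (port translation) are assumptions made for this sketch; the description above requires only that the private source address be replaced and that responses be matched against the NAT table.

```python
class Nat:
    """Minimal NAT table sketch: one public address, one public port per private endpoint."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port   # assumed starting point for public ports
        self.out_map = {}             # (private_ip, private_port) -> public_port
        self.in_map = {}              # public_port -> (private_ip, private_port)

    def translate_outgoing(self, src_ip, src_port):
        """Replace the private source with the public one, recording the mapping."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out_map[key]

    def translate_incoming(self, dst_port):
        """Look up the private recipient for a response; None if no entry exists."""
        return self.in_map.get(dst_port)
```

A response addressed to a public port with no table entry yields no recipient in this sketch, which corresponds to the handling of traffic that was never requested from inside the LAN.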
Unfortunately, existing NAT techniques fail to address issues surrounding unsolicited incoming packets, such as those associated with incoming VoIP calls. Additionally, because addressing information for VoIP traffic may be contained within the payload as well as the header of packets, existing NATs may fail to accurately translate all outgoing or incoming traffic, resulting in dropped or failed connections.
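By way of illustration only, the payload problem may be sketched as follows. The dictionary-based packet, the function names, and the addresses are assumptions made for this sketch; the substring check is a simplification and is not a reliable address matcher.

```python
def header_only_nat(packet, private_ip, public_ip):
    """Rewrite only the header source address, leaving the payload untouched."""
    translated = dict(packet)
    if translated["src_ip"] == private_ip:
        translated["src_ip"] = public_ip
    return translated


def payload_still_private(packet, private_ip):
    """Report whether the payload still advertises the private address."""
    return private_ip in packet["payload"]


# Hypothetical outgoing packet whose SDP payload names the private address.
outgoing = {
    "src_ip": "10.0.0.2",
    "payload": "c=IN IP4 10.0.0.2\nm=audio 49170 RTP/AVP 0",
}
translated = header_only_nat(outgoing, "10.0.0.2", "203.0.113.5")
```

After translation in this sketch, the header carries the public address while the session description still names 10.0.0.2, so the remote party would attempt to send media to an address that is not routable from outside the LAN.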