The Internet may be used for many forms of communication, including voice conversations, video conferencing, development collaboration, and the like. In order for manufacturers' programs, applications, equipment, and systems to be interoperable with each other, many protocols have been developed to standardize the communication between such systems. These protocols have grown increasingly complex to handle all the types of traffic generated to facilitate communication for video conferencing, voice over Internet Protocol (VoIP), and data over Internet Protocol applications. Two such protocols are H.323 from the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T) and the Session Initiation Protocol (SIP) from the Internet Engineering Task Force (IETF). Both H.323 and SIP, as well as Skype, Inter-Asterisk eXchange (IAX), and many other similar protocols, typically allow for multimedia communication including voice, video, and data communications in real time.
H.323, SIP, VoIP, and the like are defined as application layer protocols of the Open Systems Interconnection (OSI) seven-layer model. The layers of the OSI model include, from bottom to top, the physical, data link, network, transport, session, presentation, and application layers. Application layer protocols facilitate communication between software applications on devices, providing a high level of abstraction from the details of sending information across a network, which are handled at the lower layers of the OSI model. Examples of some commonly used application layer protocols include HyperText Transfer Protocol (HTTP), HTTP Secure (HTTPS), Simple Mail Transfer Protocol (SMTP), File Transfer Protocol (FTP), TELNET, Post Office Protocol version 3 (POP3), and Internet Message Access Protocol (IMAP).
In Internet Protocol (IP) communication networks, devices or endpoints on the network are usually identified by their respective IP addresses. Applications and programs on the different devices further identify each other using port numbers. A port number is a sixteen-bit integer, the value of which falls into one of three ranges: the well-known ports, ranging from 0 through 1023; the registered ports, ranging from 1024 through 49151; and the dynamic and/or private ports, ranging from 49152 through 65535. The well-known ports are reserved for assignment by the Internet Corporation for Assigned Names and Numbers (ICANN) for use by applications that communicate using the Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) and generally can only be used by a system/root process or by a program run by a privileged user. The registered ports may be registered by companies or other individuals for use by applications that communicate using TCP or UDP. The dynamic or private ports, by definition, can be neither officially registered nor assigned. Both the H.323 and SIP standards, as well as many other such communication protocols, use multiple well-known, registered, and/or dynamic ports in order to facilitate such communication.
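The three port ranges described above can be expressed as a short lookup. The following is a minimal Python sketch for illustration only; the function name classify_port and the returned labels are hypothetical and do not come from any standard or library.

```python
def classify_port(port: int) -> str:
    """Return the name of the range that a sixteen-bit port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port {port} does not fit in sixteen bits")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


if __name__ == "__main__":
    # 80 is the usual HTTP port, 1720 is commonly used for H.323 call
    # signaling, 5060 for SIP, and 49152 is the first dynamic port.
    for port in (80, 1720, 5060, 49152):
        print(port, classify_port(port))
```

Under this sketch, the signaling ports commonly associated with H.323 and SIP already fall outside the well-known range, before any dynamically chosen media ports are considered.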
H.323 and SIP each rely on multiple other protocols, some of which may in turn rely on UDP for sending and receiving multimedia traffic. UDP features minimal overhead compared to other transport protocols (most notably TCP) at the expense of reliability. UDP guarantees neither packet delivery nor data integrity. However, its low overhead allows UDP to offer very high throughput, making it well suited for real-time multimedia communications.
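The connectionless nature of UDP can be seen in a small loopback experiment. The sketch below, written in Python with the standard socket module, sends one datagram without any handshake or acknowledgment; the function name udp_loopback_roundtrip and the 2048-byte receive buffer are illustrative assumptions. On the loopback interface the datagram normally arrives, but nothing in UDP itself would retransmit it if it were lost, which is why the receiver uses a timeout.

```python
import socket


def udp_loopback_roundtrip(payload: bytes, timeout: float = 1.0) -> bytes:
    """Send one UDP datagram to a local socket and return what was received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the operating system to pick a free port.
        receiver.bind(("127.0.0.1", 0))
        # Without a timeout, a lost datagram would block the receiver forever.
        receiver.settimeout(timeout)
        # No connection setup: the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(udp_loopback_roundtrip(b"media-frame-1"))
```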
Multimedia communications traffic will most likely encounter a firewall at some point during transmission, especially over the Internet, regardless of the protocol to which the traffic conforms. Firewalls are used in modern networks to screen out unwanted or malicious traffic. One of many techniques a firewall may use is packet filtering, wherein the firewall determines whether or not to allow individual packets by analyzing information in the packet header (such as the IP address and port of the source and destination). Thus, various ports or IP addresses may be blocked to minimize the risk of allowing malicious traffic into an important computer network or system. Another, more advanced technique is stateful inspection, wherein, in addition to analyzing header information, a firewall keeps track of the status of any connection opened by network devices behind the firewall. In stateful inspection, the decision whether or not to drop a packet is based on the tracked status of the connection and on information from within the packet header. In practice, firewalls (especially those used by large corporations) generally only allow traffic from the well-known ports, though such firewalls may be specially configured to allow traffic on any port. For multimedia communication systems that use multiple registered and dynamic ports, firewalls (unless specially configured) will generally block the data traffic on these ports between multimedia systems, thus preventing communication.
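The difference between packet filtering and stateful inspection can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any particular firewall product: the class names Packet and StatefulFirewall, the default allowed ports (80 and 443), and the rule that inbound traffic is accepted only as a reply to a tracked connection are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Flow = Tuple[str, int, str, int]  # (source IP, source port, dest IP, dest port)


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class StatefulFirewall:
    allowed_dst_ports: Set[int] = field(default_factory=lambda: {80, 443})
    blocked_ips: Set[str] = field(default_factory=set)
    open_flows: Set[Flow] = field(default_factory=set)

    def header_allows(self, pkt: Packet) -> bool:
        """Packet filtering: decide from header fields alone."""
        if pkt.src_ip in self.blocked_ips or pkt.dst_ip in self.blocked_ips:
            return False
        return pkt.dst_port in self.allowed_dst_ports

    def outbound(self, pkt: Packet) -> bool:
        """Filter a packet leaving the network and track the connection it opens."""
        if not self.header_allows(pkt):
            return False
        self.open_flows.add((pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port))
        return True

    def inbound(self, pkt: Packet) -> bool:
        """Stateful inspection: accept only replies to a tracked connection."""
        if pkt.src_ip in self.blocked_ips:
            return False
        reverse_flow = (pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port)
        return reverse_flow in self.open_flows


if __name__ == "__main__":
    fw = StatefulFirewall()
    web = Packet("10.0.0.5", 50000, "203.0.113.7", 443)
    media = Packet("10.0.0.5", 50002, "203.0.113.7", 50004)
    print(fw.outbound(web))    # allowed destination port
    print(fw.outbound(media))  # dynamic destination port
```

In this sketch, the media packet addressed to a dynamic port is rejected unless the port is explicitly added to allowed_dst_ports, which mirrors the manual reconfiguration burden discussed in the text.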
Video conferencing endpoints generally use multiple dynamic ports for the transmission of communication data packets and, as such, each port used necessitates opening that port on a firewall. Additionally, different endpoints participating in different conversations use different sets of ports, further increasing the number of ports to be opened on a firewall. Reconfiguring ports on a firewall is a time-consuming task that introduces the risk of human error, which may defeat the purpose of the firewall by leaving a network vulnerable to malicious attacks. Furthermore, even though these dynamic ports should be closed after the communication ends, in practice, once a firewall port is open, it remains open because firewall technicians typically do not expend the additional time and resources to close the ports.
In addition to firewalls, most large networks also typically deploy proxy servers that are used for many reasons, including reducing the number of IP addresses that a computer network exposes to external networks and/or the Internet and monitoring the traffic sent between internal and external networks. In order to connect to an external resource or the Internet, an internal computer of a network often connects to a proxy server; the proxy server then connects to the external resource, and data sent between the internal computer and the external resource is sent through the proxy. As such, the external resource is aware of only the proxy server's IP address and not the internal computer's IP address. Because the internal computer connects to the proxy server in lieu of directly connecting to the external resource, proxy servers may also be used to monitor and in some cases prevent the flow of traffic between internal computers and external resources. As an example, a computer attempting to connect to an unauthorized external resource (e.g., an illicit website, an external computer that is not secure, or the like) may be refused a connection by the proxy server, or a connection may be terminated if the traffic between the internal computer and external resource contains unauthorized data or does not conform to an authorized protocol.
The proxy server may also be used to authenticate the communication from the endpoints within the internal network. If an unknown endpoint attempts to obtain access to the external network through the proxy server, the proxy server will prevent that access if the endpoint fails the authentication. Many different authentication protocols are typically used by various proxy servers. Examples of such authentication protocols are HTTP Basic authentication (which transmits base64-encoded credentials), Microsoft Corporation's NT LAN MANAGER™ (NTLM™), INTEGRATED WINDOWS AUTHENTICATION™ (IWA™), and the like.
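As an illustration of the simplest of these schemes, base64-encoded credentials are typically carried to an HTTP proxy in a Proxy-Authorization header using the Basic scheme. The Python sketch below builds and decodes such a header; the function names and the sample credentials are hypothetical. Note that base64 is an encoding rather than encryption, so the credentials can be recovered by anyone who sees the header.

```python
import base64
from typing import Tuple


def basic_proxy_authorization(username: str, password: str) -> str:
    """Build a Proxy-Authorization header line using the Basic scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Proxy-Authorization: Basic {token}"


def decode_basic_credentials(header_line: str) -> Tuple[str, str]:
    """Recover (username, password) from a Basic Proxy-Authorization line."""
    name, _, value = header_line.partition(": ")
    scheme, _, token = value.partition(" ")
    if name.lower() != "proxy-authorization" or scheme.lower() != "basic":
        raise ValueError("not a Basic Proxy-Authorization header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_proxy_authorization("user", "pass")  # sample credentials
    print(header)
    print(decode_basic_credentials(header))
```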
Each application layer protocol has its own specification for the proper exchange of messages that allow access to resources between network devices or endpoints. Network devices and endpoints conform to this specification both to properly interoperate with each other and to successfully traverse a proxy server. Protocol violations by network devices or endpoints may be treated by a proxy server as a security threat and thus run the risk of the proxy server terminating a connection, thwarting communication between network devices. As an example, a proxy server may be monitoring a connection wherein a malicious website sends malformed HTTP traffic in an attempt to exploit a weakness of an internal computer. Upon finding such malformed HTTP traffic, the proxy server may terminate the connection to stop the attack on the internal computer.
Many proxy servers are HTTP proxy servers that provide Internet connections for computers within a network. Typically, proxy servers operate in a standard mode that may only allow traffic conforming to commonly used protocols (e.g., HTTP, FTP, and the like). Multimedia protocols, such as H.323, SIP, VoIP, and the like, which are themselves not commonly used protocols and which also rely on other protocols that are not commonly used, may be blocked or simply ignored as incompatible data streams. As an example, in the case where an H.323 endpoint is behind an HTTP proxy server, the H.323 connection requests may be denied or ignored while another device's HTTP connection requests are allowed.
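One way to picture this behavior is a proxy that inspects the first bytes of a connection and proceeds only if they form an HTTP request line. The Python sketch below is a deliberately simplified assumption about how such a check might work, not the logic of any particular proxy server; the function name looks_like_http_request is hypothetical, and the sample byte strings are illustrative stand-ins for SIP and H.323 traffic.

```python
import re

# Request methods a standard HTTP proxy would commonly recognize.
HTTP_METHODS = {
    "GET", "HEAD", "POST", "PUT", "DELETE",
    "CONNECT", "OPTIONS", "TRACE", "PATCH",
}

# "<METHOD> <target> HTTP/1.0" or "HTTP/1.1", terminated by CRLF.
REQUEST_LINE = re.compile(rb"^([A-Z]+) (\S+) HTTP/1\.[01]\r\n")


def looks_like_http_request(first_bytes: bytes) -> bool:
    """Return True if the data begins with a well-formed HTTP/1.x request line."""
    match = REQUEST_LINE.match(first_bytes)
    if match is None:
        return False
    return match.group(1).decode("ascii") in HTTP_METHODS


if __name__ == "__main__":
    samples = [
        b"GET http://example.com/ HTTP/1.1\r\nHost: example.com\r\n\r\n",
        b"INVITE sip:bob@example.com SIP/2.0\r\n",  # text-based, but not HTTP
        b"\x03\x00\x00\x1c\x08\x02",                # binary, stand-in for H.323 signaling
    ]
    for sample in samples:
        print(looks_like_http_request(sample))
```

Under this check, only the first sample would be forwarded; the SIP-style and binary samples would be rejected even though neither is malicious, which is the incompatibility the paragraph above describes.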
Existing video conferencing systems, such as TANDBERG's BORDER CONTROLLER™, a component of TANDBERG's EXPRESSWAY™ firewall traversal solution, require the use of TANDBERG Gatekeepers or TANDBERG traversal-enabled endpoints. While allowing firewall traversal, the EXPRESSWAY™ solution still requires either proprietary proxy servers or the reconfiguration of standard proxy servers to trust, allow, or even understand the protocols used. The V2IU™ series of products from Polycom, Inc., are Application Level Gateways (ALGs) that act as protocol-aware firewalls that automate the selection and trusting of ports, but as such, require standard proxy servers either to be reconfigured to trust, allow, or understand the protocols used when sending traffic between endpoints or to be bypassed altogether. Further, such an ALG does not provide for secure communication. The PATHFINDER™ series of products from RadVision, Ltd., provides for firewall traversal via multiplexing to a single port, but still requires standard proxy servers to be either bypassed or reconfigured to trust and allow the traffic sent between endpoints.
Similar systems have been implemented for voice, VoIP, and data over IP communication systems. Each either relies on a proprietary system or equipment or relies on standard proxy servers being reconfigured to trust, allow, or understand the traffic sent between endpoints, which could leave the underlying network vulnerable to malicious electronic attacks.