Real-time communications, including audio and/or video conferencing, can be difficult to implement on communications networks such as packet-based networks, including internet protocol (“IP”) networks, without compromising existing security mechanisms. Currently proposed solutions either require substantial effort, introduce security risks, or are dependent upon specific conferencing platforms.
Communications networks such as computer networks rely on particular protocols to transport data. Packet-based networks, with IP networks being one example, may rely on UDP, TCP, and similar protocols to transport data such as audio or video in a streaming real-time conference. Both UDP and TCP use the concept of ports to perform multiplexing for higher-level protocols and applications. Every packet carries two 16-bit identifiers indicating the sending process (source port) and the receiving process (destination port).
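As an illustration of the two 16-bit port identifiers described above, the following minimal Python sketch reads them from the first four bytes of a raw UDP or TCP segment. The function name `read_ports` and the sample port numbers are hypothetical and chosen only for illustration.

```python
import struct


def read_ports(segment: bytes) -> tuple[int, int]:
    """Return (source_port, destination_port) from a raw UDP or TCP segment."""
    if len(segment) < 4:
        raise ValueError("segment too short to contain both port fields")
    # Both UDP and TCP headers begin with two 16-bit big-endian port numbers.
    source_port, destination_port = struct.unpack("!HH", segment[:4])
    return source_port, destination_port


# Build a sample UDP datagram: source port 50000, destination port 5004,
# length 12 (8-byte header plus 4-byte payload), checksum 0 (unused here).
sample = struct.pack("!HHHH", 50000, 5004, 12, 0) + b"data"
print(read_ports(sample))  # (50000, 5004)
```

Because each field is 16 bits wide, a port number can range from 0 to 65535.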
Some applications/protocols are assigned well-known or standard port numbers. However, a static port assignment may be impractical for applications such as multi-user conferencing applications due to their dynamic nature. As a result, most conferencing applications dynamically select ports from some predefined range specific to each application. This port allocation scheme can be problematic when a security device such as a firewall is present on the network.
A firewall may provide security by selectively granting or denying access to the private network. A firewall may be a NAT, a proxy, other device(s), and/or a combination of two or more of these. These devices block or restrict unauthorized incoming data and unauthorized incoming requests addressed to devices on a private network. Firewalls isolate devices “behind” them from a public network, and thereby provide security against unsolicited connections. Firewalls can also restrict the way computers inside can access outside public sites, such as those on the Internet. One technique for establishing a firewall is to maintain a list of “authorized” addresses. Address information contained in a data packet from a remote device can be examined to determine whether the originating source is on the list. Only packets coming from authorized addresses are allowed to pass.
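The authorized-address technique described above can be sketched as a simple membership check. This is a minimal illustration only: the list contents (documentation address ranges) and the names `AUTHORIZED` and `is_authorized` are assumptions, and a real firewall would inspect far more than the source address.

```python
from ipaddress import ip_address, ip_network

# Hypothetical list of authorized sources (documentation address ranges).
AUTHORIZED = [
    ip_network("203.0.113.0/24"),   # an entire authorized subnet
    ip_network("198.51.100.7/32"),  # a single authorized host
]


def is_authorized(source: str) -> bool:
    """Return True if the packet's source address is on the authorized list."""
    address = ip_address(source)
    return any(address in network for network in AUTHORIZED)


for source in ("203.0.113.9", "198.51.100.7", "192.0.2.1"):
    verdict = "pass" if is_authorized(source) else "block"
    print(source, verdict)
```

Only the first two sample addresses fall within the list, so the third is blocked.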
In many two-way streaming data applications, with videoconferences being an example, bi-directional communication must be initiated from behind a firewall towards a public network address. Once connections have been established from the inside out, data may then flow in both directions. Methods for initiating and maintaining a videoconference session through a firewall can be complex.
Widely used standard protocols such as H.323 are used to support some exemplary applications. For example, the ITU H.323 standard defines how real-time, bi-directional multimedia communications can be exchanged on packet-based networks. The H.323 protocol uses the User Datagram Protocol (UDP) for the transport of voice and video data. UDP is a connectionless packet-oriented transfer protocol. When a public network transmission uses a connectionless type of protocol such as UDP for voice and video data packets, a security device may block incoming packets from the public network, and may also block outgoing packets from the private network. Accordingly, security devices such as firewalls can make applications such as videoconferences difficult to implement and use.
There are some proposed methods of enabling applications such as videoconferencing to operate through security devices. One proposed method is to configure the security device or firewall to always allow bi-directional communication on all ports associated with the application of interest. While this is relatively simple, statically opening ports significantly decreases the effectiveness of the firewall. Another proposed solution is to use an Application Level Gateway (ALG) or Middlebox Communication (MIDCOM) device. These are both essentially firewall add-ons that dynamically open and close ports based on information from higher-level protocols. These require an ALG or MIDCOM module specific to the particular application to be configured and installed on every firewall to be used. This is time consuming, complicated, and costly.
Another proposed solution is to bypass the firewall through the use of a gateway placed in a demilitarized zone (“DMZ”). A DMZ is a section of network that lies between an internal and external firewall such that nodes within the DMZ can be accessed without restriction. The gateway then connects nodes from outside the external firewall to nodes inside the internal firewall. This adds an extra layer of complexity and increases overhead and security risk. Still another proposed solution is the use of tunneling. This involves encapsulating the conference's stream(s) in some other protocol that is firewall friendly. This adds a sizeable layer of complexity to the application, decreases the effective payload of each data packet, and otherwise decreases performance efficiency. It is also possible for undesirable or dangerous traffic to be encapsulated in this way, masquerading as a different protocol.
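The reduction in effective payload caused by tunneling can be shown with simple arithmetic. The sketch below assumes a 1500-byte maximum packet size, minimum IPv4, UDP, and TCP header sizes without options, and a hypothetical 4-byte tunnel framing header; actual figures depend on the tunneling protocol used.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
UDP_HEADER = 8    # bytes, fixed UDP header size
TCP_HEADER = 20   # bytes, minimum TCP header without options


def effective_payload(max_packet_size: int, *header_sizes: int) -> int:
    """Return the bytes left for media data after all headers are subtracted."""
    remaining = max_packet_size - sum(header_sizes)
    if remaining <= 0:
        raise ValueError("headers leave no room for payload")
    return remaining


MAX_PACKET_SIZE = 1500  # assumed maximum packet size, in bytes
TUNNEL_FRAMING = 4      # hypothetical per-packet tunnel framing header

direct = effective_payload(MAX_PACKET_SIZE, IPV4_HEADER, UDP_HEADER)
tunneled = effective_payload(
    MAX_PACKET_SIZE, IPV4_HEADER, TCP_HEADER, TUNNEL_FRAMING
)
print(direct, tunneled, direct - tunneled)  # 1472 1456 16
```

Under these assumptions each tunneled packet carries 16 fewer bytes of media data than a packet sent directly over UDP.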
Other security devices include network address translators (“NATs”) and proxies. NATs are found on many networks that interface with other networks, including public networks such as the Internet. A NAT may operate in combination with one or more other security devices, and may, for example, be one component of a firewall. NATs provide security from the outside public network by translating internal network addresses on outgoing data packets so that they appear as a different address when viewed from outside the NAT. In addition to providing security, NAT translations can also alleviate problems related to the relatively small address space of IP by effectively sharing a few public IP addresses among many hosts.
NATs commonly perform Network Address Port Translation (NAPT, a.k.a. PAT). This is the translation of a packet's originating client address to a different address that is unique on the public network. This source address data is typically contained in a packet header or external data. A packet sent from a client on the private network may have an originating address including an IP/port pair that is summarized as address=X. A NAT operating between the client and a public network could intercept this packet and replace the external originating address=X with a NAT translated address=Y. The packet would then be communicated into the public network with the external originating address=Y data. As a result, any recipient of the packet on the network will understand that it originated from address=Y. Typically, a NAT translates only the fields in a data packet's external, as opposed to its internal, data. Accordingly, a UDP, TCP, or other protocol packet that included the client originating address in its payload would have its header address information translated by a NAT, but not the payload address data.
NAT translation can make it difficult (and in some cases, impossible) for a host on the public network such as a videoconference server to effectively communicate data such as two-way streaming audio and video data with a client. Because of NAT translation, the server receives data packets from the client with a NAT translated address attached. This can complicate communications for several reasons. For example, some communications sessions may be set up with the server through a request that is not subject to NAT translation, with the result that the server may have conflicting address data for the client.
A NAT may be combined with another security device, such as a proxy or proxy server (the terms “proxy” and “proxy server” are used interchangeably herein). A proxy may specifically act on data packets only of particular protocols, and may act on both incoming and outgoing data packets. A proxy may operate to translate address data therein, among other actions. When present in combination with a NAT, a proxy can further complicate conducting communications such as a two-way streaming data event.