The invention relates generally to the transmission of real-time audio and/or video information, and more particularly to systems and methods for video conferencing.
There are many systems and techniques for transmitting video information, especially real-time video information combined with corresponding audio or other information such as document displays. Some effective conventional techniques involve special transmission and reception systems and require dedicated communication links to encode, transmit, receive and decode video or other information. The encoding, transmission, and decoding operations are generally resource-intensive in terms of the processing requirements (e.g., memory, CPU speed) and transmission requirements (e.g., communication link bandwidth) necessary to provide an adequate video presentation. Moreover, such special systems are generally expensive to own and operate and therefore are not available to an average consumer.
Many commercial products, including hardware and/or software components, have become available to the average consumer for transmitting video information over public networks, such as the Internet. These systems may be, for example, coupled with a personal computer for use over the Internet or other communication network. For example, a video conferencing or video distribution system may be configured to transmit video information over the Internet among a group of PCs. However, due to the substantial resource requirements necessary for transmitting such information, and the limited and/or unreliable resources available on public networks, performance of such systems generally falls short of expectations, and such systems are rendered less usable than more expensive specialized systems.
The quality of a combined audio-video communication as perceived by a user is highly correlated to both the overall latency of the communication and to the difference between the latency of the audio communication and the latency of the video communication between the sending and receiving systems.
For example, latency is commonly experienced during a typical cell phone call, in that during the conversation, there are delay periods due to latency that are periodically perceived by the user. In such a scenario, the user feels like the call is not in real time, and this latency affects the actual and perceived quality of the cell phone connection. With respect to video transmission, users experience latency as time lag or “jumpiness” between consecutive video frames. Such performance is present, for example, in conventional software video conferencing solutions.
Video quality may particularly suffer when transmitting data over communication networks such as the Internet. Due to limited bandwidth availability and latency, conventional video conferencing solutions generally provide frame rates between 2 and 10 frames per second (fps). For instance, the system CUseeMe (available from First Virtual Communications, Redwood City, Calif.), which allows video transmission between hosts using a video conference server, provides a 2-3 fps video signal with a latency of 450 milliseconds between capture at one host and presentation at another host. Another Internet teleconferencing system, NetMeeting (available from the Microsoft Corporation, Redmond, Wash.), delivers video information at approximately 6-10 fps with a latency of approximately 230 milliseconds. Further, most of these conventional systems are unable to deliver an acceptable image size at a good quality. Conventional systems that operate over public networks such as the Internet are capable of delivering, at most, 320×240 pixel video data at 6-10 fps. By contrast, video delivered at approximately 24-30 fps (theatrical motion pictures are shown at 24 fps, while television displays at 30 fps) provides a perception to the viewer of full-motion video. Therefore, it would be beneficial to have a system that delivers a quality video signal without latency or jumpiness between successive frames.
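The frame rates quoted above can be restated as the time that elapses between successive frames, which is one way to see why low frame rates are perceived as jumpy. The following is a minimal arithmetic sketch only; the function name `frame_interval_ms` is hypothetical and is not part of any system described here.

```python
def frame_interval_ms(frames_per_second: float) -> float:
    """Return the time between successive frames, in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


# Frame rates mentioned in the text: conventional systems (2-10 fps)
# versus full-motion video (24-30 fps).
for fps in (2, 10, 24, 30):
    print(f"{fps:>2} fps -> {frame_interval_ms(fps):.1f} ms between frames")
```

At 2 fps a new frame arrives only every 500 ms, whereas at 30 fps the gap is roughly 33 ms.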
Contributions to latency may include, but are not limited to, intermediate systems, network latency and latency due to processing. Some conventional systems use intermediate systems to handle video data transmitted between hosts. More specifically, video data from one host is transmitted to another host through an intermediate system. Intermediate systems may include various security and network components as well, including but not limited to firewalls, routers and others. The extra handling performed by these systems, which includes receiving, processing, buffering, and other steps at the intermediate system, adds latency to the transmission.
In some conventional systems, there is significant network latency due to one or more components of the network connection. This latency is due, for example, to additional network transmissions through intermediate systems as discussed above, and latency in creating and establishing network connections. Further, there are other problems, in addition to latency, with establishing connections through firewalls and other secure networking systems, as discussed below.
There are other contributing factors to latency at either or both of the receiving and transmitting hosts. For instance, latency is added at either the sending or receiving host due to over-handling of the video data. Excess buffering, thread-to-thread copying of video data, and other factors contribute to this type of latency.
Modern computer networks enable communication between computers in part by assigning each connected computer an address (for example, an Internet Protocol (IP) address), and one or more ports through which communication may proceed between the assigned IP addresses. Once a logical connection between computers is established, various techniques assure the authenticity of the ongoing information transfers, for example assigning sequence numbers to packets forming part of an ongoing connection. Some network interconnectivity features and security systems such as firewalls, network address translation (NAT) features, and others conventionally used, interfere with efficient, latency-free, real-time transmission of high bandwidth information, such as video information. What is needed, therefore, is an improved method for communicating video information.
When communication between a first host computer and a second host computer is desired, and the computers are connected to each other through a communications network without an intervening firewall or an intervening device performing NAT, either host may initiate the connection by simply sending a suitable message addressed to a suitable port for the message type at the address of the other host computer. Communication using a client/server architecture, or a peer-to-peer architecture, or any other suitable architecture can be initiated in this way, absent an intervening firewall or NAT device.
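The unobstructed case described above can be illustrated with a minimal sketch using Python's standard `socket` module over the loopback interface. The message contents, the `serve_once` helper, and the use of TCP are assumptions chosen for illustration; the text does not prescribe a particular protocol or message format.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read one message, and send an acknowledgement."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"ack:" + data)


# "Second host": listens on a port at its own address.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
listener.listen(1)
host, port = listener.getsockname()
server = threading.Thread(target=serve_once, args=(listener,), daemon=True)
server.start()

# "First host": initiates the connection simply by addressing that port.
with socket.create_connection((host, port), timeout=5) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

server.join(timeout=5)
listener.close()
print(reply)
```

Because nothing sits between the two endpoints, either side could have taken the listening role; the paragraphs that follow describe why this stops being true once a firewall or NAT device intervenes.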
Communication without an intervening firewall or NAT device sometimes occurs when host computers are connected to a common local area network (LAN), such as a corporate network or home network. More commonly, when the host computers are interconnected through a wide area network (WAN), such as the Internet, and sometimes in LAN configurations, one or both host computers may be connected to the network through a firewall or NAT system. A firewall or NAT system has the effect of partially or completely masking the host computers behind it from computer systems reachable through the WAN. This masking is performed by rejecting unexpected messages, i.e., messages sent to closed or incorrect addresses or ports, or messages purporting to be part of an ongoing exchange but having incorrect sequence numbers.
To make a conventional feature such as World Wide Web browsing work, several operations occur. Connections from a local host computer to a remote server are initiated by the local computer. If the local computer is behind a firewall, the local computer initiates communication with a desired server through the firewall. By initiating the communication through the firewall, the local computer instructs the firewall to allow certain types of communications back from the server for a certain period of time. When the server replies using a correct address, port and sequence number, the firewall recognizes the response as expected and passes the response on to the local computer. While the server may also be located behind a firewall, that firewall conventionally has a known port open to inbound traffic, so that certain types of contact with the server are permitted by the firewall.
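The behavior described above, in which an outbound message causes the firewall to expect a matching reply for a limited time, can be sketched as a toy model. This is an illustrative simplification, not the design of any real firewall: the `Packet` and `ToyFirewall` names, the 30-second timeout, and the example addresses are all assumptions, and sequence-number checking is omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str        # sender address
    src_port: int   # sender port
    dst: str        # destination address
    dst_port: int   # destination port


@dataclass
class ToyFirewall:
    """Hypothetical model: remember outbound flows, admit only matching replies."""
    timeout_s: float = 30.0  # assumed lifetime of an "expected reply" entry
    _expected: dict = field(default_factory=dict)

    def outbound(self, pkt: Packet, now: float) -> None:
        # Record the reverse direction of the flow so a reply can be recognized.
        key = (pkt.dst, pkt.dst_port, pkt.src, pkt.src_port)
        self._expected[key] = now + self.timeout_s

    def inbound_allowed(self, pkt: Packet, now: float) -> bool:
        # Admit the packet only if it matches an unexpired outbound flow.
        key = (pkt.src, pkt.src_port, pkt.dst, pkt.dst_port)
        expires = self._expected.get(key)
        return expires is not None and now <= expires


fw = ToyFirewall()
request = Packet("10.0.0.5", 50000, "203.0.113.7", 80)
fw.outbound(request, now=0.0)

reply = Packet("203.0.113.7", 80, "10.0.0.5", 50000)
unsolicited = Packet("198.51.100.9", 4000, "10.0.0.5", 50000)

print(fw.inbound_allowed(reply, now=1.0))        # matching reply within the window
print(fw.inbound_allowed(unsolicited, now=1.0))  # no matching outbound flow
print(fw.inbound_allowed(reply, now=60.0))       # window has expired
```

In this model, a peer that the local computer has never contacted cannot reach it, which is the obstacle to peer-to-peer connections discussed next.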
To allow a direct peer-to-peer connection, a port is conventionally opened in the firewall of each participant that is connected to the network through a firewall, so that contact with any participant may be initiated by any other participant. However, this leaves each participant with at least one port of their firewall open and vulnerable to security breaches. In applications such as video conferencing and/or teleconferencing through the Internet, where direct, peer-to-peer connections are desirable for the purpose of minimizing latency, it is highly desirable to minimize the security risk to the computers of participants, while also allowing any participant to initiate a connection with minimal effort.
In some conventional systems, an event loop is used to process video information. This is generally in the form of a single thread (e.g., a thread of execution executed by a processor) that executes in an infinite loop. The thread waits until an event happens, and when an event occurs, the thread acts upon the event. Generally, only a single event can be processed at a time. Other threads can add events to the thread's workload, but the other threads cannot actually handle these events. This event loading can cause the thread to become overloaded, so that a particular event (e.g., a video transmission, encoding or decoding event) that needs to be processed is delayed. A simple example describing this issue is a worker scenario where a particular worker has multiple bosses, each of which generates work for the worker to perform. While the worker performs a particular task for one of the bosses, tasks requested by the other bosses are delayed and must wait for the task currently being performed to be completed.
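The queuing delay described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a real event loop: all events are assumed to be queued at time zero, processing costs are made-up figures in milliseconds, and the event names and the `run_event_loop` function are hypothetical.

```python
from collections import deque


def run_event_loop(events):
    """Handle (name, cost_ms) events one at a time.

    Returns a dict mapping each event name to the time, in milliseconds,
    it spent waiting behind earlier events before being handled.
    """
    queue = deque(events)  # all events assumed to be queued at time 0
    clock_ms = 0
    waits = {}
    while queue:
        name, cost_ms = queue.popleft()
        waits[name] = clock_ms  # delay caused by events ahead in the queue
        clock_ms += cost_ms     # the single thread handles only one event at a time
    return waits


# Three "bosses" hand work to the same single-threaded "worker".
waits = run_event_loop([
    ("ui_redraw", 40),
    ("audio_encode", 15),
    ("video_decode", 30),
])
print(waits)
```

In this run the video decoding event cannot start until both earlier events have finished, so its delay is the sum of their processing costs, regardless of how urgent the video event is.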