A videoconference is a set of interactive telecommunication technologies that allow two or more locations to interact via simultaneous two-way video and audio transmissions. The core technology used in a videoconference system is real-time digital compression of audio and video streams. The other components of a videoconference system include: video input, e.g. a video camera or webcam; video output, e.g. a computer monitor, television or projector; audio input, e.g. microphones; audio output, usually loudspeakers associated with the display device or telephone; and data transfer, e.g. an analog or digital telephone network, a LAN or the Internet.
In general, the videoconferencing market is divided loosely into two groups: users who are willing to incur significant expense, and users who are not. Examples of users who are willing to incur significant expense include large and/or global corporations and public services, which are able to justify the expense on the basis of avoiding the cost and lost time associated with travel. The expense comes from the cost of owning or leasing a private network. Such a private network is managed and delivers a quality of service (QoS) that often forms part of a Service Level Agreement (SLA).
The balance of the market typically uses the Internet for data transmission. This group includes not only those users with no access to private networks, but also those users whose private networks do not provide QoS guarantees or do not connect to all endpoints to which the user may wish to connect. The Internet is an example of a best-effort network. Such a network differs from a managed network in that its transmission parameters are subject to relatively large and variable impairments, including jitter, delay and packet loss, as a result of network congestion. Furthermore, these impairments typically are subject to sudden and significant changes in value, when averaged over periods ranging from seconds to minutes or hours.
The transmission impairments that are associated with a best-effort network, such as the Internet, typically result in an uncomfortable experience for the user, because the video component is “choppy,” of poor quality, and/or not precisely synchronized with the audio component of the communication. Rather than enhancing communication, the video component may actually provide false visual cues and even disorient or nauseate those who are party to the communication. For this reason, businesses and individuals have been slow to adopt IP-based videoconferencing despite its many advantages. Of course, wider adoption is likely to occur when the video component is improved sufficiently to provide more natural motion and a more life-like representation of the communicating parties. Accordingly, each incremental improvement in the encoding and/or transmission of video data is an important step toward achieving widespread adoption of videoconferencing technologies.
Unfortunately, current endpoint technology and transmission protocols typically produce a poor interactive experience. When an existing protocol that deals with congestion is used, such as the Transmission Control Protocol (TCP), the video transmission experiences potentially very large delays as a result of retransmission of lost packets, and a significant reduction in transmission rate as a result of TCP's Additive Increase Multiplicative Decrease (AIMD) policy towards congestion. As a result, TCP is considered to be an inadequate protocol for transmission of live real-time video streams.
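The effect of an AIMD policy on the transmission rate can be illustrated with a minimal sketch. The function names, the increase step, the decrease factor and the rate units below are hypothetical values chosen for illustration only; they are not taken from any particular TCP implementation.

```python
def aimd_step(rate, congested, increase=1.0, decrease=0.5, floor=1.0):
    """Return the next sending rate under a simple AIMD rule.

    All parameters are illustrative assumptions:
    - increase: amount added to the rate when no congestion is seen
    - decrease: factor applied to the rate when congestion is seen
    - floor:    lowest rate the sender is allowed to fall to
    """
    if congested:
        # Multiplicative decrease: the rate is cut sharply on congestion.
        return max(floor, rate * decrease)
    # Additive increase: the rate recovers only slowly afterwards.
    return rate + increase


def simulate_aimd(events, start=10.0):
    """Apply aimd_step over a sequence of congestion flags (True = congested)."""
    rates = []
    rate = start
    for congested in events:
        rate = aimd_step(rate, congested)
        rates.append(rate)
    return rates
```

With these assumed parameters, a single congestion event halves the rate, and several congestion-free intervals are needed to recover it, which is the sudden rate reduction described above.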
Alternatively, when using an existing protocol with no congestion control, such as the User Datagram Protocol (UDP), the user experiences severe packet loss in the event of congestion. This significantly reduces the quality of the videoconference experience, since loss of compressed video packets results in significant visual artifacts in the decoded image. Continued congestion also significantly increases the delay of video packets on the congested network, as a result of queuing. As a result, UDP streams are considered to perform inadequately in the presence of network congestion.
Finally, when using the Datagram Congestion Control Protocol (DCCP), which provides congestion control for real-time applications such as audio and video, the video transmission is subject to potentially large buffering delays on the transmitter side in order to adhere to the rate control mechanism of DCCP. Unfortunately, delay is a key parameter in live videoconferencing applications since a long delay in receiving a response from a remote participant diminishes the illusion of a face-to-face conversation. Another problem with DCCP is that packets marked as DCCP are not necessarily routed by core Internet routers, since DCCP has not been widely adopted. Furthermore, DCCP does not address how video encoding parameters are changed in order to adhere to a given transmission rate.
It is also known to provide feedback signals from the recipient to the sender during streaming of audio-video content via a best-effort network. These signals contain information relating to bandwidth throughput during a particular transmission interval. More particularly, the video that is being streamed is encoded into multiple quality segments or streamlets. Thus, when the bandwidth throughput does not match the bit rate of the streamlets being sent over the network, the sender stops sending some of the streamlets. Several steps of quality, such as low, medium, medium-high and high, are predefined prior to streaming the audio-video content, and moving between different steps results in noticeable differences in the quality of the video content. This approach is suitable for video-on-demand type applications, which tolerate buffering delays and require reliable packet delivery, but is not considered to be suitable for real-time videoconferencing applications.
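The stepwise nature of this known approach can be illustrated with a minimal sketch. The step names follow the text above, but the bit rates, the QUALITY_STEPS table and the select_step function are hypothetical and are given only to show why a small change in reported throughput can cause a large jump in quality.

```python
# Hypothetical predefined quality steps as (name, bit rate in kbit/s).
# The bit-rate values are assumptions for illustration only.
QUALITY_STEPS = [
    ("low", 300),
    ("medium", 700),
    ("medium-high", 1200),
    ("high", 2000),
]


def select_step(throughput_kbps, steps=QUALITY_STEPS):
    """Pick the highest predefined step whose bit rate fits the reported throughput.

    Falls back to the lowest step when even that one exceeds the throughput.
    """
    chosen = steps[0]
    for name, bitrate in steps:
        if bitrate <= throughput_kbps:
            chosen = (name, bitrate)
    return chosen
```

Under these assumed values, a reported throughput just below 1200 kbit/s selects the 700 kbit/s step, so the sender cannot use the intermediate bandwidth and the viewer sees a coarse change in quality.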
It would be advantageous to provide a method and system that overcome at least some of the above-mentioned limitations of the prior art.