Video conferencing technology enables two or more people at geographically remote locations to have audiovisual communication with each other. Video conferencing is currently possible using a conventional personal computer (PC) equipped with video conferencing software, a video camera, and a connection to a high-speed data link. One video conferencing system which permits multi-point video conferencing using conventional PCs is the ProShare™ Personal Conferencing Video System, which is available from Intel Corporation of Santa Clara, Calif.
A problem associated with almost any communication system is noise. The problem of noise is especially significant in multi-point conferences (conferences between three or more participants), because the overall amount of noise introduced into the system increases as the number of participants increases. In particular, noise in the audio channel can degrade the quality of the transmitted audio signal as well as cause annoyance and ear fatigue to the user. In certain video conferencing systems, audio from several local endpoints (e.g., participating PCs) can be combined into a single audio stream that is transmitted to other, remote endpoints. If the users of any of the local endpoints are not speaking, then those endpoints contribute only unnecessary noise to the audio stream.
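The combining of audio from several local endpoints can be illustrated with a minimal sketch. It assumes that each endpoint supplies an equal-length list of floating-point samples and that combining means simple sample-by-sample summation; the function name `mix_streams` is hypothetical, and a practical mixer would also need to handle clipping and timing, which this sketch omits.

```python
def mix_streams(streams):
    """Combine equal-length sample lists into one stream by summation.

    Assumption for illustration only: mixing is a plain sample-by-sample
    sum with no clipping protection or gain control.
    """
    if not streams:
        return []
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same number of samples")
    # zip(*streams) groups the samples that share a time index.
    return [sum(samples) for samples in zip(*streams)]


# One active speaker plus two idle endpoints that carry only background noise.
speaker = [0.5, -0.5, 0.5, -0.5]
idle_a = [0.01, 0.01, -0.01, 0.01]
idle_b = [0.02, -0.02, 0.02, 0.02]

mixed = mix_streams([speaker, idle_a, idle_b])
# Every idle endpoint shifts the mixed samples away from the speaker's signal.
deviation = [abs(m - s) for m, s in zip(mixed, speaker)]
```

In this sketch the idle endpoints carry no speech, yet their samples are still added into every output sample, which is the unnecessary noise described above.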
Certain disadvantages are associated with some existing solutions to the noise problem. For example, one approach is to first set a threshold volume level, and to then suppress all audio which falls below the threshold level and transmit all audio that exceeds the threshold level. This approach has been referred to as audio gating. The problem with this approach is that audio gating is generally perceivable to the listener as unnaturally abrupt transitions between sound and silence as the speaker speaks. Often, speech passages are partially cut off, such as when a participant is speaking very quietly, or in the case of "unvoiced" speech (i.e., sounds that involve no vocal cord movement). In addition, the ambient noise level at any given endpoint may vary significantly during a communication session. However, certain audio gating solutions do not adapt to such changes in the noise level. Other solutions, such as certain noise cancellation techniques, are computationally complex and therefore tend to slow down processing in a local endpoint. As a result, such solutions are not well suited to the mixing of multiple audio streams. Noise cancellation techniques also tend to cause distortion of the speaker's voice.
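The fixed-threshold audio gating approach described above can be illustrated with a minimal sketch. It assumes that audio arrives as short frames of floating-point samples and that the "volume level" of a frame is its root-mean-square (RMS) value; the names `frame_rms` and `gate_frames`, the frame size, and the threshold value are all hypothetical choices made for illustration.

```python
import math


def frame_rms(frame):
    """Return the root-mean-square level of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_frames(frames, threshold):
    """Fixed-threshold gate: pass frames whose level exceeds the
    threshold unchanged and replace all other frames with silence.

    Assumption for illustration only: level is measured as per-frame RMS,
    and the threshold never changes during the session.
    """
    gated = []
    for frame in frames:
        if frame_rms(frame) > threshold:
            gated.append(list(frame))
        else:
            gated.append([0.0] * len(frame))
    return gated


frames = [
    [0.5, -0.5],    # normal speech
    [0.05, -0.05],  # very quiet speech
    [0.01, 0.0],    # background noise only
]
gated = gate_frames(frames, threshold=0.1)
```

With these example values, the quiet-speech frame is silenced along with the noise-only frame, and the output jumps directly from full level to silence. This corresponds to the cut-off passages and abrupt transitions noted above, and because the threshold is fixed, the gate cannot follow changes in the ambient noise level.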
Therefore, it would be desirable to have a noise suppression solution which improves the overall quality of transmitted audio and which reduces ear fatigue. In particular, it would be desirable to have a noise suppression solution for a video conferencing system which reduces perceivable gating effects and which dynamically adapts to the ambient noise level. It would further be desirable for such a solution to reduce the processing burden on a microprocessor and to reduce distortion of a speaker's voice.