A multipoint audio conference call accommodates signals with a large variety of bandwidths (BW) and/or signal distortions (SD).
A multipoint control or conference unit (MCU) may be utilized to manage the multipoint conference, which can include both audio and video components. The MCU facilitates conferencing between three or more participants. An MCU mixes the audio (and/or video, data) streams from each participant (each participant may be associated with a client terminal) and transmits a single audio (and/or video, data) stream back to each participant.
A typical MCU 100, such as that illustrated in FIG. 1, includes input/output ports 110 for receiving audio streams and for transmitting mixed audio streams. MCU 100 also includes a memory 120 for storing the received audio streams and the mixed audio streams that are to be transmitted. MCU 100 further includes a controller or processor 130 for performing the various functions of the MCU, such as mixing the audio streams. MCU 100 can be connected to the participants via a network such as the Internet.
A simplified description of an MCU has been provided for purely illustrative purposes, as an MCU is well known. Various components of an MCU are included without specific connection between these components being illustrated.
A multipoint conferencing arrangement 200 is illustrated in FIG. 2. Arrangement 200 includes at least three client terminals 210, 220 and 230, each communicating with MCU 260 via a network 250. Each client terminal (210, 220 and 230) sends unicast audio (and video) streams to MCU 260 via respective communication channels 215, 225 and 235. Each of the communication channels 215, 225 and 235 corresponds to a particular one of the client terminals 210, 220 and 230. That is, communication channel 215 provides communication between MCU 260 and client terminal 210; communication channel 225 provides communication between MCU 260 and client terminal 220; and communication channel 235 provides communication between MCU 260 and client terminal 230. MCU 260 mixes the audio (and video) streams from each client terminal and transmits a single audio (and video) stream back to each client terminal via respective communication channels 215, 225 and 235.
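The mixing step described above can be illustrated with a minimal sketch. It assumes that each channel delivers an equal-length frame of 16-bit PCM samples and that mixing is a plain sample-wise sum with clipping; the function names `mix_streams` and `distribute` are hypothetical, and the actual mixing performed by an MCU may differ.

```python
# Hypothetical sketch: sample-wise mixing of one audio frame per channel.
# Assumes 16-bit signed PCM samples and equal-length frames (assumptions,
# not details taken from the description).

PCM_MIN, PCM_MAX = -32768, 32767


def mix_streams(streams):
    """Sum the frames of all channels sample by sample and clip to 16 bits.

    `streams` maps a channel identifier (e.g. 215, 225, 235) to a list of
    integer samples.
    """
    if not streams:
        return []
    frames = list(streams.values())
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")
    mixed = []
    for i in range(length):
        total = sum(frame[i] for frame in frames)
        # Clip so the mixed sample stays within the 16-bit range.
        mixed.append(max(PCM_MIN, min(PCM_MAX, total)))
    return mixed


def distribute(streams):
    """Return the single mixed frame addressed to every channel."""
    mixed = mix_streams(streams)
    return {channel: list(mixed) for channel in streams}
```

For example, `distribute({215: [1000, -2000], 225: [500, 500], 235: [0, -100]})` would return the same mixed frame for channels 215, 225 and 235.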
Audio bandwidth is the range of audio frequencies present in an audio signal, that is, the set of its non-zero frequency components. It affects the sound quality (i.e., the degree to which sound is accurately reproduced). The transmitted audio signal can be of varying bandwidths due to the different coding systems or bit constraints that are used.
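As an illustration of this definition, the following sketch estimates the upper edge of the bandwidth of one frame as the highest frequency whose spectral magnitude is non-negligible. The function name `estimate_bandwidth_hz`, the relative threshold of 1% of the spectral peak, and the use of a plain discrete Fourier transform are assumptions made for illustration only.

```python
import cmath
import math


def estimate_bandwidth_hz(samples, sample_rate_hz, rel_threshold=0.01):
    """Estimate the highest significant frequency in a frame, in Hz.

    A naive O(n^2) DFT is used to keep the sketch dependency-free; it is
    only practical for short frames. A bin counts as "non-zero" when its
    magnitude is at least `rel_threshold` times the largest magnitude
    (an assumed criterion).
    """
    n = len(samples)
    if n == 0:
        return 0.0
    magnitudes = []
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(acc))
    peak = max(magnitudes)
    if peak == 0:
        return 0.0
    highest_bin = max(
        k for k, m in enumerate(magnitudes) if m >= rel_threshold * peak
    )
    return highest_bin * sample_rate_hz / n
```

With a 160-sample frame at a 16 kHz sampling rate, each bin spans 100 Hz, so a signal made of a 1 kHz tone and a 3 kHz tone would be reported with an upper band edge of 3 kHz.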
Typical bandwidths determined by the coding system are 3.5 kHz, 7 kHz or above 14 kHz, which correspond to narrowband, wideband and super wideband, respectively. Factors that affect the signal distortion level include a limited bit budget, noise suppressors, echo cancellers and automatic gain control.
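The bandwidth classes named above can be expressed as a simple lookup. In this sketch the function name `classify_bandwidth` and the treatment of the class boundaries are assumptions: the text gives only the typical values, so bandwidths between 7 kHz and 14 kHz are reported as unclassified rather than assigned to a class.

```python
def classify_bandwidth(bandwidth_hz):
    """Map an audio bandwidth in Hz to a class label.

    Boundaries are assumed: up to 3.5 kHz is narrowband, up to 7 kHz is
    wideband, and 14 kHz or more is super wideband. Values in between
    are not covered by the typical figures and are left unclassified.
    """
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    if bandwidth_hz <= 3500:
        return "narrowband"
    if bandwidth_hz <= 7000:
        return "wideband"
    if bandwidth_hz >= 14000:
        return "super wideband"
    return "unclassified"
```

Under these assumptions, a channel with a 3.4 kHz bandwidth is labelled narrowband and a channel with a 16 kHz bandwidth is labelled super wideband.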
Quality optimization is difficult to achieve when dealing with multiple audio channels. These challenges are even more pronounced when bandwidth extension (BWE) schemes are controlled based on the bandwidth and signal distortions of the multiple audio channels.
Bandwidth extension is a method for converting a speech signal (at a low frequency range, such as 8 kHz) to a high quality speech signal (at a higher frequency range, such as 12 kHz). It is often used to convert a telephone quality speech signal to a high quality wideband speech signal. The purpose of BWE is to reconstruct signal frequency components that are not encoded or transmitted and are therefore not available at the receiver.
BWE can be placed at the decoder side of an audio coding system, or in the network. BWE systems increase signal bandwidth but also cause some signal degradation. Therefore, it would be both beneficial and desirable to estimate the expected gain of bandwidth extension before applying it to a signal.
In a multipoint audio conference, the bandwidths (BW) of audio channels are typically optimized independently. That is, the bandwidth and signal distortion of other channels are not taken into consideration. This approach leads to sub-optimal solutions as the results from listening tests indicate that the perception of signal bandwidth depends on the level of signal distortion and bandwidth of surrounding channels.