In the capture signal processing chain of a real-time communications system, it is advantageous to have the echo control (EC) component process a signal as early in the processing path as possible. Doing so minimizes any distortion between the far-end stream bound for rendering and the corresponding near-end stream obtained from capture.
An example of a conventional approach to arranging audio components in a signal path is shown in FIG. 1. As illustrated, near-end components may include a capture device (e.g., a microphone), an EC component, and a noise suppression (NS) component, along with a render device (e.g., a loudspeaker) located at the far end. In such a conventional approach, the EC component processes a signal input from the capture device before the signal is processed by the NS component. However, as a side effect, the suppression stage in EC suppresses noise as well as echo. To maintain as consistent a noise level as possible, the suppressed noise is replaced with estimated “comfort noise” during EC processing. Often the comfort noise does not exactly match the background noise level, and in some cases it deviates considerably. This deviation creates a change in noise level known as noise “pumping”.
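The effect of a mismatched comfort noise estimate can be illustrated with a minimal sketch. The following is not the EC algorithm of any particular system; it assumes, purely for illustration, that noise levels are scalar amplitudes and that the suppression stage mixes residual noise and comfort noise linearly. The function name and its parameters are hypothetical.

```python
def ec_output_noise_level(true_noise, comfort_estimate, suppression_gain):
    """Noise level at the EC output for one frame (simplified linear model).

    suppression_gain is in [0, 1]: 1.0 means no suppression is applied,
    0.0 means the frame is fully suppressed and replaced by comfort noise.
    """
    residual_noise = suppression_gain * true_noise        # noise that survives suppression
    comfort_fill = (1.0 - suppression_gain) * comfort_estimate  # comfort noise added back
    return residual_noise + comfort_fill


true_noise = 1.0          # assumed background noise level
comfort_estimate = 0.6    # assumed comfort noise level, erring on the low side

no_echo = ec_output_noise_level(true_noise, comfort_estimate, suppression_gain=1.0)
during_echo = ec_output_noise_level(true_noise, comfort_estimate, suppression_gain=0.0)

# The step between the two levels is what is heard as noise pumping.
pumping_step = no_echo - during_echo
```

Under these assumptions, the output noise level equals the true level when no suppression is applied and drops to the comfort noise level when a frame is fully suppressed. The larger the mismatch between the two, the larger the audible step.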
Normally, comfort noise algorithms are designed to err on the side of the comfort noise being lower than the true noise level. If a signal processed through EC is subsequently processed through NS, the noise pumping effect may be amplified. The NS first analyzes the signal to obtain an estimate of the noise level. When an echo segment arrives, the NS adapts to the (typically) lower comfort noise level and lowers its suppression level as a result. As the echo segment ends, the arriving noise returns to its true level. Although the NS begins to adapt accordingly, it generally takes some time for the NS to converge on a good estimate. During this period of NS adjustment, the actual noise is insufficiently suppressed and, as a result, the perceptual effect of the noise pumping increases.
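The adaptation lag described above can be sketched with a toy simulation. This is an illustrative model only, not an actual NS implementation: it assumes the NS tracks a scalar noise level with a first-order recursive smoother and removes its current estimate from each frame. The function name, the smoothing constant, and the noise levels are all hypothetical values chosen for the example.

```python
def simulate_ns(levels, alpha=0.1):
    """Run a toy noise suppressor over per-frame noise levels.

    The estimate follows the input with a first-order smoother
    (alpha is an assumed adaptation rate). The residual is the noise
    left over after the current estimate is subtracted from the frame.
    """
    estimate = levels[0]  # assume the NS starts out converged
    estimates, residuals = [], []
    for level in levels:
        estimates.append(estimate)
        residuals.append(max(level - estimate, 0.0))
        estimate += alpha * (level - estimate)  # adapt after processing the frame
    return estimates, residuals


TRUE_NOISE = 1.0     # assumed true background noise level
COMFORT_NOISE = 0.6  # assumed comfort noise level during the echo segment

# 50 frames of true noise, 50 frames of echo (comfort noise), 50 frames of true noise.
levels = [TRUE_NOISE] * 50 + [COMFORT_NOISE] * 50 + [TRUE_NOISE] * 50
estimates, residuals = simulate_ns(levels)

residual_before_echo = max(residuals[:50])    # NS is converged on the true level
residual_just_after_echo = residuals[100]     # first frame after the echo segment
residual_after_reconverging = residuals[149]  # 49 frames later
```

In this sketch the residual noise is zero before the echo segment, jumps when the echo segment ends because the estimate has drifted down toward the comfort noise level, and then decays only as quickly as the assumed adaptation rate allows. That transient is the amplified pumping effect.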
An alternative approach might simply place the NS component in front of the EC component. However, such an approach introduces possible distortion between near-end and far-end components.