In a communication system a communication network is provided, which can link together two communication terminals so that the terminals can send information to each other in a call or other communication event. Information may include speech, text, images or video.
Modern communication systems are based on the transmission of digital signals. Analogue information such as speech captured by a microphone is input into an analogue to digital converter at the transmitter of one terminal and converted into a digital signal. The digital signal is then encoded and placed in data packets for transmission over a channel to the receiver of a destination terminal.
Background noise in the vicinity of the terminal in which the speech is input is transmitted together with the speech information in the digital signal. This results in the speech information output at the destination terminal being obscured by the noise transmitted with the signal. Also, the presence of noise in the signal interferes with the speech signal encoding, leading to audibly increased coding distortions or an increased transmission rate.
Attempts have been made to filter the signal to reduce the level of noise input into the encoder at the transmitting terminal. To remove the noise from the signal input into the encoder, an estimate of the noise level is required.
Low complexity noise level estimators used for terminals such as mobile devices typically smooth a frequency domain input signal using recursive low-pass filters or time-averaging to estimate the noise level.
An example of a low-pass filter is a first-order auto-regressive filter, as shown in Equation (1):
y[n] = αy[n−1] + (1−α)x[n]  Equation (1)
wherein y[n] is the output for filtered element n, x[n] is the input for filtered element n, and α is the smoothing coefficient, with a value between 0 and 1. Increased smoothing is obtained by increasing the smoothing coefficient α.
A further example of a low-pass filter is a fast implementation of the same auto-regressive filter, as shown in Equation (2):
y[n] = x[n] + α(y[n−1] − x[n])  Equation (2)
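By way of illustration only, the two filter forms of Equations (1) and (2) may be sketched in Python as follows. Expanding Equation (2) gives Equation (1) term for term, so both forms produce the same output, while Equation (2) uses a single multiplication per element. The function names and the initial state y0 are assumptions made for this sketch and are not taken from any particular implementation.

```python
def smooth_eq1(x, alpha, y0=0.0):
    """Equation (1): y[n] = alpha*y[n-1] + (1 - alpha)*x[n] (two multiplications)."""
    y = []
    prev = y0  # assumed initial filter state
    for sample in x:
        prev = alpha * prev + (1.0 - alpha) * sample
        y.append(prev)
    return y


def smooth_eq2(x, alpha, y0=0.0):
    """Equation (2): y[n] = x[n] + alpha*(y[n-1] - x[n]) (one multiplication)."""
    y = []
    prev = y0  # assumed initial filter state
    for sample in x:
        prev = sample + alpha * (prev - sample)
        y.append(prev)
    return y


if __name__ == "__main__":
    samples = [1.0, 4.0, 2.0, 8.0]
    print(smooth_eq1(samples, 0.5))
    print(smooth_eq2(samples, 0.5))
```

A larger value of alpha gives more weight to the previous output and therefore a smoother, slower-moving result.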
Low complexity noise level estimation techniques have a low memory requirement and are well suited for devices with low computational power and a limited memory space.
However, one problem with using a low-pass filter to produce a noise level estimate is that when the incoming signal consists of both background noise and speech, the increase in the signal energy caused by periods of speech leads to a bias towards higher noise value estimates.
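The bias described above may be illustrated with a minimal sketch. The frame energies, the smoothing coefficient of 0.9, and the function name are hypothetical values chosen for illustration: a constant noise energy of 1.0 is assumed, with speech raising the frame energy to 10.0 over the last ten frames.

```python
def estimate_noise_level(frame_energies, alpha=0.9):
    """Smooth frame energies with the filter of Equation (2).

    The first frame energy is used as the initial estimate (an assumption
    made for this sketch).
    """
    estimate = frame_energies[0]
    history = []
    for energy in frame_energies:
        estimate = energy + alpha * (estimate - energy)
        history.append(estimate)
    return history


# Hypothetical frame energies: the true noise energy is 1.0 throughout.
noise_only = [1.0] * 20
noise_then_speech = [1.0] * 10 + [10.0] * 10  # speech present in last 10 frames

if __name__ == "__main__":
    print(estimate_noise_level(noise_only)[-1])
    print(estimate_noise_level(noise_then_speech)[-1])
```

In the noise-only case the estimate stays at the true noise energy, whereas in the second case the speech energy pulls the estimate well above it.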
In prior art methods, this problem is reduced by adjusting the noise level estimation when the presence of speech is detected: increased smoothing is applied during the detected period of speech activity to account for the increase in signal energy due to the presence of speech in the signal. However, speech presence detection is not always reliable, for several reasons. When the speech detector has only recently been initialized, not enough history information may be present to reliably distinguish speech from noise. Also, speech and noise levels may be confused. This occurs particularly when the first few frames of speech have a low energy and are mistaken for background noise. Speech and noise levels may also be confused when noise and/or speech levels are changing over time. When speech is falsely detected as noise, a bias towards higher noise level estimates results. On the other hand, when noise is falsely detected as speech, the noise level estimator will not efficiently use the available information, resulting in less accurate estimates.
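A minimal sketch of such speech-dependent smoothing, and of its sensitivity to detection errors, is given below. The per-frame speech flags stand in for the output of a speech detector, which is not modelled here; the function name, the two smoothing coefficients, and the frame energies are all hypothetical values chosen for illustration.

```python
def estimate_noise_level_with_flags(frame_energies, speech_flags,
                                    alpha_noise=0.9, alpha_speech=0.99):
    """Filter of Equation (2) with increased smoothing on frames flagged as speech.

    speech_flags is assumed to come from a separate speech detector.
    """
    estimate = frame_energies[0]  # assumed initial estimate
    history = []
    for energy, is_speech in zip(frame_energies, speech_flags):
        alpha = alpha_speech if is_speech else alpha_noise
        estimate = energy + alpha * (estimate - energy)
        history.append(estimate)
    return history


# Hypothetical frame energies: noise energy 1.0, speech in the last 10 frames.
energies = [1.0] * 10 + [10.0] * 10
correct_flags = [False] * 10 + [True] * 10  # detector identifies the speech
missed_flags = [False] * 20                 # speech falsely detected as noise

if __name__ == "__main__":
    print(estimate_noise_level_with_flags(energies, correct_flags)[-1])
    print(estimate_noise_level_with_flags(energies, missed_flags)[-1])
```

With correct flags the estimate drifts only slowly away from the noise energy during speech. When the speech frames are mistaken for noise, the estimate rises much further, which is the bias towards higher noise level estimates noted above.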
It is therefore an aim of the present invention to overcome the problems presented by the prior art. It is a further aim of the present invention to provide a method of improving the quality of the output signal without the use of complex computational methods that have large memory requirements.