The invention relates to noise reduction and voice activity detection in speech communication systems.
The presence of background noise in a speech communication system affects its perceived grade of service in a number of ways. For example, significant levels of noise can reduce intelligibility, cause listener fatigue, and degrade performance of the speech compression algorithm used in the system.
Reduction of background noise levels can mitigate such problems and enhance overall performance of the speech communication system. In the highly competitive area of communications, improved voice quality is becoming an increasingly important concern to customers when making purchasing decisions. Because noise reduction can be an important element of improved voice quality, it can have a critical impact on these decisions.
Voice encoding and decoding devices (hereinafter referred to as “codecs”) are used to encode speech for more efficient use of bandwidth during transmission. For example, a code excited linear prediction (CELP) codec is a stochastic encoder which analyzes a speech signal and models excitation frames therein using vectors selected from a codebook. The vectors or other parameters can be transmitted. These parameters can then be decoded to produce synthesized speech. CELP is particularly useful for digital communication systems wherein speech quality, data rate and cost are significant issues.
A need exists for a noise reduction algorithm which can enhance the performance of a codec. Noise reduction algorithms often use a noise estimate. Since estimation of noise is performed during input signal segments containing no speech, reliable noise estimation is important for noise reduction. Accordingly, a need also exists for a reliable and robust voice activity detector.
In accordance with an aspect of the present invention, a noise reduction algorithm is provided to overcome several disadvantages of existing speech communication systems, such as reduced intelligibility, listener fatigue and degraded compression algorithm performance.
In accordance with another aspect of the present invention, a noise reduction algorithm employs spectral amplitude enhancement. Processes such as spectral subtraction, multiplication of noisy speech via an adaptive gain, spectral noise subtraction, spectral power subtraction, or an approximated Wiener filter, however, can also be used.
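By way of illustration only, the adaptive-gain form of spectral power subtraction named above can be sketched as follows. This is a generic textbook formulation, not the gain function of the invention; the function name and the `floor` parameter are assumptions introduced for the example.

```python
import math

def spectral_subtraction_gain(noisy_power, noise_power, floor=0.1):
    """Per-bin gain for power spectral subtraction (illustrative sketch).

    noisy_power: power spectrum of the noisy speech frame
    noise_power: estimated noise power spectrum (same length)
    floor:       hypothetical lower limit on the gain, used here to
                 avoid zeroing bins completely
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if y <= 0.0:
            # No energy in this bin; fall back to the floor.
            gains.append(floor)
            continue
        g_squared = 1.0 - n / y          # subtract noise power, normalized
        g = math.sqrt(g_squared) if g_squared > 0.0 else 0.0
        gains.append(max(g, floor))      # never drop below the floor
    return gains
```

Multiplying each noisy spectral amplitude by the corresponding gain yields the enhanced amplitude; the phase of the noisy signal would typically be reused unchanged.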
In accordance with another aspect of the present invention, noise estimation in the noise reduction algorithm is facilitated by the use of information generated by a voice activity detector which indicates when a frame comprises noise. An improved voice activity detector is provided in accordance with an aspect of the present invention which is reliable and robust in determining the presence of speech or noise in the frames of an input signal.
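As a minimal sketch of how a voice activity decision might gate noise estimation, the following updates a running noise power estimate only for frames flagged as noise. The recursive averaging rule and the `alpha` parameter are assumptions chosen for illustration; they are not taken from this description.

```python
def update_noise_estimate(noise_est, frame_power, noise_flag, alpha=0.9):
    """Update a per-bin noise power estimate (illustrative sketch).

    noise_est:   current noise power estimate, one value per bin
    frame_power: power spectrum of the current frame
    noise_flag:  True when the voice activity detector reports that
                 the frame comprises noise only
    alpha:       hypothetical smoothing factor in [0, 1)
    """
    if not noise_flag:
        # Speech may be present: hold the previous estimate.
        return list(noise_est)
    # Noise-only frame: blend the new frame into the estimate.
    return [alpha * n + (1.0 - alpha) * p
            for n, p in zip(noise_est, frame_power)]
```

Holding the estimate during speech keeps speech energy from leaking into the noise estimate, which is why the reliability of the voice activity detector matters.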
In accordance with yet another aspect of the present invention, gain for the noise reduction algorithm is determined using a smoothed noise spectral estimate and smoothed input noisy speech spectra. Smoothing is performed over critical bands, that is, frequency bands corresponding to those of the human auditory system.
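One simple way to smooth a spectrum over bands is to replace each bin with the mean power of the band that contains it. The sketch below assumes this averaging rule and takes the band boundaries as an input; the actual critical band edges and smoothing method are not specified in this passage, so `band_edges` is a hypothetical parameter.

```python
def smooth_by_bands(power, band_edges):
    """Replace each bin with the mean power of its band (illustrative sketch).

    power:      power spectrum, one value per frequency bin
    band_edges: hypothetical increasing list of bin indices; band i
                covers bins band_edges[i] .. band_edges[i + 1] - 1
    """
    smoothed = [0.0] * len(power)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        if hi <= lo:
            continue                      # skip empty bands
        mean = sum(power[lo:hi]) / (hi - lo)
        for k in range(lo, hi):
            smoothed[k] = mean
    return smoothed
```

The same routine can be applied to both the noise spectral estimate and the noisy speech spectrum before a gain is computed from them.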
In accordance with still yet another aspect of the present invention, the noise reduction algorithm can be either integrated in or used with a codec. A codec is provided having voice activity detection and noise reduction functions integrated therein. Noise reduction can coexist with a codec in a pre-compression or post-compression configuration.
In accordance with another aspect of the present invention, background noise in the encoded signal is reduced via swirl reduction techniques such as identifying spectral outlier segments in an encoded signal and replacing line spectral frequencies therein with weighted average line spectral frequencies. An upper limit can also be placed on the adaptive codebook gain employed by the encoder for those segments identified as being spectral outlier segments. A constant C and a lower limit K are selected for use with the gain function to control the amount of noise reduction and spectral distortion introduced in cases of low signal to noise ratio.
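The two swirl reduction measures described above, replacing line spectral frequencies with a weighted average and capping the adaptive codebook gain, can be sketched as follows. The detection of spectral outlier segments is not shown; `is_outlier` is a hypothetical input, and the `weight` and `max_gain` values are illustrative assumptions rather than values from this description.

```python
def weighted_average_lsf(current_lsf, previous_lsf, weight=0.8):
    """Blend current and previous line spectral frequencies (sketch).

    weight: hypothetical weight given to the previous frame's values
    """
    return [weight * p + (1.0 - weight) * c
            for p, c in zip(previous_lsf, current_lsf)]

def limit_adaptive_gain(gain, is_outlier, max_gain=1.0):
    """Apply an upper limit to the adaptive codebook gain, but only
    for segments flagged as spectral outliers (sketch)."""
    return min(gain, max_gain) if is_outlier else gain

def process_segment(current_lsf, previous_lsf, gain, is_outlier):
    """Return (lsf, gain) for one segment; non-outlier segments pass
    through unchanged."""
    if not is_outlier:
        return list(current_lsf), gain
    return (weighted_average_lsf(current_lsf, previous_lsf),
            limit_adaptive_gain(gain, True))
```

A weighted average of two ordered frequency vectors remains ordered, so the replacement values stay usable as line spectral frequencies.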
In accordance with another aspect of the present invention, a voice activity detector is provided to facilitate estimation of noise in a system and thereby support a noise reduction algorithm that uses the estimated noise, for example to determine a gain function.
In accordance with yet another aspect of the present invention, the voice activity detector determines pitch lag and performs periodicity detection using enhanced speech which has been processed to reduce noise therein.
In accordance with still yet another aspect of the present invention, the voice activity detector subjects input speech to automatic gain control.
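Automatic gain control is not detailed in this passage. As a generic sketch only, a frame-based controller might scale each frame toward a target level while smoothing the gain across frames. The names `target_rms`, `smoothing` and `max_gain` are assumptions introduced for the example.

```python
import math

def agc_frame(samples, prev_gain, target_rms=0.1, smoothing=0.9, max_gain=10.0):
    """Scale one frame toward a target RMS level (illustrative sketch).

    Returns the scaled samples and the gain that was applied, so the
    gain can be passed in as prev_gain for the next frame.
    """
    if samples:
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    else:
        rms = 0.0
    if rms > 0.0:
        desired = min(target_rms / rms, max_gain)  # cap the boost
    else:
        desired = prev_gain                        # silent frame: hold gain
    # Smooth the gain so it does not jump between frames.
    gain = smoothing * prev_gain + (1.0 - smoothing) * desired
    return [gain * s for s in samples], gain
```

Normalizing the input level in this way would let later detection thresholds operate on signals of comparable amplitude.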
In accordance with an aspect of the present invention, a voice activity detector generates short-term and long-term voice activity flags for consideration in detecting voice activity.
In accordance with yet another aspect of the present invention, a noise flag is generated using an output from a voice activity detector and is provided as an input to the noise reduction algorithm.
In accordance with another aspect of the present invention, an integrated coder is provided with a noise reduction algorithm via either a post-compression or a pre-compression scheme.