1. Field of the Invention
The present invention relates to voice signal processing, and more particularly to a method and system for eliminating noise, including echo, mixed in a voice signal via adaptive filters.
2. Description of Related Art
Noise elimination and echo elimination are both important problems in signal processing. Research on these problems began when signal processing itself came into being, and they remain active research topics. During voice transmission and processing, noise and echo arise mainly as follows. Consider first a noise source adjacent to a voice source. Suppose a pair of microphones A and B is used to record the voice source, where microphone A is near the voice source and far from the noise source, while microphone B is near the noise source and far from the voice source. In this arrangement, a double-track voice signal can be recorded. However, both microphones A and B pick up noise from the noise source at the same time, which adversely affects the tone of the double-track voice signal. It should be noted that a single voice source and a single adjacent noise source are taken here only as an example. In reality, multiple voice sources and noise sources coexist, and the influence of the noise sources on the voice becomes more serious.
Echo is a problem often encountered in real-time teleconferencing. In a real-time teleconference, a local voice at location A is received by a remote receiver after a certain delay and then played by a remote speaker at location B. When a remote voice at location B replies to location A, the remote voice is recorded together with the played-back voice from location A, and both are transmitted to location A. Thus, when the signal from location B is reproduced at location A, the original local voice from location A is heard in addition to the remote voice from location B. This phenomenon is called echo, and it also adversely affects the tone of the voice signal.
It is important to reduce or eliminate noise and echo in voice processing. Usually the noise and echo in voice signals can be reduced or eliminated by adaptive filters. FIG. 1 shows an example of filtering out noise. It is assumed that the voice mixed with noise, recorded by microphone A, is transmitted via Channel A, and that a reference noise, recorded by microphone B, is transmitted via Channel B. In accordance with FIG. 1, an adaptive filter 11 estimates the noise mixed in the voice according to the reference noise in Channel B. A subtractor 12 then subtracts the noise estimated by the adaptive filter 11 from the noisy voice in Channel A to obtain a clean voice. Finally, the clean voice is provided to the adaptive filter 11 as a feedback signal. In this way, the signal-to-noise ratio of the voice mixed with the noise is improved.
FIG. 2 shows another example, in which echo is filtered out. It is assumed that a speaker A's voice is mixed with an echo of a speaker B's voice. In accordance with FIG. 2, an adaptive filter 21 estimates a possible echo in the speaker A's voice according to the speaker B's voice. A subtractor 22 subtracts the echo estimated by the adaptive filter 21 from the speaker A's voice mixed with the echo to obtain a clean voice signal. Finally, the clean voice is provided to the adaptive filter 21 as the feedback signal, which reduces or eliminates the influence of the echo on the real-time teleconference.
The conventional adaptive filter is typically implemented using the Least Mean Square (LMS) algorithm or the Recursive Least Square (RLS) algorithm. When the adaptive filter is of order N, the computation quantity of the LMS algorithm is O(N) (a notation used in the art as a theoretical measure of the cost of executing an algorithm), while the computation quantity of the RLS algorithm is O(N*N). The computation quantity and memory requirements of the LMS algorithm are very small, which has made it a very popular algorithm in DSP. The most frequently used variant of the LMS algorithm is the Normalized Least Mean Square (NLMS) algorithm, which includes time domain NLMS, sub-band NLMS, and frequency domain NLMS.
As shown in FIG. 3, the LMS algorithm has two inputs, Signal X and Signal Y. The Signal X is a reference noise and/or a remote speaker's voice signal, and Signal Y is the voice signal with the noise and/or the echo. The LMS algorithm is summarized as follows:
$$E[n] = Y[n] - \sum_{i=0}^{N-1} w_i \, X[n-i] \tag{1}$$
where $E[n]$ indicates the actual output signal at time $n$, $Y[n]$ indicates the voice with the noise or echo at time $n$, $X[n-i]$ indicates the noise or echo at time $n-i$, $w_i$ indicates the $i$-th order coefficient of the adaptive filter, and $N$ indicates the order number of the adaptive filter.
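To make equation (1) concrete, the following is a minimal sketch in Python of the output computation for the arrangement of FIG. 1. The text above does not specify how the coefficients are adapted, so the normalized (NLMS) update used here is an assumption, as are the function names `nlms_cancel` and `demo`, the step size `mu`, the regularizer `eps`, the synthetic sinusoidal voice, and the three-tap noise path. It is an illustration under those assumptions, not the implementation of any particular system.

```python
import math
import random


def nlms_cancel(y, x, order=8, mu=0.1, eps=1e-8):
    """Return E[n] = Y[n] - sum_i w_i * X[n-i], adapting w_i after each sample."""
    w = [0.0] * order          # filter coefficients w_0 .. w_{N-1}
    taps = [0.0] * order       # X[n], X[n-1], ..., X[n-N+1]
    e_out = []
    for y_n, x_n in zip(y, x):
        taps = [x_n] + taps[:-1]
        estimate = sum(w_i * x_i for w_i, x_i in zip(w, taps))
        e_n = y_n - estimate   # equation (1)
        # Assumed NLMS update (not given in the text): step scaled by tap energy.
        scale = mu * e_n / (eps + sum(x_i * x_i for x_i in taps))
        w = [w_i + scale * x_i for w_i, x_i in zip(w, taps)]
        e_out.append(e_n)
    return e_out


def demo(num_samples=4000, seed=0):
    """Compare residual noise power before and after cancellation."""
    rng = random.Random(seed)
    voice = [0.5 * math.sin(2.0 * math.pi * 0.01 * n) for n in range(num_samples)]
    x = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]   # Signal X
    path = [0.6, -0.3, 0.1]    # hypothetical path from noise source to microphone A
    noise = [
        sum(h * x[n - k] for k, h in enumerate(path) if n - k >= 0)
        for n in range(num_samples)
    ]
    y = [v + q for v, q in zip(voice, noise)]                  # Signal Y
    e = nlms_cancel(y, x)
    half = num_samples // 2    # skip the initial convergence period
    count = num_samples - half
    before = sum((a - v) ** 2 for a, v in zip(y[half:], voice[half:])) / count
    after = sum((a - v) ** 2 for a, v in zip(e[half:], voice[half:])) / count
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(f"residual noise power: before={before:.4f}, after={after:.4f}")
```

Each output sample costs one pass over the N coefficients for the estimate and one pass for the update, which is consistent with the O(N) computation quantity stated for the LMS algorithm.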
However, there are several problems in the conventional methods for eliminating noise and echo mixed in a voice signal via adaptive filters. A single adaptive filter with a single reference sound source cannot resolve noise elimination and echo elimination at the same time. In order to eliminate the noise and the echo simultaneously, two adaptive filters and two reference sound sources are needed. Additionally, the order number of the echo elimination filter is higher than the order number of the noise elimination filter.
If the adaptive filters are connected in series with the echo elimination filter placed before the noise elimination filter, echo may be present in the reference noise, and the ultimate output will then contain echoing similar to reverberation.
Thus there is a need for techniques for simultaneously and efficiently eliminating echo and noise mixed in voice signals.