FIG. 1 is a block diagram showing the configuration of an echo suppressing apparatus of a first example of related art.
FIG. 1 shows an exemplary configuration of an echo suppressing apparatus for suppressing an echo generated in a hands-free phone.
In FIG. 1, an audio signal from the far-end speaker (hereinafter referred to as the far-end signal) inputted to input terminal 10 is converted into far-end audio by loudspeaker 2. Meanwhile, microphone 1 picks up, for example, the voice of the near-end speaker (hereinafter referred to as near-end audio) and also picks up unnecessary far-end audio produced by loudspeaker 2. The sound that travels from loudspeaker 2 to microphone 1 is called an echo. The sound transfer system extending from the far-end signal to the output signal of microphone 1, which includes loudspeaker 2 and microphone 1, is called an echo path.
Ideally, only the near-end audio should be outputted as the near-end signal from output terminal 9, and the unnecessary far-end audio contained in the near-end signal should be removed. In particular, when the near-end signal contains a large far-end audio component, the delayed far-end audio is audible to the far-end speaker as an echo, which makes conversation difficult. To address this problem, related art uses a linear echo canceller to remove the echo from the near-end signal. A linear echo canceller is described, for example, in non-patent document 1 (Eberhard HANSLER, “The hands-free telephone problem: an annotated bibliography update,” annals of telecommunications 1994, pp. 360-367).
Linear echo canceller 3 estimates the transfer function of the echo path (echo path estimation), and uses the signal inputted to loudspeaker 2 (far-end signal) to produce a simulated signal (echo replica signal) of the echo inputted to microphone 1 based on the estimated transfer function.
The echo replica signal produced in linear echo canceller 3 is inputted to subtractor 4, which subtracts the echo replica signal from the output signal of microphone 1 to extract the near-end audio signal component.
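The replica generation and subtraction described above can be sketched in Python as follows. This is a minimal illustration only, assuming the estimated transfer function is represented by a short FIR impulse response that happens to match the true echo path exactly. The names `fir_filter` and `cancel_echo` and all numeric values are hypothetical and do not come from the related art.

```python
def fir_filter(coeffs, signal):
    """Causal FIR filtering: out[n] = sum over k of coeffs[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def cancel_echo(mic, far_end, estimated_path):
    """Build the echo replica from the far-end signal and subtract it (subtractor 4)."""
    replica = fir_filter(estimated_path, far_end)
    return [m - r for m, r in zip(mic, replica)]


# Hypothetical signals: the microphone output is near-end audio plus echo.
true_path = [0.5, 0.25]                 # assumed echo path impulse response
far_end = [1.0, 0.0, -1.0, 2.0]
near_end = [0.1, 0.2, 0.3, 0.4]
echo = fir_filter(true_path, far_end)
mic = [e + s for e, s in zip(echo, near_end)]

# With a perfect estimate, the residual equals the near-end audio
# up to floating-point rounding.
residual = cancel_echo(mic, far_end, true_path)
```

In practice the estimate is never exact, so some echo remains in the output of subtractor 4.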
Speech detector 5 receives the output signal of microphone 1, the output signal of linear echo canceller 3, the output signal of subtractor 4, and the far-end signal, uses these signals to detect whether or not the output signal of microphone 1 contains any near-end audio, and outputs the detection result to linear echo canceller 3.
To control the operation of linear echo canceller 3, speech detector 5 outputs “zero” or a very small value as the speech detection result when speech detector 5 has detected any near-end audio in the output signal of microphone 1, while outputting a large value when speech detector 5 has detected no near-end audio.
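The related art does not specify how speech detector 5 reaches its decision, so the following Python sketch substitutes a simple, hypothetical power comparison: near-end audio is assumed to be present when the microphone power clearly exceeds the power of the echo replica. It uses only two of the four signals that speech detector 5 receives. The function names, the threshold, and the output values 0.0 and 1.0 are assumptions chosen for illustration.

```python
def mean_power(frame):
    """Average power of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def speech_detection_result(mic_frame, replica_frame, threshold=2.0, large_value=1.0):
    """Return 0.0 when near-end audio is assumed present, large_value otherwise.

    Assumption: near-end audio is declared present when the microphone power
    exceeds `threshold` times the echo replica power.
    """
    if mean_power(mic_frame) > threshold * mean_power(replica_frame):
        return 0.0          # near-end audio detected: suppress coefficient updates
    return large_value      # echo only: allow coefficient updates


replica = [0.48, -0.5, 0.5, -0.49]
echo_only = speech_detection_result([0.5, -0.5, 0.5, -0.5], replica)
with_near_end = speech_detection_result([1.5, -0.2, 1.1, -1.4], replica)
```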
FIG. 2 is a block diagram showing an exemplary configuration of the linear echo canceller shown in FIG. 1.
As shown in FIG. 2, linear echo canceller 3 includes adaptive filter 30, which is a linear filter, and multiplier 35. Examples of adaptive filter 30 include filters of various types, such as an FIR type, an IIR type, and a lattice type.
Adaptive filter 30 filters the far-end signal inputted to terminal 31 and outputs the result from terminal 32 to subtractor 4. Adaptive filter 30 uses a predetermined correlation operation to update the filter coefficient in such a way that the output signal of subtractor 4, inputted to terminal 33, is minimized. In effect, adaptive filter 30 minimizes the component of the output signal of subtractor 4 that correlates with the far-end signal. That is, the echo (far-end audio) is removed from the output signal of subtractor 4.
If the filter coefficient is updated while the output signal of microphone 1 contains near-end audio, the resulting change in the filter coefficient may reduce the echo removal capability of adaptive filter 30.
Multiplier 35 is provided to control the filter coefficient update operation performed by adaptive filter 30. Multiplier 35 multiplies the output signal of subtractor 4 by the output signal of speech detector 5 and outputs the computation result to adaptive filter 30. When the output signal of microphone 1 contains near-end audio, the output signal of speech detector 5 is either “zero” or a very small value as described above, so that the filter coefficient update operation performed by adaptive filter 30 is suppressed and hence the change in the filter coefficient is small. As a result, the echo removal capability is not greatly degraded.
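The update control described above can be sketched as follows. The related art refers only to a "predetermined correlation operation"; this sketch substitutes the normalized LMS (NLMS) algorithm with an FIR filter as one common choice, which is an assumption. The function name, the step size, the tap count, and the test signals are likewise hypothetical.

```python
import random


def nlms_echo_canceller(far_end, mic, control, taps=4, step=0.5, eps=1e-8):
    """Return (residual, coefficients) for an NLMS-adapted FIR echo canceller.

    control[n] plays the role of the speech detector output: 0.0 (or a very
    small value) freezes adaptation, a larger value allows it.
    """
    w = [0.0] * taps            # filter coefficients of the adaptive filter
    history = [0.0] * taps      # recent far-end samples, newest first
    residual = []
    for x, d, c in zip(far_end, mic, control):
        history = [x] + history[:-1]
        replica = sum(wk * xk for wk, xk in zip(w, history))
        e = d - replica         # output of subtractor 4
        gated = c * e           # multiplier 35 scales the error fed back
        norm = sum(xk * xk for xk in history) + eps
        w = [wk + step * gated * xk / norm for wk, xk in zip(w, history)]
        residual.append(e)
    return residual, w


# Hypothetical test signals: echo only, no near-end audio.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
true_path = [0.6, -0.3, 0.1]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(true_path) if n - k >= 0)
    for n in range(len(far_end))
]

# Control value 1.0: the coefficients adapt toward the true echo path.
_, adapted = nlms_echo_canceller(far_end, mic, [1.0] * len(far_end))

# Control value 0.0: the update is fully suppressed, the coefficients stay put.
frozen_residual, frozen = nlms_echo_canceller(far_end, mic, [0.0] * len(far_end))
```

Scaling the error rather than switching adaptation on and off lets a small but nonzero detector output slow the update instead of stopping it entirely.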
The echo suppressing apparatus of the first example of related art thus uses the adaptive filter to remove the echo of the far-end signal.
Next, an echo suppressing apparatus of a second example of related art will be described.
The echo suppressing apparatus of the second example of related art modifies a pseudo echo (echo replica signal), which is used to suppress an echo, according to the angle of a hinge in a folding-type mobile phone. Such a configuration is described, for example, in Japanese Patent Laid-Open No. 8-9005.
The echo suppressing apparatus of the second example of related art includes a control signal generator that detects the angle of the hinge and outputs a control signal according to the angle, and an echo controller that suppresses an echo based on the control signal.
The echo controller includes a coefficient selection circuit that holds a plurality of preset echo path tracking coefficients to produce a pseudo echo corresponding to the echo path that varies according to the angle of the hinge and that uses the control signal outputted from the control signal generator as an address signal to select an echo path tracking coefficient; an adaptive control circuit that outputs a pseudo echo modification signal to modify the pseudo echo based on the echo path tracking coefficient selected in the coefficient selection circuit; a pseudo echo generation circuit that generates the pseudo echo based on the pseudo echo modification signal; and a subtraction circuit that subtracts the produced pseudo echo from the output signal of an audio input unit (microphone).
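The address-based selection performed by the coefficient selection circuit can be illustrated with a small lookup table. This sketch covers only the selection stage. The table contents, the number of entries, the 0 to 180 degree range, and the function names are all hypothetical, since the related art does not give them.

```python
# Hypothetical preset table: address -> echo path tracking coefficient.
PRESET_COEFFICIENTS = [0.05, 0.10, 0.20, 0.40]


def control_signal(hinge_angle_deg, max_angle_deg=180.0):
    """Quantize the hinge angle into an address (role of the control signal generator)."""
    levels = len(PRESET_COEFFICIENTS)
    clamped = min(max(hinge_angle_deg, 0.0), max_angle_deg)
    # The fully open position maps to the last address instead of overflowing.
    return min(int(clamped / max_angle_deg * levels), levels - 1)


def select_coefficient(address):
    """Look up the preset coefficient for an address (coefficient selection circuit)."""
    return PRESET_COEFFICIENTS[address]


closed = select_coefficient(control_signal(0.0))
half_open = select_coefficient(control_signal(100.0))
fully_open = select_coefficient(control_signal(180.0))
```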
Next, an echo suppressing apparatus of a third example of related art will be described.
The echo suppressing apparatus of the third example of related art is based on the technology described, for example, in Japanese Patent Laid-Open No. 2004-056453. The echo suppressing apparatus of the third example of related art uses either the output signal of a microphone (sound pickup device) or the signal obtained by subtracting the output signal of an echo canceller from the output signal of the sound pickup device as a first signal, and uses the output signal of the echo canceller as a second signal. Then, the echo suppressing apparatus estimates the amount of crosstalk of the second signal (far-end signal, echo) that leaks into the first signal (near-end signal), and corrects the first signal based on the estimation result.
The estimated amount of echo crosstalk is the ratio of a quantity based on the amplitude or power of the second signal, taken during a period in which no near-end audio is detected, to a quantity based on the amplitude or power of the first signal. In the echo suppressing apparatus of the third example of related art, the amount of echo crosstalk is estimated from the first and second signals for each frequency component, and the first signal is corrected based on the resulting estimate.
The echo suppressing apparatuses of the first and second examples of related art described above can sufficiently suppress an echo when nonlinear elements, such as distortion generated in the echo path, are small. However, in an actual apparatus, a loudspeaker, for example, has a large nonlinear element. The transfer function of an echo path containing distortion is nonlinear, so that linear echo canceller 3 cannot simulate an accurate transfer function of the echo path. In particular, when a small-sized loudspeaker used in a mobile phone or the like produces sound at high-volume levels, a large amount of distortion contained in the sound limits the suppression of the echo to approximately 20 dB. In this case, the echo is transmitted as the near-end signal and is audible to the far-end speaker, so that it becomes difficult to have a conversation.
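The limit that distortion places on a linear canceller can be seen in a toy Python example. The sketch below assumes a memoryless echo path, models loudspeaker distortion as hard clipping, and fits the best single linear gain by least squares. The clipping model, the signal, and all names are hypothetical, and the resulting figure is not meant to reproduce the 20 dB value cited above.

```python
import math


def clip(x, limit):
    """Hard clipping, used here as a stand-in for loudspeaker distortion."""
    return max(-limit, min(limit, x))


def power(signal):
    return sum(v * v for v in signal) / len(signal)


def residual_power(far_end, echo):
    """Echo power left after subtracting the best single-gain linear replica."""
    gain = sum(x * y for x, y in zip(far_end, echo)) / sum(x * x for x in far_end)
    return power([y - gain * x for x, y in zip(far_end, echo)])


far_end = [math.sin(2 * math.pi * n / 50) for n in range(1000)]
linear_echo = [0.8 * x for x in far_end]            # purely linear echo path
distorted_echo = [clip(x, 0.7) for x in far_end]    # echo path with clipping

# A linear replica removes the linear echo almost completely.
linear_residual = residual_power(far_end, linear_echo)

# With clipping, part of the echo cannot be matched by any linear gain.
distorted_residual = residual_power(far_end, distorted_echo)
suppression_db = 10 * math.log10(power(distorted_echo) / distorted_residual)
```

The suppression in the distorted case stays finite however well the gain is chosen, because the clipped component is not a linear function of the far-end signal.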
In contrast, the third example of related art sufficiently suppresses an echo even when the echo path generates a large amount of distortion. However, the echo suppressing apparatus of the third example of related art requires a large amount of computation because the process for estimating the amount of echo crosstalk is complicated. In particular, it requires a large number of division operations. Further, because the apparatus relies on the speech detection result indicating whether or not the output signal of the microphone contains near-end audio, a wrong speech detection result increases the error in the estimated amount of echo crosstalk, which degrades the first signal corrected based on that estimate. That is, the echo will not be sufficiently suppressed, or the near-end audio will contain a large amount of distortion. This is particularly likely when the echo suppressing apparatus is used in an environment in which near-end audio is inputted together with high-level noise (near-end noise), because the error in the speech detection result is then likely to increase.