1. Field of the Invention
The present invention relates to an echo suppressor (voice switch) for removing acoustic echo and preventing howling in a communication device such as a standard telephone, a hands-free telephone, an interphone, a mobile telephone, or a television conference system.
2. Description of the Related Art
A communication device that supports hands-free communication by using a loudspeaker and a microphone instead of a handset has the advantages that both hands of the user are free and that a conference among many persons becomes possible. However, the use of the loudspeaker and the microphone forms a closed loop via acoustic coupling, and if the closed-loop gain exceeds 1, howling usually occurs and communication becomes difficult. Therefore, to secure stable communication in such a device, some measure must be taken to keep the closed-loop gain from exceeding 1.
An echo suppressor is one such measure. The echo suppressor monitors the signal levels in the send path and the receive path and keeps the closed-loop gain from exceeding 1 by inserting a loss into the communication path whose signal is determined not to include speech. The echo suppressor requires less computation than an echo canceller and does not require a large memory capacity for storing adaptive filter coefficients, state variables, etc., so it has the advantage that it can be realized at a low cost.
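The switching behavior described above can be illustrated with a minimal sketch. It is not taken from any cited document: the function names, the 20 dB loss, the -40 dB speech threshold, and the rule applied when both paths or neither path is active are all assumptions chosen for illustration.

```python
import math

def level_db(frame):
    """Mean power of one frame in dB (the floor avoids log of zero)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))

def voice_switch(send_frame, recv_frame, loss_db=20.0, speech_threshold_db=-40.0):
    """Return (send_gain, recv_gain) as linear gain factors.

    A loss is inserted into the path judged to carry no speech.
    Assumption for this sketch: the send path is opened only when it
    alone is active; in every other case the loss stays in the send path.
    """
    loss = 10.0 ** (-loss_db / 20.0)
    send_active = level_db(send_frame) > speech_threshold_db
    recv_active = level_db(recv_frame) > speech_threshold_db
    if send_active and not recv_active:
        return 1.0, loss   # sending direction: attenuate the receive path
    return loss, 1.0       # receiving direction: attenuate the send path

# Hypothetical frames: a loud signal and a near-silent one
loud = [0.5] * 160
quiet = [0.0001] * 160
send_gain, recv_gain = voice_switch(loud, quiet)
```

Because one of the two paths always carries the inserted loss, the product of the gains around the loop is reduced by that loss, which is how such a switch keeps the closed-loop gain from exceeding 1 as long as the loss is larger than the excess loop gain.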
It is necessary, however, for the echo suppressor to take the following points into account so as to establish effective control:
(1) If the acoustic echo of the received signal, which includes the speech of a far-end user and is output from the loudspeaker, sneaks into the microphone and is superimposed onto the sending signal, which includes the speech of a near-end user, it is hard to distinguish the two signals. It is therefore difficult to determine the communication direction in an environment where acoustic echo is prone to occur. If the echo suppressor erroneously determines that the acoustic echo is the speech of the near-end user and switches, the speech of the far-end user included in the received signal is interrupted midway and communication quality is degraded.
(2) If it is determined that the received signal does not include any speech, the communication direction must be switched from the receiving direction to the sending direction as soon as possible so as to secure simultaneous conversation. However, if the echo suppressor is switched at high speed, the end of a word in the speech of the far-end user included in the received signal is cut off. This phenomenon is called receiving blocking. The high-speed switching also causes the acoustic echo to be transmitted to the far end, where the far-end user perceives it. If the echo suppressor on the far end determines that the acoustic echo is the speech of the near-end user and switches the communication direction from the sending direction to the receiving direction, the speech subsequently input by the far-end user is interrupted by the echo suppressor on the far end. This phenomenon is called sending blocking.
Conventionally, a variety of techniques have been proposed to prevent the occurrence of receiving blocking and sending blocking:
(1) A technique that detects voice activity in an input signal, counts with a counter how long a state (active voice state or non-active voice state) persists, and determines that the state is the active voice state or the non-active voice state only when the counted duration is not less than a predetermined time period (Jpn. Pat. No. 3,466,050).
(2) A technique that provides a provisional state between a sending state and a receiving state (The Institute of Electronics, Information and Communication Engineers Transactions, Vol. J80A No. 4 pp. 587-596, April, 1997).
(3) A technique that adaptively changes a threshold in response to a maximum value of the input signal power and the difference between the maximum value and a minimum value of the input signal power, and distinguishes between a speech section and a non-speech section of the input signal on the basis of the threshold (Jpn. Pat. No. 3,160,228).
(4) A technique that calculates a threshold for speech section detection by taking a signal-to-noise ratio into account (Jpn. Pat. Appln. KOKAI Publication No. 2002-64,618).
In the prior art (1), however, the receiving state or the sending state lasts for a specified time period regardless of the power of the speech included in the input signal. Accordingly, in a device having a user volume controller, if the user changes the volume setting, the communication state becomes unnatural and communication quality deteriorates.
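The counter-based determination of the prior art (1) can be sketched roughly as follows. This is an illustrative reading of the description above, not the method of the cited patent; the function name `confirm_states`, the frame-based counting, and the value of `hold_frames` are assumptions.

```python
def confirm_states(raw_flags, hold_frames=3):
    """Confirm a state change only after it persists for hold_frames frames.

    raw_flags: per-frame voice activity decisions (True = active voice).
    Returns the confirmed state for each frame. The hold time is fixed
    and does not depend on the signal power (assumption of this sketch).
    """
    confirmed = False  # start in the non-active voice state
    count = 0          # consecutive frames that disagree with the confirmed state
    out = []
    for flag in raw_flags:
        if flag == confirmed:
            count = 0
        else:
            count += 1
            if count >= hold_frames:
                confirmed = flag
                count = 0
        out.append(confirmed)
    return out

raw = [True, False, True, True, True, False, False, False]
states = confirm_states(raw)
# states == [False, False, False, False, True, True, True, False]
```

In this sketch the isolated active frame at the start is ignored, and each confirmed state outlasts the raw decision by a fixed number of frames whatever the speech power is, which corresponds to the drawback noted above.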
The prior art (2) suppresses the background noise on the far end to not more than the background noise level on the near end; it is therefore apt to be affected by fluctuations in the background noise level and to operate unstably. This is a particular problem in a mobile communication terminal, which is frequently used in environments where the background noise tends to vary.
In a device having a user volume controller, the signal power is prone to overflow depending on how the user changes the volume. The prior art (3), however, is vulnerable to overflow of the signal power, and its reference threshold is not changed even when the user operates the volume controller, so that speech section detection tends to become unstable.
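As a rough illustration of the kind of threshold used in the prior art (3), the sketch below derives a threshold from the maximum value of the signal power and the difference between the maximum and minimum values. The specific formula (maximum minus a fixed fraction of the difference), the `ratio` parameter, and the function names are assumptions made for this sketch; the cited patent may compute the threshold differently.

```python
def adaptive_threshold(powers_db, ratio=0.5):
    """Threshold from the maximum power and the max-min difference.

    Assumed form for illustration: threshold = max - ratio * (max - min).
    """
    p_max = max(powers_db)
    p_min = min(powers_db)
    return p_max - ratio * (p_max - p_min)

def speech_sections(powers_db, ratio=0.5):
    """Mark each frame as speech (True) when its power exceeds the threshold."""
    threshold = adaptive_threshold(powers_db, ratio)
    return [p > threshold for p in powers_db]

# Hypothetical per-frame powers in dB
powers = [-60.0, -58.0, -20.0, -18.0, -59.0]
threshold = adaptive_threshold(powers)   # -39.0
flags = speech_sections(powers)          # [False, False, True, True, False]
```

If the measured power saturates at a ceiling, the maximum value in this sketch is pinned at that ceiling, so a threshold of this form no longer tracks the actual speech level.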
In the prior art (4), the signal-to-noise ratio changes when the signal power overflows, so that the threshold for speech detection takes an inappropriate value. In addition, taking the signal-to-noise ratio into account increases the amount of calculation.