The present invention relates to a talk deciding system used in voice communication and the like and, more particularly, to a talk deciding system that accurately makes talk decisions, such as deciding a double talk.
Talk session systems equipped with a speaker and a microphone are becoming widespread as a means of attending a talk session (communication conference) from a remote location. In such a talk session system, the voice picked up by the microphone is transmitted to the destination (far-end side), and the voice received from the far-end side is emitted from the speaker of the user's own device (near-end side).
However, in the talk session system, the speaker and the microphone are provided in the same space. For this reason, when the voice received from the far-end side is emitted from the speaker, this voice is picked up by the microphone and sent back to the far-end side. As a result, noise such as an echo is generated.
Therefore, as shown in Patent Literature 1, a talk session system having an echo canceller function has been proposed. The echo canceller of this system emits the voice received from the far-end side from the speaker and also inputs this voice into an adaptive filter. The adaptive filter filters the voice received from the far-end side by using filter coefficients that estimate the transmission path from the speaker to the microphone, and thereby generates an artificial regression voice. The echo canceller cancels the echo component by subtracting this artificial regression voice from the voice picked up by the microphone.
[Patent Literature 1] JP-A-3-218150
However, the echo canceller of Patent Literature 1 cannot completely cancel the echo component. In other words, the adaptive filter generates the artificial regression voice as described above, but this artificial regression voice is not perfectly identical to the voice that actually travels from the speaker to the microphone, and thus an uncancelled component remains.
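The echo canceller described above, and the component it leaves behind, can be illustrated with a minimal sketch. Patent Literature 1 is not quoted here as naming an adaptation rule, so the normalized LMS (NLMS) update used below is an assumption, as are the function names, the step size `mu`, and the simulated five-tap path `true_path`. The filter is deliberately given fewer taps than the simulated path; this is only one conceivable reason why the artificial regression voice differs from the actual echo.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=3, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the microphone signal.

    NLMS is an assumed adaptation rule, chosen only for illustration.
    """
    w = [0.0] * taps        # filter coefficients (estimate of the speaker-to-microphone path)
    recent = [0.0] * taps   # latest far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        recent = [x] + recent[:-1]
        # Artificial regression voice: far-end voice filtered by the estimated path.
        estimate = sum(wi * xi for wi, xi in zip(w, recent))
        e = d - estimate    # picked-up voice minus artificial regression voice
        norm = sum(xi * xi for xi in recent) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, recent)]  # NLMS update
        residual.append(e)
    return residual, w

def mean_power(signal):
    """Average squared amplitude of a list of samples."""
    return sum(s * s for s in signal) / len(signal)

def simulate(n=4000, seed=0):
    """Far-end noise and its echo through a hypothetical five-tap path."""
    rng = random.Random(seed)
    true_path = [0.6, 0.3, -0.1, 0.05, 0.02]   # hypothetical, not from the source
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    mic = [sum(h * far_end[i - k] for k, h in enumerate(true_path) if i - k >= 0)
           for i in range(n)]
    return far_end, mic

if __name__ == "__main__":
    far_end, mic = simulate()
    residual, _ = nlms_echo_cancel(far_end, mic)
    print("echo power:    ", mean_power(mic[-1000:]))
    print("residual power:", mean_power(residual[-1000:]))
```

In this toy setup the residual power is much smaller than the echo power but does not vanish, because the three-tap filter cannot represent the last two taps of the simulated path.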
Therefore, it is conceivable to suppress the gain of the microphone on the near-end side when the talker on the far-end side is talking but the talker on the near-end side is not talking (referred to as a “far-end side single talk” hereinafter). In this case, the accuracy of detecting the far-end side single talk becomes a problem. In other words, when noise or a sudden sound occurs on the near-end side, the state is often mistakenly decided to be one in which both the far-end side and the near-end side are talking (referred to as a “double talk” hereinafter) or one in which only the near-end side is talking (referred to as a “near-end side single talk” hereinafter). As a result, the gain of the microphone on the near-end side is not suppressed, and thus the echo cannot be cancelled.
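The misdecision described above can be reproduced with a deliberately naive sketch. It is not the deciding method of the invention or of Patent Literature 1: the power-threshold rule, the function names `classify_talk` and `mic_gain`, the threshold, and the suppressed gain value are all assumptions chosen only to show how a sudden near-end sound defeats a simple decision.

```python
def frame_power(frame):
    """Average squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def classify_talk(far_frame, near_frame, threshold=0.01):
    """Naive talk decision from frame power alone (assumed rule, for illustration)."""
    far_active = frame_power(far_frame) > threshold
    near_active = frame_power(near_frame) > threshold
    if far_active and near_active:
        return "double talk"
    if far_active:
        return "far-end side single talk"
    if near_active:
        return "near-end side single talk"
    return "silence"

def mic_gain(state, suppressed=0.1):
    """Suppress the near-end microphone only during far-end side single talk."""
    return suppressed if state == "far-end side single talk" else 1.0

# Far-end talker active; near-end side carries only a small residual echo.
far = [0.5] * 10
quiet_near = [0.02] * 10
# Same situation, but one sudden sound (e.g. a door slam) hits the microphone.
noisy_near = [0.02] * 9 + [1.0]

for near in (quiet_near, noisy_near):
    state = classify_talk(far, near)
    print(state, mic_gain(state))
```

With the quiet frame, the state is decided as far-end side single talk and the gain is suppressed. With the single loud sample, the same rule decides a double talk, the gain stays at 1.0, and the residual echo passes to the far-end side.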