1. Field of the Invention
The present invention relates to an apparatus, a method, and a computer program product for speech processing.
2. Description of the Related Art
In recent years, speech translation systems and the like, which support cross-lingual communication by translating input speech in a source language into a target language, have been under development as one type of speech processing apparatus that processes input speech.
In the speech translation system, speech processing must be executed for each speaker because the speech of a plurality of speakers is input to the system. A technique has been proposed for specifying the direction in which the speaker of each input speech is present and for deciding a translation direction by using a movable microphone or a gyrosensor, as disclosed in, for example, JP-A 2005-141759 (KOKAI).
The technique disclosed in JP-A 2005-141759 (KOKAI), however, has the problems of malfunction and complicated operation. This is because the technique is unable to perform the speech processing for every input sound when a sound that should not be processed is present, such as surrounding noise or a nod given by a counterpart in response to the speaker.
With the technique disclosed in JP-A 2005-141759 (KOKAI), the speaker is switched between an operator and a counterpart by moving the main body of the speech translation system or the microphone toward the operator or the counterpart. This switching operation, however, must be performed for every conversation and may get in the way of natural conversation. JP-A 2005-141759 (KOKAI) discloses a speaker-switching method using a microphone array; however, the problem that unnecessary speech may be processed remains unsolved.
As another method of determining the speaker, a technique for allowing a user to explicitly designate a speaker is disclosed in JP-A 2003-295892 (KOKAI). Specifically, the user turns on a switch when the user's own speech is input, and turns off the switch to input a counterpart's speech. The technique disclosed in JP-A 2003-295892 (KOKAI) makes it possible to determine the translation language by a single switch operation and can, therefore, improve the operability of the apparatus.
The method disclosed in JP-A 2003-295892 (KOKAI), however, has the problem that unnecessary voices are processed, which may cause malfunction, for the following reasons. With the method disclosed in JP-A 2003-295892 (KOKAI), the duration of the sound to be processed can be designated for the user's speech by turning on the switch. However, when the user turns off the switch, all input voices are processed. The problem results from the absence of a method of appropriately setting a voice duration for the counterpart's speech.