1. Field of the Invention
The present invention relates to a device and method capable of providing a result of real-time automatic interpretation of continuous speech as the speech occurs, and more particularly, to a device and method of simultaneous interpretation based on real-time extraction of an interpretation unit. Depending on the characteristics of the real-time speech, the device and method can provide a real-time interpretation result not only for a normal spoken clause, but also for a spoken sentence that is normal but too long, for a series of spoken sentences each of which is normal but too short to be translated correctly by conventional means, and for a sentence fragment that is not a normal sentence.
2. Discussion of Related Art
Most automatic translation and automatic interpretation devices currently being released treat a sentence as the unit of interpretation/translation, and thus the basic unit of input speech is a sentence.
In some cases, when several sentences are input, the input is first broken into sentence units according to simple sentence-segmentation rules, and translation is then performed for each sentence unit.
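The rule-based segmentation described above can be illustrated with a minimal sketch. The single punctuation rule and the names `split_sentences` and `translate_by_sentence` are assumptions made for illustration only; the conventional devices are not stated to use this particular rule, and the `translate` argument stands in for whatever per-sentence translator is available.

```python
import re

# Assumed rule (illustrative only): a sentence ends at '.', '!' or '?'
# when that mark is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Break input text into sentence units using one simple punctuation rule."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def translate_by_sentence(text, translate):
    """Apply a caller-supplied translate function to each sentence unit in turn."""
    return [translate(s) for s in split_sentences(text)]
```

A rule of this kind relies on sentence-final punctuation being present in the input, which is generally not the case for raw continuous speech.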
Consequently, conventional devices aim to faithfully provide accurate translation results for each sentence unit. In most cases, a sentence unit can be translated faithfully by analyzing only that sentence unit and generating high-quality bilingual text for it.
In the case of automatic interpretation and translation devices that perform interpretation or translation for each sentence unit, the users are aware of the automatic interpretation/translation environment. The users therefore speak in a manner that is suitable for automatic interpretation and translation, so that communication through the devices is conducted with sentence units as the units of speech.
However, when automatic interpretation/translation is attempted for real-time continuous speech such as a phone conversation, a lecture, or a presentation, the conventional assumption that the unit of input speech is a sentence often does not hold.
In the case of the conventional automatic interpretation and translation devices mentioned above, a finish button or pause information is used to finish an input. When the finish button is pressed or a pause of a predetermined length or longer occurs, the input of sentences or speech is considered to be finished, and the corresponding speech or sentences are treated as the sentences to be translated.
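The pause-based end-of-input decision described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any particular device: it assumes that a speech recognizer has already produced words with start and end times, and the `Word` type, the `segment_by_pause` function, and the 0.5-second default threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds (assumed to come from a recognizer)
    end: float    # end time in seconds


def segment_by_pause(words, min_pause=0.5):
    """Group words into translation units.

    A unit is closed whenever the silence between the end of one word and
    the start of the next is at least min_pause seconds.
    """
    units, current = [], []
    prev_end = None
    for w in words:
        if current and w.start - prev_end >= min_pause:
            units.append(" ".join(current))
            current = []
        current.append(w.text)
        prev_end = w.end
    if current:
        units.append(" ".join(current))
    return units
```

Because the boundary decision uses only timing, a unit produced this way may hold several sentences, part of a sentence, or only an interjection.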
However, when interpretation/translation is performed for real-time speech, the finish button cannot be used, and a pause, which is a phonetic feature, is still used as the criterion for determining a sentence unit.
When a pause is used as the criterion for determining a translation unit in this way, the resulting speech segment is often not a sentence unit. For example, a long stretch of speech consisting of several sentences may be spoken in one breath, a single sentence may be spoken across multiple breaths, speech may end without completing a sentence, or meaningless interjections may be made frequently. In these cases, a correct translation result cannot be generated using a conventional automatic translation methodology in which translation is performed for each sentence unit.