Simultaneous translation apparatuses developed in recent years generate text data by recognizing input speech, translate the text data, and sequentially output the translation result as synthesized speech and a text image, whereby the input speech is translated and output in real time.
For a simultaneous translation apparatus, a shorter time between speech input and translation result output (a higher level of simultaneousness) is preferable. The same holds for other types of translation apparatuses: a shorter time between text data input and translation result output is preferable.
One way to shorten the time required for outputting the translation result is to translate input text data consecutively on a per-phrase basis (hereinbelow, a “phrase” refers to a word or a group of a plurality of words). However, in this case, because the context of each phrase is ignored in the translation, the translation accuracy decreases.
On the other hand, when translation of input text data is initiated only upon confirmation of the end of a sentence, the translation accuracy can be increased. However, in this case, the time required for outputting the translation result increases. In particular, when text data generated by collecting and recognizing the speech of a lecturer during a lecture or the like is translated simultaneously, sentences tend to be long and their ends are often unclear, so the time required for outputting the translation result increases further.
Patent Document 1 proposes a text data segmentation apparatus to solve this problem. The text data segmentation apparatus segments text data, before translation, between a phrase and its following phrase when the probability that the word order of these two phrases is maintained (that is, not reversed) before and after translation is a certain percentage or higher. This text data segmentation apparatus can thus segment text data at a location where the word order does not change after translation. Accordingly, simply by sequentially translating the text data segmented by this text data segmentation apparatus, the time required for outputting the translation result can be decreased and the translation accuracy can be increased.
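The segmentation rule described above can be illustrated with a minimal sketch. It is not the implementation of Patent Document 1: the function names, the 0.8 threshold, and the probability values in the toy table are all hypothetical, and how the word-order probability is actually estimated is not specified here. The sketch only shows the decision itself: cut between two adjacent phrases when the probability that their order is maintained after translation is at or above a threshold.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def segment_phrases(
    phrases: Sequence[str],
    order_kept_prob: Callable[[str, str], float],
    threshold: float = 0.8,  # hypothetical value for "a certain percentage"
) -> List[List[str]]:
    """Group phrases into segments that can be translated one after another.

    A boundary is placed between a phrase and the following phrase when
    order_kept_prob(phrase, following) is at or above the threshold.
    """
    segments: List[List[str]] = []
    current: List[str] = []
    for i, phrase in enumerate(phrases):
        current.append(phrase)
        is_last = i == len(phrases) - 1
        # Close the segment at the end of the input, or where the word
        # order is likely to be maintained across the boundary.
        if is_last or order_kept_prob(phrase, phrases[i + 1]) >= threshold:
            segments.append(current)
            current = []
    return segments


# Illustrative, made-up probabilities (assumption; not measured data).
TOY_PROBS: Dict[Tuple[str, str], float] = {
    ("I went", "to the station"): 0.3,
    ("to the station", "because"): 0.9,
    ("because", "it rained"): 0.4,
}


def toy_prob(left: str, right: str) -> float:
    """Look up the toy probability; unknown pairs are never split."""
    return TOY_PROBS.get((left, right), 0.0)


result = segment_phrases(
    ["I went", "to the station", "because", "it rained"], toy_prob
)
print(result)
```

With these toy values, the only boundary falls between “to the station” and “because”, so two segments are passed to the translator in turn. Raising the threshold yields fewer, longer segments (more context, longer delay); lowering it yields more, shorter segments (shorter delay, less context).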