1. Field of the Invention
This invention relates to multilingual translation apparatuses that translate a sentence, character strings, or sentences in an original language into those in a target language, and more particularly, to a technique for a translation memory that utilizes a TRIE structure.
2. Description of the Related Art
The machine translation techniques that translate the original language into the target language include the straight word-for-word direct translation technique, the analysis based translation technique and interlingua method, the statistics-based translation technique, the example sentence based translation technique, and the like.
In the straight word-for-word direct translation technique, the respective words that compose a sentence in the original language are directly translated into words in the target language, and the translated text is composed in the target language according to statistical data or predetermined rules.
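The direct technique can be pictured with a minimal sketch. The lexicon, the adjective list, and the single reordering rule below are hypothetical illustrations, not taken from any cited document; a practical system would need a full bilingual dictionary and many more rules.

```python
# Hypothetical toy lexicon (English -> Spanish), for illustration only.
LEXICON = {"the": "el", "red": "rojo", "car": "coche"}
# Hypothetical list of target-language adjectives used by the rule below.
ADJECTIVES = {"rojo"}


def translate_direct(sentence):
    """Translate word by word, then apply one predetermined reordering rule."""
    # Step 1: replace each source word with its dictionary entry.
    # Words missing from the lexicon are passed through unchanged.
    words = [LEXICON.get(w, w) for w in sentence.lower().split()]

    # Step 2: assumed rule: an adjective follows the noun it modifies,
    # so swap each adjective with the word immediately after it.
    i = 0
    while i < len(words) - 1:
        if words[i] in ADJECTIVES:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    return " ".join(words)


print(translate_direct("the red car"))   # el coche rojo
print(translate_direct("the blue car"))  # el blue coche
```

The second call shows why accuracy is hard to assure with this approach: any word or construction not covered by the lexicon or the rules is left untranslated or misplaced.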
In the analysis based translation technique, morphological analysis, syntactic analysis, and semantic analysis are performed on the sentence in the original language, the results are converted into the semantics, the syntax, and the morphemes of the target language, and the translated sentence is composed in the target language. In the interlingua method, the sentence in the original language is analyzed and converted into an interlanguage, and the translated sentence is generated in the target language from the converted interlanguage.
In the statistics-based translation technique, the original language is translated into the target language by use of a language model and a translation model. In the example sentence based translation technique, the input sentence in the original language is translated into a sentence in the target language by referring to a knowledge base developed by learning translated example sentences, in a manner similar to the process by which a human learns a foreign language.
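For the statistics-based technique, the combination of the two models is commonly written in the noisy-channel form below. The symbols are assumptions introduced here for illustration: f denotes the sentence in the original language, e a candidate sentence in the target language, P(e) the language model, and P(f | e) the translation model.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(e)\, P(f \mid e)
```

The second equality follows from Bayes' rule, since P(f) does not depend on e and can be dropped from the maximization.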
Among the above-mentioned translation techniques, the interlingua method is disclosed in several patent documents. Japanese Patent Application Publication No. 6-32508 (hereinafter referred to as Document 1), for example, provides an automatic translation system that can translate from one original language into two or more target languages simultaneously with a single interlanguage. This translation system is said to make it possible to automatically translate a document written in the original language into multiple target languages at high levels of document analysis, which enables a sophisticated understanding of the message and a definite expression of the knowledge, and of “translation quality/time”.
Japanese Patent Application Publication No. 62-251875 (hereinafter referred to as Document 2) describes an electronic translation apparatus that extracts a standardized interlanguage on the basis of information related to the input original language, and generates information related to the target language that corresponds to the extracted standardized interlanguage.
Japanese Patent Application Publication No. 5-290082 (hereinafter referred to as Document 3) provides a translation pattern for machine translation that the user can easily compose and efficiently retrieve. Sentence patterns are stored in a retrieval dictionary having a tree structure, and the input text sentence is checked against the retrieval dictionary. If the check is successful, the corresponding sentence pattern in another language is obtained and used to compose the text sentence in the target language. If the check is not successful, translation is performed with the language analysis and generation method of the machine translation technique.
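The lookup-then-fallback flow described for Document 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Document 3: the class names, the word-level tree, the placeholder target string, and the fallback function are all hypothetical.

```python
class Node:
    """One node of the tree; each edge is labeled with a source-language word."""

    def __init__(self):
        self.children = {}
        self.target = None  # target-language pattern if a source pattern ends here


class RetrievalDictionary:
    """Hypothetical tree-structured dictionary of sentence patterns."""

    def __init__(self):
        self.root = Node()

    def add(self, source_pattern, target_pattern):
        node = self.root
        for word in source_pattern.split():
            node = node.children.setdefault(word, Node())
        node.target = target_pattern

    def lookup(self, sentence):
        """Return the stored target pattern, or None if the check fails."""
        node = self.root
        for word in sentence.split():
            node = node.children.get(word)
            if node is None:
                return None
        return node.target


def translate(sentence, dictionary, fallback):
    """Use the pattern on a successful check; otherwise fall back to analysis-based MT."""
    hit = dictionary.lookup(sentence)
    return hit if hit is not None else fallback(sentence)


def analysis_based_mt(sentence):
    # Stand-in for the language analysis and generation method (assumed).
    return "<analyzed: " + sentence + ">"


d = RetrievalDictionary()
d.add("improve the contact", "TARGET(improve the contact)")  # placeholder target
print(translate("improve the contact", d, analysis_based_mt))
print(translate("improve the function", d, analysis_based_mt))
```

Because sentences that share leading words share a path from the root, a check costs one step per word of the input regardless of how many patterns are stored.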
Moreover, the techniques of the translation memory include the character index method and the word index method. With the character index method, the translation memory is realized by creating character indexes for all the characters included in the bilingual corpus of translation pairs. With the word index method, the translation memory is realized by creating word indexes for all the words included in the bilingual corpus of translation pairs.
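The two index methods differ only in the unit that is indexed. The sketch below assumes an inverted index that maps each unit to the identifiers of the translation pairs containing it; the index layout, the function names, and the tiny corpus are assumptions made for illustration, since the text above does not specify them.

```python
from collections import defaultdict


def char_units(text):
    """Units for the character index method: every non-space character."""
    return [c for c in text if not c.isspace()]


def word_units(text):
    """Units for the word index method: whitespace-separated words."""
    return text.split()


def build_index(pairs, tokenize):
    """Map each unit to the set of translation-pair ids whose source contains it."""
    index = defaultdict(set)
    for pair_id, (source, _target) in enumerate(pairs):
        for unit in tokenize(source):
            index[unit].add(pair_id)
    return index


def candidates(query, index, tokenize):
    """Return ids of pairs whose source contains every unit of the query."""
    units = tokenize(query)
    if not units:
        return set()
    return set.intersection(*(index.get(u, set()) for u in units))


# Hypothetical bilingual corpus of translation pairs.
PAIRS = [("good morning", "bonjour"), ("good night", "bonne nuit")]

char_index = build_index(PAIRS, char_units)
word_index = build_index(PAIRS, word_units)

print(candidates("night", word_index, word_units))  # {1}
print(candidates("mon", char_index, char_units))    # {0}
```

Note that whitespace splitting in `word_units` presupposes a language that separates words with spaces, and that a character index returns only candidates, which must still be verified against the query order.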
It is to be noted that the conventional translation techniques have the following drawbacks. The straight word-for-word direct translation technique can be built relatively easily, but its translation accuracy cannot be assured. The interlingua method can support multilingual machine translation, but the techniques of syntactic analysis and semantic analysis are immature, and practical use is very difficult. Besides, highly advanced language analysis and an interlanguage generation technique are indispensable for the interlingua method. As a result, the interlingua method cannot be applied to translation in many fields, its functions are difficult to enhance, and the translation tool is very difficult to maintain.
The technique disclosed in Document 3 utilizes the tree structure to store the sentence patterns therein. If a part pattern is found at the time of checking the sentence pattern, the part pattern is replaced with one variable, which enlarges the expression range of the pattern. However, if a word included in the sentence is not registered in the tree structure as a part pattern, there arises a problem in that the sentence cannot be associated with a pattern even if a pattern corresponding to the sentence exists. In the tree structure shown in FIG. 7 of Document 3, the pattern of “improve” is associated with the sentence “improve the contact”. However, if “function”, which is a part pattern in “improve the function”, is not registered, the pattern of “improve” cannot be associated. Further, even if the part pattern of “function” is registered in a lower tree of the tree structure, the pattern of “improve” cannot be associated. This causes another problem in that a large number of sentence patterns are necessary to cover a wide range of expressions.
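The registration problem described above can be reproduced with a deliberately simplified sketch. It does not model the actual tree of FIG. 7 of Document 3; the flat sets, the `<X>` variable notation, and the placeholder target string are assumptions chosen only to show how an unregistered part pattern blocks a sentence pattern that would otherwise apply.

```python
# Hypothetical registered part patterns; "the function" is deliberately absent.
PART_PATTERNS = {"the contact"}

# Hypothetical sentence pattern with one variable slot <X>.
# The target side is a placeholder, not a real translation.
SENTENCE_PATTERNS = {("improve", "<X>"): "TARGET(improve <X>)"}


def match_pattern(sentence):
    """Return the target pattern if the head word and a registered part pattern match."""
    head, _, rest = sentence.partition(" ")
    # The remainder can be replaced with the variable only if it is registered.
    if rest not in PART_PATTERNS:
        return None
    return SENTENCE_PATTERNS.get((head, "<X>"))


print(match_pattern("improve the contact"))   # TARGET(improve <X>)
print(match_pattern("improve the function"))  # None
```

The second call fails although a pattern for “improve” exists, which is why covering a wide range of expressions in this scheme requires registering a large number of patterns.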
The translation memory that utilizes the character index method has difficulty in performing real-time translation. The translation memory that utilizes the word index method has the same difficulty in real-time translation and, in addition, cannot be applied to a multilingual translation system.