Machine translation systems automatically translate a sentence in a first language (the original language) into a sentence in a second language (the target language). In the translation processing of such a system, the input original sentence is first divided into predetermined processing units, such as words or phrases, by morphological analysis or sentence structure analysis. Next, the applicable translation rule and the corresponding translation word (or translation phrase) are determined for each processing unit by searching a translation dictionary. The translation words are then combined according to predetermined rules to generate a translation sentence. In this way, the translation sentence corresponding to the input original sentence is obtained.
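The three steps described above (division into processing units, dictionary lookup, and combination) can be sketched as follows. This is a minimal illustration only, not the processing of any actual system: whitespace splitting stands in for morphological analysis, the dictionary contents are a toy assumption, and the names `TOY_DICTIONARY`, `split_into_units`, `lookup_candidates`, and `translate` are hypothetical.

```python
from typing import Dict, List

# Toy translation dictionary (assumed contents, for illustration only).
# Each original word maps to its translation word candidates in default order.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "computer": ["Konpyuta", "Konpyutah", "Keisanki"],
}


def split_into_units(sentence: str) -> List[str]:
    """Stand-in for morphological analysis: lowercase and split on whitespace."""
    return sentence.lower().split()


def lookup_candidates(unit: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Return the translation word candidates for one processing unit.

    A unit that is not in the dictionary is passed through unchanged.
    """
    return dictionary.get(unit, [unit])


def translate(sentence: str, dictionary: Dict[str, List[str]] = TOY_DICTIONARY) -> str:
    """Divide, look up, and combine; the first candidate is always chosen."""
    units = split_into_units(sentence)
    chosen = [lookup_candidates(unit, dictionary)[0] for unit in units]
    # Stand-in for the predetermined combination rules: join in original order.
    return " ".join(chosen)
```

In this sketch the first candidate is always taken, which is exactly the point at which a real system needs some way of choosing among several candidates.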
To achieve high-accuracy machine translation, it is important to use a dictionary suited to the translation. To cope with a wide variety of original sentences, the dictionary generally holds a plurality of translation word candidates for each original word. Accordingly, the user must select, from the plurality of translation word candidates, the one translation word that suits the user's intention (preference, field, or purpose).
If the machine translation system learns from the user's selection operation, subsequent translation word selection can match the user's intention. When a translation word unsuitable for the user's intention is generated, a learning operation for the translation word is executed. In the learning operation, the user selects a suitable translation word from the other translation word candidates and indicates the selected translation word to the system. Once the learning operation has been executed, this translation word is preferentially selected thereafter.
The function of selecting a translation word suitable for the user's intention through the learning operation is called “translation word learning”. For example, “Konpyuta”, “Konpyutah”, and “Keisanki” exist as Japanese translation word candidates for the English word “computer”. Which of these candidates is selected depends on the user's preference, the field, and the purpose of use.
In a prior-art machine translation system, translation word learning is realized through the user's learning operation. Specifically, when a plurality of translation word candidates exist for the same original word, the candidates are presented to the user, and the user selects one translation word from among them. In response to the user's selection, the selected translation word is stored in the system in association with the original word. Thereafter, when the original word is translated, the system preferentially selects the stored translation word. This translation word learning is described in Japanese Patent Disclosure (Kokai) PH9-81572 “Translation device and dictionary priority setting method” and Japanese Patent Disclosure (Kokai) PH8-101836 “Learning method for machine translation”. With this method, if the system initially selects a large number of unsuitable translation words, the user must perform the learning operation a correspondingly large number of times. As a result, a heavy burden is placed on the user.
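The user-driven learning operation described above might be sketched as follows. This is an illustrative assumption rather than the method of the cited disclosures: the class name `LearnedPreferences` and its methods `learn` and `select` are hypothetical, and the stored preference is a simple in-memory mapping.

```python
from typing import Dict, List


class LearnedPreferences:
    """Stores, per original word, the translation word the user selected."""

    def __init__(self) -> None:
        self._preferred: Dict[str, str] = {}

    def learn(self, original_word: str, selected_translation: str) -> None:
        """Record the user's selection (one learning operation)."""
        self._preferred[original_word] = selected_translation

    def select(self, original_word: str, candidates: List[str]) -> str:
        """Prefer the stored translation word; otherwise use the first candidate."""
        preferred = self._preferred.get(original_word)
        if preferred in candidates:
            return preferred
        return candidates[0]


candidates = ["Konpyuta", "Konpyutah", "Keisanki"]
preferences = LearnedPreferences()
before_learning = preferences.select("computer", candidates)  # default candidate
preferences.learn("computer", "Keisanki")                     # user's learning operation
after_learning = preferences.select("computer", candidates)   # stored word is preferred
```

Each unsuitable default requires one call to `learn` in this sketch, which mirrors why the number of learning operations grows with the number of unsuitable translation words.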
Accordingly, as a translation word learning method that does not require troublesome user operations, the translation word is determined automatically from statistical information about a target language document, such as a corpus. In this method, the user prepares in advance a target language document suited to the user's intention, and translation word learning suited to that intention is executed automatically. Specifically, the appearance frequency of each word in the target language document is counted in advance, and each word is stored in a table together with its appearance frequency. When a plurality of translation word candidates is generated for the same original word, the candidate with the highest appearance frequency is selected by referring to the table. This method is described in “Translation word learning method using a single language corpus of a target language” (Proceedings of the 8th Annual Meeting of the Association for Computational linguistics, 2002 Vol. 1, pp 276-280) and Japanese Patent Disclosure (Kokai) P2000-250914 “Machine translation method and device and recording medium recording machine translation program”.
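The frequency-table selection described above can be sketched as follows. This is a minimal illustration, assuming the target language document has already been split into words; the function names `build_frequency_table` and `select_by_frequency` and the sample document are hypothetical and do not come from the cited references.

```python
from collections import Counter
from typing import Iterable, List


def build_frequency_table(target_document_words: Iterable[str]) -> Counter:
    """Count the appearance frequency of each word in the target language document."""
    return Counter(target_document_words)


def select_by_frequency(candidates: List[str], table: Counter) -> str:
    """Select the candidate with the highest appearance frequency.

    A word absent from the table counts as frequency 0. On a tie, max() keeps
    the earliest candidate, so the dictionary order acts as the fallback.
    """
    return max(candidates, key=lambda word: table[word])


# Assumed toy document, already tokenized into words.
sample_document = ["Keisanki", "Konpyuta", "Keisanki"]
frequency_table = build_frequency_table(sample_document)
chosen = select_by_frequency(["Konpyuta", "Konpyutah", "Keisanki"], frequency_table)
```

Here `chosen` is "Keisanki" because it appears twice in the sample document, while "Konpyuta" appears once and "Konpyutah" does not appear at all.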
However, as mentioned above, this automatic translation word learning method determines the translation word by using a single target language document. Accordingly, an unsuitable translation word is often selected for some original words. In such a case, a suitable translation word for the original word might be selected if translation word learning were executed with another target language document. However, even if a plurality of target language documents is prepared in advance, it is difficult for the user to choose the one target language document that yields a suitable translation word. For example, when the user prepares a plurality of target language documents whose contents are similar, the user cannot choose the one useful document without sufficiently understanding the contents of each.
In short, when a plurality of target language documents is prepared, even if the user indicates the one target language document that suits his/her intention, an unsuitable translation word is sometimes output automatically on the basis of the indicated document. Accordingly, a method that reliably selects only suitable translation words by using target language documents is desired.