1. Field of the Invention
The present invention generally relates to a translation system for which high speed processing is required, and in particular to a translation method and system for improving the accuracy of the selection of an appropriate word in machine translation, without incurring any deterioration of the processing efficiency.
2. Description of the Related Art
As a consequence of the expansion of the World Wide Web (WWW) on the Internet, opportunities for using documents expressed in foreign languages have increased. Further, since many users desire to scan documents in their native languages, there is a growing demand for low-priced machine translation software. However, the quality of the text provided by current machine translation software is unsatisfactory, and there are many translation errors.
Since a translation system used over a connection to the Internet must perform the translation process in real time, high speed processing is required, and complicated procedures, such as deep semantic analysis, are difficult to perform. Generally, therefore, such a system is equipped with a dictionary to reduce the number of unknown words, and for document scanning it prepares translations that may be more or less ambiguous but that at least do not stray too far from the point. To avoid complicating the process and to increase the accuracy of a translation, the data structure of such a dictionary tends to be relatively simple, and word translations tend to be registered not only as individual word units (e.g., a single word dictionary), but also as compound word units (e.g., a compound word dictionary). Because such a simple data structure provides only a poor word selection function, when a word in the text is also covered by a translation registered as a compound word unit, selecting the translation registered for the compound word unit frequently results in a better translation.
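The preference for compound word entries described above can be illustrated by the following minimal sketch. It is not taken from any particular translation system; the dictionary contents, the names SINGLE_WORD_DICT, COMPOUND_WORD_DICT, and translate_tokens, and the longest-match strategy are all assumptions made for illustration.

```python
# Hypothetical toy dictionaries (English to French); entries are illustrative only.
SINGLE_WORD_DICT = {"the": "le", "hot": "chaud", "dog": "chien"}
COMPOUND_WORD_DICT = {("hot", "dog"): "hot-dog"}


def translate_tokens(tokens, max_compound_len=3):
    """Translate a token list, preferring the longest compound word entry.

    Assumption: a compound entry, when one matches, is a better choice
    than translating its component words one at a time.
    """
    output = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate compound first, down to length 2.
        for n in range(min(max_compound_len, len(tokens) - i), 1, -1):
            key = tuple(tokens[i:i + n])
            if key in COMPOUND_WORD_DICT:
                output.append(COMPOUND_WORD_DICT[key])
                i += n
                matched = True
                break
        if not matched:
            # Fall back to the single word dictionary; unknown words pass through.
            word = tokens[i]
            output.append(SINGLE_WORD_DICT.get(word, word))
            i += 1
    return output
```

In this sketch, the tokens "hot" and "dog" appearing together are rendered through the compound entry, whereas "dog" appearing alone is rendered through the single word entry, which is the source of the inconsistency discussed next.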
Further, in general, individual sentences are translated in isolation. As a result, a specific word that is used repeatedly at a plurality of locations in the same text may be given a plurality of different translations. For one location, a translation may be selected from an entry in a single word dictionary, and for another location a translation may be selected from an entry in a compound word dictionary.
To resolve this problem, according to a machine translation method disclosed in Japanese Unexamined Patent Publication No. Hei 3-135666, information concerning a translation that is obtained as the result of a dictionary search during a translation process is saved in a main memory and is re-used for the same word, so that the time spent searching a dictionary located in an auxiliary storage device is saved and the translation of the word is consistent. With this method, however, when an incorrect translation is first selected for a word, the incorrect translation is used in all the locations in a document at which that word appears.
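The re-use of a saved translation, and the drawback noted above, can be illustrated by the following minimal sketch. It is not the implementation of the cited publication; the class CachedTranslator, its candidate-list dictionary format, and the naive choice of the first candidate are hypothetical and are introduced only for illustration.

```python
class CachedTranslator:
    """Sketch of saving a dictionary search result in main memory for re-use.

    Assumption: the dictionary maps a word to a list of candidate
    translations, and the first candidate is selected without analysis.
    """

    def __init__(self, dictionary):
        self._dictionary = dictionary  # stands in for the auxiliary storage device
        self._cache = {}               # stands in for the main memory
        self.dictionary_lookups = 0    # counts the (slow) dictionary searches

    def _search_dictionary(self, word):
        self.dictionary_lookups += 1
        candidates = self._dictionary.get(word) or [word]
        return candidates[0]

    def translate(self, word):
        # The first result obtained for a word is re-used at every later
        # location, whether or not that first selection was appropriate.
        if word not in self._cache:
            self._cache[word] = self._search_dictionary(word)
        return self._cache[word]
```

With a hypothetical entry such as {"bank": ["banque", "rive"]}, every occurrence of "bank" is rendered as "banque" after a single dictionary search, even at a location where "rive" would be the appropriate selection.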
According to a method for processing a plurality of sentences, which is disclosed in Japanese Unexamined Patent Publication No. Hei 2-228765, for a document consisting of a plurality of sentences, the inherent ambiguity of each sentence is calculated and translation is initiated with the least ambiguous sentence. The results obtained for a polysemous word in a preceding sentence are used for a succeeding sentence in order to increase the accuracy in the selection of an appropriate translation and in order to provide a consistent translation. This method, however, is premised on the assumption that a translation will be output after all the sentences in the document have been processed, and thus it cannot be employed for a process by which sentences are successively translated from the beginning of a document, as when a translation process is initiated in real time while a system is connected to the Internet.
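The reason why such ambiguity-ordered processing requires the entire document in advance can be illustrated by the following minimal sketch. The ambiguity measure used here (the number of surplus candidate translations in a sentence) and the names sentence_ambiguity and translation_order are assumptions made for illustration; the cited publication may calculate ambiguity differently.

```python
def sentence_ambiguity(sentence, dictionary):
    """Hypothetical ambiguity score: surplus candidate translations per sentence.

    A word with a single candidate (or an unknown word) contributes 0.
    """
    return sum(len(dictionary.get(word) or [word]) - 1 for word in sentence)


def translation_order(sentences, dictionary):
    """Return sentence indices ordered from least to most ambiguous.

    Every sentence must be available before the order can be computed,
    so output cannot begin while the document is still being received.
    """
    return sorted(
        range(len(sentences)),
        key=lambda i: sentence_ambiguity(sentences[i], dictionary),
    )
```

In this sketch, when the first sentence of a document contains a polysemous word and the second does not, the second sentence is placed first in the processing order, which is incompatible with translating sentences successively from the beginning of the document.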