In recent years, with the rapid development of transportation and communication networks, an enormous amount of information has come to be distributed globally. For instance, literary works, business documents, technical articles, and patent documents spread across international borders. Most such information includes linguistic information expressed in a natural language such as English or Japanese. To use or distribute this information, there is a growing need to translate it from its original language into other languages. However, most business-related documents are still translated by translation specialists. This is because existing machine translation techniques are not yet capable of automatically producing translations that are good enough for business use.
In order to help translators produce high-quality translations quickly, techniques for supporting translation work are in demand. One such technique automatically extracts correspondences between words by comparing an original text with its translation. This allows the translation to be verified by comparing words in the translation with their corresponding words in the original text, which is a necessary step in translation work. Such a comparison technique is also useful for saving pairs of an original text and its translation as translation examples for later use.
There exists a technique of receiving one original sentence and one translated sentence and aligning their words with reference to bilingual dictionaries for the original language and the translation language, in order to visually represent the detected correspondences (for example, see Japanese Laid-open Patent Publication No. 2005-339087). There also exists a technique of consulting terminology dictionaries to automatically correct errors in a translation (for example, see Japanese Laid-open Patent Publication No. 05-342259). Further, for receiving and comparing whole documents that include a plurality of sentences, rather than comparing sentence by sentence, there exists a technique of, in response to specification of partial correspondences between original and translated sentences, estimating the other correspondences between the original and translated sentences before and after the specified correspondences in order to reduce the workload on a translator (for example, see International Publication Pamphlet No. WO2004/107203).
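As a rough illustration of what dictionary-based word alignment between one original sentence and one translated sentence can look like, the following minimal sketch links each original word to the first unused translated word that a bilingual dictionary lists for it. It is not the method of any of the cited publications; the function name `align_words`, the toy dictionary, and the greedy first-match strategy are all assumptions made for illustration.

```python
def align_words(source_tokens, target_tokens, dictionary):
    """Return (source_index, target_index) pairs linked by the dictionary.

    Hypothetical helper: greedy, first-match alignment. Each translated
    word is linked to at most one original word.
    """
    alignments = []
    used_targets = set()
    for i, src in enumerate(source_tokens):
        # Translations the dictionary lists for this original word, if any.
        candidates = dictionary.get(src.lower(), set())
        for j, tgt in enumerate(target_tokens):
            if j not in used_targets and tgt.lower() in candidates:
                alignments.append((i, j))
                used_targets.add(j)
                break  # stop at the first matching translated word
    return alignments


# Toy English-to-Japanese dictionary (assumed data, for illustration only).
toy_dictionary = {
    "translation": {"翻訳"},
    "document": {"文書"},
}

source = ["translation", "of", "the", "document"]
target = ["文書", "の", "翻訳"]

pairs = align_words(source, target, toy_dictionary)
for i, j in pairs:
    print(f"{source[i]} <-> {target[j]}")
```

Words without a dictionary entry, such as "of" and "the" here, are simply left unaligned. Note that a sketch of this kind presupposes that the two sentences passed in really are translations of each other.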
However, the techniques disclosed in the above Japanese Laid-open Patent Publications Nos. 2005-339087 and 05-342259 and International Publication Pamphlet No. WO2004/107203 have the drawback that the workload of confirming translated words is not greatly reduced when many sentences need to be compared between an original document and its translation. More specifically, the techniques disclosed in Japanese Laid-open Patent Publications Nos. 2005-339087 and 05-342259 require that accurate correspondences between original sentences and translated sentences be specified, and if those correspondences contain any error, the techniques do not work properly. The technique disclosed in International Publication Pamphlet No. WO2004/107203, on the other hand, requires some accurate correspondences between sentences to be specified as a reference in order to estimate the other correspondences between sentences. Accordingly, an extra workload is placed on the translator.
In particular, it is not always the case that one original sentence corresponds to one translated sentence; one-to-two and two-to-one correspondences also exist. If an original document and its translation have different numbers of sentences, the technique disclosed in International Publication Pamphlet No. WO2004/107203 may estimate correspondences less accurately, which brings about an extra workload of correcting the estimated correspondences.