Technical documents such as reports, papers, and patent applications written by an engineer may be translated into another language by a translator who is not an expert in the field. When the author uses a compound word (a coined word or industry jargon), its meaning may be clear to the expert who wrote the document, but a translator who lacks the technical knowledge on which the usage is premised may not know how to translate it. One method for such a situation examines the corpus frequencies of the target compound word and of compound words that contain partial word strings of the target compound word, and flags a low-frequency word with an alert as an infrequently used compound word, in other words, a coined word (see, for example, JP-A 2001-249921 (KOKAI)).
Even so, corpus frequencies alone may not allow the translator to select appropriate words during translation. For instance, consider a compound word coined from a noun meaning “event” and a sa-hen noun meaning “extraction”.
If the corpus frequency of this compound word is greater than a predetermined threshold, the word is not determined to be a coined word. Suppose that a translator with little expertise has to translate the compound word into a different language, for example English, and that the translations of its two components are assumed to be “event” and “extraction”. The translator then has difficulty determining whether the compound should be rendered as “event extraction”, “extraction from an event”, or “extraction of an event”.
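The frequency check discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited reference: the function names `count_phrase` and `is_flagged_as_coined`, the toy corpus, and the threshold values are all hypothetical, and English glosses stand in for the original component words. Treating a frequency equal to the threshold as "flagged" is also an assumption, since the text only specifies the "greater than" case.

```python
def count_phrase(tokens, phrase):
    """Count how often the token sequence `phrase` occurs contiguously in `tokens`."""
    n = len(phrase)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase)


def is_flagged_as_coined(tokens, compound, threshold):
    """Flag `compound` as a possible coined word unless its frequency exceeds `threshold`."""
    return count_phrase(tokens, compound) <= threshold


# Hypothetical toy corpus, already split into tokens.
corpus = "event extraction is applied . event extraction from logs . event data".split()
compound = ["event", "extraction"]

frequency = count_phrase(corpus, compound)
flagged = is_flagged_as_coined(corpus, compound, threshold=1)
print(frequency, flagged)
```

In this toy corpus the compound occurs often enough that it is not flagged, yet the count says nothing about how the two components relate to each other, which is the difficulty the translator faces.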
Furthermore, as a method of determining whether the compound word is coined, a simple string search for its two component words (“event” and “extraction”), or wild-card matching on a pattern such as “event*extraction”, can be considered. Such methods, however, would also find word strings such as “event data series extraction” and “relevance extracted from event data”, and it is difficult to determine from these word strings whether the compound word is a coined word. Because the translator cannot determine from frequency of use whether the term is a coined word, an inquiry must be sent to the author, an answer received from the author, and the original document proofed. As a result, completing the translation may take a long time.
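The over-matching behavior of wild-card search can be illustrated with a short sketch. The helper `wildcard_hits` and the sample strings are hypothetical, and English glosses stand in for the original word strings; the “relevance extracted from event data” example is omitted because its English gloss may not preserve the original word order.

```python
import re


def wildcard_hits(first, second, texts):
    """Return the texts in which `first` is followed, anywhere later, by `second`."""
    pattern = re.compile(re.escape(first) + r".*" + re.escape(second))
    return [text for text in texts if pattern.search(text)]


# Hypothetical sample strings; only the first is the compound of interest.
texts = [
    "event extraction",
    "event data series extraction",
    "extraction of rules",
]

hits = wildcard_hits("event", "extraction", texts)
print(hits)
```

The pattern accepts both the target compound and the longer string, so a hit count obtained this way does not show whether the compound itself is in established use.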