To obtain more appropriate translation results, a presently available machine translation method uses multiple electronic dictionaries, including domain dictionaries, together with a mechanism for switching among them. Each of the dictionaries carries terms appropriate for processing documents written for a specific domain.
For example, for the sentence “People enjoy cherry blossoms this season” (translation: Hitobito wa, kono kisetsu sakura wo tanoshimi-masu), “this season” should be translated as “kono kisetsu”. On the other hand, in a sports-related context, for the sentences “He is a major leaguer. He hit fifty home runs this season” (translation: Kare wa mejaa-liigaa desu. Kare wa kon shiizun 50 pon houmuran wo uchimashita), “this season” should be translated as “kon shiizun”. In this context, a dictionary for sports is used, in which “kon shiizun” is registered as the translation of “this season”. Thus, where a base dictionary is used, in which general terms (those not related to technical domains) are registered in association with source words, the translation for “this season” is “kono kisetsu”. Where a dictionary prepared for sports is used, the appropriate translation, “kon shiizun”, is obtained. It is therefore apparent that if an appropriate electronic dictionary is not used, an appropriate translation may not be obtained. To achieve more accurate translations, multiple electronic dictionaries, including domain dictionaries, must be available and used as required.
A method for switching among multiple electronic dictionaries, disclosed in Japanese Unexamined Patent Publication No. 2001-110185, prepares trigger patterns (hereinafter referred to simply as patterns) that are used to determine context domains. When a trigger pattern appears, the context domain to which the pattern belongs determines which of multiple dictionaries should be used for the translation.
As an example, the phrase “major leaguer” can be a trigger pattern for sports. When the trigger pattern is encountered in the initial sentence, “He is a major leaguer.”, that pattern indicates that the context is sports-related. Priority is then given to an available sports dictionary, so that in the second sentence “this season” is translated appropriately as “kon shiizun”.
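The trigger-pattern switching described above can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the cited publication: the names `BASE_DICTIONARY`, `DOMAIN_DICTIONARIES`, `TRIGGER_PATTERNS`, and `select_translations` are hypothetical, the dictionaries hold only the entries used in the example sentences, and patterns are matched by simple substring search.

```python
# Hypothetical toy data taken from the example sentences in the text.
BASE_DICTIONARY = {"this season": "kono kisetsu"}
DOMAIN_DICTIONARIES = {"sports": {"this season": "kon shiizun"}}
TRIGGER_PATTERNS = {"major leaguer": "sports"}


def select_translations(sentences, phrase):
    """For each sentence, return the translation chosen for `phrase`,
    or None when the phrase does not occur in that sentence."""
    active_domain = None  # no domain dictionary has priority at the start
    results = []
    for sentence in sentences:
        lowered = sentence.lower()
        # A trigger pattern switches the context domain from here onward.
        for pattern, domain in TRIGGER_PATTERNS.items():
            if pattern in lowered:
                active_domain = domain
        if phrase not in lowered:
            results.append(None)
            continue
        # The active domain dictionary takes priority over the base dictionary.
        domain_dictionary = DOMAIN_DICTIONARIES.get(active_domain, {})
        results.append(domain_dictionary.get(phrase, BASE_DICTIONARY.get(phrase)))
    return results


if __name__ == "__main__":
    print(select_translations(
        ["He is a major leaguer.", "He hit fifty home runs this season."],
        "this season",
    ))
```

In this sketch, a sentence that contains “this season” before any trigger pattern has been seen is still translated with the base dictionary, because no domain has been selected at that point.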
However, a problem with this method is that a context-specific dictionary cannot be selected until a context-specific pattern is detected. A translation initiated with a base dictionary may be inappropriate for the actual context until a trigger pattern is encountered and a switch is made to a context-specific dictionary.
The recent spread of networking technology, especially the Internet, has made it easy for people to access network resources from which a broad spectrum of data may be retrieved. However, since network records can appear in various languages, a user often has difficulty obtaining usable information from a network record written in an unfamiliar language. Therefore, it is desirable to improve machine translation of resources on networks.
A conventional method is available that provides for the switching of multiple electronic dictionaries for the machine translation of network resources. A URL (Uniform Resource Locator), representing an address at which a resource is located on a TCP/IP network, is registered in association with a specific electronic dictionary that a user determines to be appropriate for the subject matter at the URL. When a request for translation of the subject matter is made, the associated electronic dictionary is used. This method can cope with sentences in which trigger patterns have not yet appeared, and with cases in which no pattern appears within a predetermined interval.
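The URL-based association can be pictured as a simple lookup table, sketched below in Python. This is an illustration under stated assumptions, not the conventional method's actual implementation: the names `url_registry`, `register_dictionary`, and `dictionary_for` are hypothetical, the URLs are placeholders, and URLs are compared by exact string match.

```python
# Hypothetical registry mapping a URL to the name of the dictionary
# that the user has chosen for the subject matter at that URL.
url_registry = {}


def register_dictionary(url, dictionary_name):
    """Record the user's manual association between a URL and a dictionary."""
    url_registry[url] = dictionary_name


def dictionary_for(url, default="base"):
    """Return the registered dictionary for the URL, or the base dictionary."""
    return url_registry.get(url, default)


if __name__ == "__main__":
    # Placeholder URL; the association must be entered by hand.
    register_dictionary("http://example.com/baseball/", "sports")
    print(dictionary_for("http://example.com/baseball/"))
    print(dictionary_for("http://example.com/weather/"))
```

Because the dictionary is chosen from the URL before any text is read, the first sentence of the resource can already be translated with the registered dictionary. The association remains whatever the user last entered, regardless of what the page currently contains.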
However, since a user must manually register an association between a particular electronic dictionary and the URL, the above method requires a great deal of user effort. Further, if the subject matter at the registered URL changes, continued use of the initially associated electronic dictionary may result in poor-quality translations until the association between the URL and the appropriate electronic dictionary is manually corrected.
It has been determined that resources available on a network can be sorted into the following four types:
(1) a “no domain type” having a topic not related to a specific domain and no patterns that match patterns registered in a domain dictionary;
(2) a “multiplex domain type” having multiple topics coexisting on the same page;
(3) a “domain change type” having a topic which can change as the subject matter at the URL is updated; and
(4) a “specific domain type” having contents related to a topic specific to a particular domain.
When a domain dictionary is applied to a “specific domain type” resource, an appropriate translation can be obtained. It should be noted that when a domain dictionary is applied to a resource other than a “specific domain type”, inappropriate translation results will be obtained.