1) Field of the Invention
The present invention relates to a system for and a method of analyzing the dependency structure of each word after performing a word dividing process on a Chinese sentence, and a computer program product performing the method.
2) Description of the Related Art
In a machine translation process from the Chinese language into another language (the Japanese language, for example), an input Chinese sentence is divided into words through morphological analysis, and the dependency destination and the dependent(s) of each of the words are analyzed.
Many Chinese words are made up of two characters. Among those two-character words, there are words that have only a weak link between the morphemes, so that another component (an insertion component) can be inserted between the morphemes. Such a word, whose morphemes can either be linked directly or be separated by an insertion component, is called a lihe-word.
In the list shown in FIG. 1, for example, a word C1 is a Chinese verb meaning “to take a walk”. To form a phrase “to take a walk for a while”, a modifier is inserted between a word C3 and a word C4, as shown by a phrase C2. In this case, each of the words C3 and C4 is an independent word. However, a combination of the words C3 and C4 does not have the meaning “to take a walk”. Therefore, in the phrase C2, the word C1 should be regarded as one word.
The existence of lihe-words makes the analysis of Chinese sentences difficult in Chinese machine translation. To counter this problem, insertion words that can be inserted between the head element and the last element of each lihe-word are listed in advance. In the morpheme analysis carried out on an input Chinese sentence, a dictionary is referred to, and it is determined for each morpheme whether the morpheme forms a lihe-word. For a morpheme that forms a lihe-word, processing such as a dependency structure analysis and a meaning analysis is then performed in this order, with each word unit being a word containing two or more characters (see “Lihe-word Processing in Chinese-to-Japanese Machine Translation” IPSJ Journal, Vol. 35, No. 9).
However, there are various kinds of insertion words that can be inserted between the head element and the last element of a lihe-word in the Chinese sentence. Therefore, it is very difficult to list all the insertion words in advance.
Even if all the insertion words could be listed, there would be so many of them that searching the list for a desired insertion word in the morpheme analyzing process would become complicated.
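The conventional list-based approach described above, and its limitation, can be illustrated with a minimal Python sketch. All names and data here are assumptions made for illustration only: the lihe-word 散步 ("to take a walk") is used as an example and is not taken from FIG. 1, and the dictionary `LIHE_WORDS`, the list `INSERTION_WORDS`, and the function `find_lihe_words` are hypothetical and do not reproduce the cited system.

```python
# Hypothetical dictionary: (head morpheme, last morpheme) -> lihe-word.
LIHE_WORDS = {("散", "步"): "散步"}

# Hypothetical list of insertion words prepared in advance.
INSERTION_WORDS = {"了", "一会儿", "了一会儿"}


def find_lihe_words(morphemes):
    """Return (head_index, last_index, word) for each lihe-word detected.

    A head/last pair counts as a lihe-word only if nothing stands between
    the two morphemes or if the inserted string appears in INSERTION_WORDS.
    """
    found = []
    for i, head in enumerate(morphemes):
        for j in range(i + 1, len(morphemes)):
            word = LIHE_WORDS.get((head, morphemes[j]))
            if word is None:
                continue
            inserted = "".join(morphemes[i + 1:j])
            if inserted == "" or inserted in INSERTION_WORDS:
                found.append((i, j, word))
    return found


# A listed insertion word is recognized ...
print(find_lihe_words(["我", "散", "一会儿", "步"]))
# ... but an unlisted one is missed.
print(find_lihe_words(["我", "散", "很久", "步"]))
```

In this sketch the second sentence is not recognized as containing a lihe-word simply because its insertion word is absent from the list, which is the coverage problem noted above; enlarging the list to cover such cases increases the lookup burden during morpheme analysis.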