1. Field of the Invention
The invention relates generally to a method for constructing a Chinese dictionary and apparatus and storage media using the same, and more particularly, to a method for constructing a Chinese dictionary and apparatus and storage media using the same, wherein a Chinese collocation is provided for a Chinese term according to a probability for nominalization of the Chinese term.
2. Description of the Related Art
With the increasing prevalence of the internet, one can obtain desired information not only from traditional books but also from the internet. For many Chinese learners, on-line Chinese dictionaries and electronic dictionaries have become important tools for learning Chinese. In light of this, constructing an on-line Chinese dictionary or an electronic dictionary that provides complete teaching functions has become an important issue.
For a querying operation of an on-line Chinese dictionary accessed via the internet or of an electronic dictionary, a user typically enters a Chinese term, and in response, the Chinese dictionary lists, in addition to the definition (assumed henceforth), the possible parts-of-speech and corresponding collocations of the queried Chinese term for reference and learning. As an example, when a user queries the Chinese term “xue xi”, the Chinese dictionary will list all the possible parts-of-speech, such as verb, noun, adjective and so on, and the corresponding collocations of the queried Chinese term “xue xi”. For each part-of-speech, Chinese example sentences including the queried Chinese term and the corresponding collocations would be listed. In the case of the queried Chinese term “xue xi” being used as a verb, a conventional Chinese dictionary would list “ta “xue xi” zhong wen” as a Chinese example sentence. Here, the corresponding collocations would include a pre-term subject “ta” and a post-term object “zhong wen”, respectively inserted preceding and following the queried Chinese term “xue xi”. A similar process would be performed for listing a Chinese example sentence using “xue xi” as a noun.
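The querying operation described above can be sketched as a lookup from a term to its usages, where each usage carries a part-of-speech and its pre-term and post-term collocations. This is only a minimal illustration under stated assumptions: the names `Usage`, `DICTIONARY` and `query` are hypothetical and do not come from any existing dictionary product, and the entries are limited to the “xue xi” collocations mentioned in this description.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """One usage of a term: a part-of-speech plus its collocations (hypothetical model)."""
    part_of_speech: str
    pre_term: str   # collocation inserted before the queried term ("" if none)
    post_term: str  # collocation inserted after the queried term ("" if none)

    def example_sentence(self, term: str) -> str:
        # Join the non-empty parts in order: pre-term, queried term, post-term.
        parts = (self.pre_term, term, self.post_term)
        return " ".join(p for p in parts if p)


# Hypothetical dictionary content, restricted to the collocations named in the text.
DICTIONARY = {
    "xue xi": [
        Usage("verb", pre_term="ta", post_term="zhong wen"),
        Usage("noun", pre_term="", post_term="huan jing"),
    ],
}


def query(term: str) -> list[tuple[str, str]]:
    """Return (part-of-speech, example sentence) pairs for the queried term."""
    return [(u.part_of_speech, u.example_sentence(term))
            for u in DICTIONARY.get(term, [])]


if __name__ == "__main__":
    for pos, sentence in query("xue xi"):
        print(f"{pos}: {sentence}")
```

Keeping the part-of-speech on each usage, rather than on the term as a whole, reflects the point of the passage: the same term may take different collocations depending on the part-of-speech in which it is used.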
One method for constructing a Chinese dictionary is provided by the Chinese Word Sketch Engine disclosed by the Academia Sinica of Taiwan. The Chinese Word Sketch Engine determines Chinese collocations according to English grammar, and constructs a Chinese dictionary based on the Chinese collocations. However, the Chinese Word Sketch Engine does not take into account part-of-speech differences between English and Chinese. As such, erroneous determinations may be provided. The following are three Chinese example sentences provided by the Chinese Word Sketch Engine in response to a query of the Chinese term “xue xi”:
TABLE 1
Chinese Word Sketch Engine of the Academia Sinica of Taiwan

xue xi (VC) + Objective    Chinese Example Sentence
huan jing                  . . . rang xue sheng jin bu de ying yu xue xi “huan jing”
ma lie zhu yi              . . . ta zhi chu, jun dui yao ren zhen xue xi “ma lie zhu yi”
zhong wen                  . . . wo zheng zai xue xi “zhong wen”
As shown in Table 1, only two of the three example sentences are appropriate examples of the Chinese term “xue xi” being used as a verb. For the second and third example sentences, the Chinese Word Sketch Engine provided the appropriate post-term Chinese collocations “ma lie zhu yi” and “zhong wen”, respectively. For the first example sentence, however, an erroneous part-of-speech determination of the Chinese term “xue xi” occurred. In the first example sentence, the Chinese term “xue xi” followed by the post-term Chinese collocation “huan jing” should be treated as a noun and not as a verb, even though the post-term “huan jing” is a noun. This is a case of nominalization of the Chinese term “xue xi”, and the erroneous part-of-speech determination is due to the lack of nominalization determination in the Chinese Word Sketch Engine.
Another known method for constructing a Chinese dictionary utilizes the Smadja Xtract system, which determines Chinese collocations according to statistics. However, the system likewise provides no nominalization determination, and thus erroneous Chinese example sentences may be provided.