1. Field of the Invention
The present invention relates to an information processor, a method of processing information, and a program. In particular, it relates to an information processor, a method of processing information, and a program suited to carrying out statistical natural language processing, such as a synonym analysis, a polysemy analysis, a relevance analysis between two nouns, and a modality analysis of a word, based on context information in a document (for example, a proper noun and a predetermined number of words existing before and after it).
2. Description of the Related Art
Attempts to acquire knowledge by statistically analyzing a large number of documents (that is, by carrying out statistical natural language processing) have been widely made. For example, in a specialized field for which no thesaurus has been developed, a thesaurus for that field can be created automatically by applying statistical natural language processing to documents in the field. Knowledge acquired in this manner can be utilized in, for example, an application program for information retrieval.
In statistical natural language processing, a characteristic amount of context information is frequently utilized. Here, context information means a word group containing a focus word in a document and a predetermined number of words existing before and after it. By calculating the similarity between characteristic amounts of context information, the focus word is subjected to a synonym analysis, a polysemy analysis, a relevance analysis between two nouns, a modality analysis of the word, and the like. For example, in “Discovering Relations among Named Entities from Large Corpora”, by Takaaki Hasegawa, Satoshi Sekine and Ralph Grishman, in Proceedings of the Conference of the Association for Computational Linguistics 2004, a characteristic amount of context information is utilized for a synonym analysis of the relevance between proper nouns.
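As an illustration only, the use of context information described above might be sketched as follows. The sketch assumes that the characteristic amount is a simple count of the words surrounding the focus word and that similarity is measured by the cosine of two such count vectors; neither choice is stated in the related art discussed here, and the function names (context_windows, feature_vector, cosine_similarity), the window width, and the sample sentence are hypothetical. The focus word itself is left out of the counts, which is likewise an assumption.

```python
from collections import Counter
from math import sqrt


def context_windows(tokens, focus, width):
    """For each occurrence of `focus`, collect up to `width` words before and after it."""
    windows = []
    for i, tok in enumerate(tokens):
        if tok == focus:
            before = tokens[max(0, i - width):i]
            after = tokens[i + 1:i + 1 + width]
            windows.append(before + after)
    return windows


def feature_vector(tokens, focus, width):
    """Assumed characteristic amount: counts of the words seen around `focus`."""
    counts = Counter()
    for window in context_windows(tokens, focus, width):
        counts.update(window)
    return counts


def cosine_similarity(a, b):
    """Cosine similarity of two count vectors; 0.0 if either vector is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample text; a real analysis would use a large document collection.
tokens = "the doctor treated the patient while the physician treated the child".split()
doctor_vec = feature_vector(tokens, "doctor", 2)
physician_vec = feature_vector(tokens, "physician", 2)
similarity = cosine_similarity(doctor_vec, physician_vec)
```

In this sketch, two focus words that appear among similar surrounding words receive a similarity close to 1, which is the kind of signal a synonym analysis could draw on.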