1. Field of the Invention
The present invention relates to an information processing apparatus, an information processing method, a program, and a recording medium. More particularly, the invention relates to an information processing apparatus, an information processing method, a program, and a recording medium suitable for analyzing texts in electronic form.
2. Description of the Related Art
Generally, morphological analysis involves dividing text written in natural language into morphemes, the minimal units of linguistic meaning, and attaching information (e.g., part of speech) to each morpheme. This analysis is one of the basic techniques of natural language processing and is widely practiced.
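By way of illustration only, the dictionary-driven division described above can be sketched as follows. The sketch assumes a greedy longest-match strategy over unspaced input, which is a simplification chosen for brevity; practical analyzers commonly use other selection strategies. The dictionary entries, the part-of-speech labels, and the function name `analyze` are hypothetical.

```python
# Toy word dictionary: surface form -> part of speech (hypothetical entries).
DICTIONARY = {
    "information": "noun",
    "processing": "noun",
    "apparatus": "noun",
    "the": "determiner",
    "analyzes": "verb",
    "texts": "noun",
}


def analyze(text, dictionary=DICTIONARY):
    """Split unspaced text into (morpheme, part_of_speech) pairs.

    At each position the longest dictionary word that matches is taken
    (greedy longest match). A character that starts no dictionary word
    is emitted on its own with the label "unknown".
    """
    max_len = max(len(word) for word in dictionary)
    result = []
    i = 0
    while i < len(text):
        # Try the longest possible candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                result.append((candidate, dictionary[candidate]))
                i += length
                break
        else:
            # No dictionary word starts here; emit one unknown character.
            result.append((text[i], "unknown"))
            i += 1
    return result


if __name__ == "__main__":
    for morpheme, pos in analyze("theapparatusanalyzestexts"):
        print(morpheme, pos)
```

In this sketch every output unit is a dictionary word, so the granularity of the result is fixed entirely by what has been registered in the dictionary.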
In traditional morphological analysis, the words registered in the word dictionary served as the morpheme units. Two functions were essentially absent: the function of determining a compound word from the relations among a plurality of morphemes; and the function of segmenting into a plurality of morphemes a word registered in the dictionary as a compound word.
To retrieve a compound word registered in the dictionary in the form of its segmented component words, it was necessary either to register in advance in the dictionary the component units making up that compound word, or to register beforehand the most significant of the words constituting the compound word in question (see, e.g., Japanese Patent Laid-Open No. 2002-259426).
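The advance registration described above can be pictured with the following minimal sketch. It is not taken from the cited publication; the `Entry` structure, its `components` and `head` fields, the sample entries, and the function `segment` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """One dictionary entry (hypothetical layout)."""
    surface: str
    pos: str
    # Component words registered by hand in advance; empty if none were registered.
    components: List[str] = field(default_factory=list)
    # Most significant constituent word, if one was registered in advance.
    head: Optional[str] = None


# Hypothetical dictionary: one compound with pre-registered components,
# one compound registered without any component information.
DICTIONARY = {
    "morphologicalanalysis": Entry(
        "morphologicalanalysis", "noun",
        components=["morphological", "analysis"],
        head="analysis",
    ),
    "wordprocessor": Entry("wordprocessor", "noun"),
}


def segment(surface, dictionary=DICTIONARY):
    """Return the pre-registered components of a compound word.

    If no components were registered for the entry, the compound can
    only be returned whole, as a single unit.
    """
    entry = dictionary[surface]  # raises KeyError for unregistered words
    return list(entry.components) if entry.components else [entry.surface]


if __name__ == "__main__":
    print(segment("morphologicalanalysis"))
    print(segment("wordprocessor"))
```

Under these assumptions, segmentation succeeds only for compounds whose components were entered beforehand, so the registration effort grows with each compound word added to the dictionary.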