In recent years, the amount of text data has been increasing explosively, and text search has become correspondingly more important. In particular, research on semantic processing for application software such as secretarial function applications has become active, and searching the semantic structure of a natural sentence is becoming increasingly important.
Lexical analysis, morphological analysis, semantic analysis, and the like are used to analyze a natural sentence used in text search. Lexical analysis is a process for dividing a character string into words, and morphological analysis is a process for dividing a character string into morphemes and assigning information such as a part of speech or an attribute to the respective morphemes. The morphemes obtained by morphological analysis may be treated as words.
Semantic analysis is a process for obtaining a semantic structure of a natural sentence by using a morphological analysis result of the natural sentence. By using the semantic structure obtained as the semantic analysis result, the meaning of the natural sentence can be expressed as data that a computer can handle.
The semantic structure includes a plurality of semantic codes that respectively indicate the meanings of a plurality of words included in the morphological analysis result, and information indicating a connection relationship between two semantic codes. One semantic code may correspond to a plurality of words. The semantic structure can be expressed, for example, by a directed graph that is formed by a plurality of nodes indicating a plurality of semantic codes and arcs that each indicate a connection relationship between two nodes. A minimum partial structure of the semantic structure is referred to as a semantic minimum unit, and is formed by two nodes and an arc between the two nodes.
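The directed-graph representation described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class names, the semantic codes ("EAT", "TARO", "APPLE"), and the relation labels on the arcs ("AGENT", "OBJECT") are all hypothetical, and labeling each arc with a relation name is an assumption made here for readability.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    """A directed arc between two nodes (hypothetical representation)."""
    source: str    # semantic code of the start node
    relation: str  # assumed label for the connection relationship
    target: str    # semantic code of the end node


@dataclass
class SemanticStructure:
    """A directed graph whose nodes are semantic codes."""
    nodes: set = field(default_factory=set)
    arcs: set = field(default_factory=set)

    def add_arc(self, source, relation, target):
        # Adding an arc also registers both of its end nodes.
        self.nodes.update((source, target))
        self.arcs.add(Arc(source, relation, target))

    def minimum_units(self):
        # Each semantic minimum unit is two nodes plus the arc between them.
        return sorted((a.source, a.relation, a.target) for a in self.arcs)


# Hypothetical semantic structure for a sentence such as "Taro eats an apple".
structure = SemanticStructure()
structure.add_arc("EAT", "AGENT", "TARO")
structure.add_arc("EAT", "OBJECT", "APPLE")

for unit in structure.minimum_units():
    print(unit)
# ('EAT', 'AGENT', 'TARO')
# ('EAT', 'OBJECT', 'APPLE')
```

In this sketch, a structure with three nodes and two arcs decomposes into two semantic minimum units, each of which can be handled independently of the rest of the graph.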
Semantic structure search, in which a plurality of documents are searched by using the semantic structure of a search request expressed as a natural sentence, can be realized by performing morphological analysis and semantic analysis on the text data included in the plurality of documents.
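One way such a search might be organized is sketched below. The matching criterion is an assumption made for illustration: each document and the search request are assumed to have already been reduced to a set of semantic minimum units, written here as hypothetical (source code, relation, target code) triples, and documents are ranked by how many units they share with the request. The function name, document identifiers, and semantic codes are likewise hypothetical.

```python
def search(query_units, document_units):
    """Rank documents by the number of semantic minimum units shared with the query.

    query_units:    set of (source, relation, target) triples for the search request
    document_units: dict mapping a document id to its set of triples
    """
    results = []
    for doc_id, units in document_units.items():
        shared = query_units & units  # minimum units common to query and document
        if shared:
            results.append((doc_id, len(shared)))
    # Highest overlap first; ties are broken by document id for a stable order.
    return sorted(results, key=lambda r: (-r[1], r[0]))


# Hypothetical analysis results for three documents.
documents = {
    "doc1": {("EAT", "AGENT", "TARO"), ("EAT", "OBJECT", "APPLE")},
    "doc2": {("BUY", "AGENT", "TARO"), ("BUY", "OBJECT", "APPLE")},
    "doc3": {("EAT", "OBJECT", "APPLE")},
}

# Hypothetical semantic structure of the search request.
query = {("EAT", "AGENT", "TARO"), ("EAT", "OBJECT", "APPLE")}

print(search(query, documents))
# [('doc1', 2), ('doc3', 1)]
```

In this toy data, doc2 contains the same codes TARO and APPLE as the request but connects them to a different code, so it shares no minimum unit with the request and is not returned. A search that compared only individual words would not make that distinction.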
Patent Document 1: Japanese Laid-open Patent Publication No. 2013-186766
Patent Document 2: Japanese Laid-open Patent Publication No. 2010-93414
Patent Document 3: International Publication Pamphlet No. 2012/111078
Patent Document 4: International Publication Pamphlet No. 2011/148511
Patent Document 5: Japanese Laid-open Patent Publication No. 2012-22599
Patent Document 6: Japanese Laid-open Patent Publication No. 2012-150586