In documents such as manuals, papers, and design documents, which are expected to be logically consistent, the same expression should always represent the same meaning, and a different meaning should be represented by a different expression. Accordingly, revising such a document involves determining whether different expressions are used to represent the same meaning and whether the same expression is used to represent different meanings. Different expressions representing the same meaning are called “synonyms”. A single expression representing different meanings is called a “polyseme”.
A procedure to determine whether an expression is a synonym or polyseme is as follows:
1) Each sentence is divided into words by morphological analysis;
2) The resulting words are collated with dedicated dictionaries, such as a synonym/polyseme dictionary, to specify a determination target; and
3) A person checks the determination target in each of the plurality of sentences and determines whether it is a synonym or a polyseme.
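Steps 1) and 2) of the procedure above can be sketched as follows. This is only an illustrative sketch under stated assumptions: a simple regular-expression tokenizer stands in for real morphological analysis, and `SYNONYM_DICT`, `POLYSEME_DICT`, `tokenize`, and `find_targets` are hypothetical names and data that do not come from the source. The output is a list of candidates only; step 3), the final determination, is still left to a person.

```python
import re
from collections import defaultdict

# Hypothetical synonym dictionary: surface expression -> concept label.
SYNONYM_DICT = {
    "display": "SCREEN",
    "screen": "SCREEN",
    "monitor": "SCREEN",
}

# Hypothetical polyseme dictionary: expressions known to carry several meanings.
POLYSEME_DICT = {"monitor", "terminal"}


def tokenize(sentence):
    """Step 1 stand-in: split a sentence into lowercase words.

    A real system would use a morphological analyzer here.
    """
    return re.findall(r"[a-z]+", sentence.lower())


def find_targets(sentences):
    """Step 2: collate words with the dictionaries to pick determination targets."""
    concept_to_words = defaultdict(set)
    polyseme_hits = defaultdict(set)

    for index, sentence in enumerate(sentences):
        for word in tokenize(sentence):
            if word in SYNONYM_DICT:
                concept_to_words[SYNONYM_DICT[word]].add(word)
            if word in POLYSEME_DICT:
                polyseme_hits[word].add(index)

    # Several different expressions for one concept: synonym candidates.
    synonym_candidates = {
        concept: sorted(words)
        for concept, words in concept_to_words.items()
        if len(words) > 1
    }
    # One expression used in several sentences: polyseme candidates,
    # reported with the indices of the sentences to be compared.
    polyseme_candidates = {
        word: sorted(indices)
        for word, indices in polyseme_hits.items()
        if len(indices) > 1
    }
    return synonym_candidates, polyseme_candidates
```

For example, calling `find_targets` on a handful of manual sentences returns the concepts expressed by more than one word and the dictionary-listed polysemes that occur in more than one sentence. A reviewer would then read the flagged sentences to decide each case.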
Moreover, a technique of determining whether a predicate pair or an “argument-predicate pair” is synonymous, antonymous, or irrelevant is disclosed (for example, see Japanese Laid-open Patent Publication No. 2015-28697). In this technique, a determination apparatus extracts at least one of a dictionary definition sentence feature and a semantic attribute feature of each predicate pair that is stored in a learning corpus storage unit and is classified in advance as synonymous, antonymous, or irrelevant. Further, the determination apparatus extracts a feature representing a pair of substrings and a feature representing the likelihood of parallel predicates of the substrings, constructs a feature for classifying the substrings as synonyms or antonyms, and learns a classification model for classifying the substrings as synonyms or antonyms.
Other related techniques are disclosed in, for example, Japanese Laid-open Patent Publication Nos. 2010-102521 and 2012-73951.