In general, when a plurality of different expressions (words) exist for the same notion, the expressions are called orthographical variants. If orthographical variants exist in a document, terms having the same notion may not be properly extracted when a user searches the document, extracts a specific term from the document, or the like.
Various techniques relating to orthographical variants are known. For example, there is a known method in which a dictionary is created in advance by selecting, from a target document, character strings considered to be orthographical variant candidates, and character strings of the orthographical variant candidates are then detected based on this dictionary.
However, in this method, the orthographical variant candidates must be manually selected in advance in order to create the dictionary, which disadvantageously degrades efficiency.
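For illustration only, the known dictionary-based method described above may be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular prior-art document: the dictionary contents, the names VARIANT_DICTIONARY, build_lookup, and detect_variants, and the use of case-insensitive whole-word matching are all hypothetical choices made for the example.

```python
import re

# Hypothetical dictionary, assumed to have been created manually in advance:
# each canonical term maps to character strings selected as its variants.
VARIANT_DICTIONARY = {
    "database": ["data base", "data-base"],
    "email": ["e-mail", "electronic mail"],
}


def build_lookup(dictionary):
    """Map every lower-cased surface form (canonical or variant) to its canonical term."""
    lookup = {}
    for canonical, variants in dictionary.items():
        lookup[canonical.lower()] = canonical
        for variant in variants:
            lookup[variant.lower()] = canonical
    return lookup


def detect_variants(text, dictionary):
    """Return (matched string, canonical term, start offset) for each dictionary hit."""
    lookup = build_lookup(dictionary)
    # Try longer surface forms first so that multi-word variants are preferred.
    forms = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (match.group(0), lookup[match.group(0).lower()], match.start())
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Send an e-mail to the data base admin; the email is stored in the Database."
    for surface, canonical, offset in detect_variants(sample, VARIANT_DICTIONARY):
        print(offset, repr(surface), "->", canonical)
```

In this sketch, every entry of VARIANT_DICTIONARY has to be written by hand before detection can run, and a variant that is absent from the dictionary is never detected. This corresponds to the efficiency problem noted above.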