1. Technical Field
The present disclosure relates to an apparatus, a method, and a non-transitory computer-readable recording medium for generating semantic information concerning a word to deal with the meaning of text information in a natural language.
2. Description of the Related Art
Related art techniques generate semantic information for the words that form a text in order to deal with the meaning of text information in a natural language. These techniques are disclosed in Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, “Efficient Estimation of Word Representations in Vector Space”, ICLR 2013, and Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean, “Distributed Representations of Words and Phrases and their Compositionality”, NIPS 2013. The related art techniques learn a multi-dimensional vector to be assigned to each word contained in a large text data set (hereinafter referred to as a text corpus), and then output an association between each word and the multi-dimensional vector (semantic information) corresponding to that word.
The semantic information generated in the related art techniques may be used to determine whether words are similar in meaning.
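As an illustration of how such semantic information may be used, the following minimal sketch compares two word vectors by cosine similarity. The vectors, the `semantic_info` table, the `are_similar` helper, and the threshold of 0.8 are hypothetical values chosen for the example; cosine similarity is a commonly used measure and is assumed here, not taken from the cited references.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as dissimilar
    return dot / (norm_u * norm_v)

# Hypothetical association between words and multi-dimensional vectors
# (toy three-dimensional values, not learned from any text corpus).
semantic_info = {
    "good": [0.9, 0.1, 0.3],
    "great": [0.8, 0.2, 0.35],
    "table": [-0.2, 0.9, -0.4],
}

def are_similar(word_a, word_b, threshold=0.8):
    """Judge two words similar when their vectors' cosine meets the threshold (assumed value)."""
    return cosine_similarity(semantic_info[word_a], semantic_info[word_b]) >= threshold
```

With these toy values, `are_similar("good", "great")` evaluates to true and `are_similar("good", "table")` evaluates to false. In practice the vectors would be learned from a text corpus and would have far more dimensions.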
In the related art techniques, however, the semantic information assigned to a given word is similar to the semantic information assigned to another word that needs to be differentiated from the given word. There is therefore still room for improvement in determining whether words are similar in meaning.