In recent years, relationship mining has become important as a technology for finding useful knowledge in a large amount of document data. Relationship mining requires calculating the similarity or distance between the subjects of mining. Examples of the subject of relationship mining include a relationship between keywords present in a document, a relationship between intrinsic representations such as the names of persons and the names of organisms, a relationship between a document and a keyword, and a relationship between documents. Hereinafter, the subject of relationship mining is referred to generically as a mining target.
For example, patent literature 1 discloses a technology that takes the names of persons as mining targets, uses a web search engine to calculate, as a co-occurrence degree, the rate at which the names of persons appear in the same document, and visualizes the human relationships as a weighted network. Patent literature 2 discloses a technology that takes documents as mining targets and uses a cosine similarity to search for a document close to the preference of a user.
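The two measures mentioned above can be illustrated with a minimal sketch. It is not the method of patent literature 1 or 2: the co-occurrence degree is assumed here to be a Jaccard-style ratio computed over a small in-memory document list rather than web search results, and the cosine similarity is assumed to operate on raw term-frequency vectors of whitespace-separated words. The function names `cooccurrence_degree` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_degree(name_a, name_b, documents):
    """Assumed definition: documents containing both names divided by
    documents containing at least one of them (Jaccard-style ratio)."""
    has_a = {i for i, doc in enumerate(documents) if name_a in doc}
    has_b = {i for i, doc in enumerate(documents) if name_b in doc}
    either = has_a | has_b
    if not either:
        return 0.0  # neither name appears anywhere
    return len(has_a & has_b) / len(either)


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between term-frequency vectors of two documents."""
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())
    # Only terms shared by both documents contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction
    return dot / (norm_a * norm_b)
```

In a weighted network of the kind described, each person name would be a node and the co-occurrence degree between two names would be the weight of the edge connecting them.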
Patent literature 3 discloses a technology that, when calculating the similarity between words, uses the length of a backward-matching character string as the similarity of lexical information. Patent literature 3 also determines the similarity by calculating a linear sum of plural similarities, such as a first similarity based on the co-occurrence degree in a dependency relationship between words and a second similarity based on the consistency of the semantic categories to which the words belong.
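As a rough illustration of the two ideas above, the following sketch computes the length of the common trailing character string of two words and combines several similarity values by a weighted linear sum. The function names `backward_match_length` and `combined_similarity` and the weights are hypothetical. The source does not state how patent literature 3 normalizes or weights the individual similarities, so this should not be read as its actual formula.

```python
def backward_match_length(word_a, word_b):
    """Number of characters the two words share, counted from their ends."""
    length = 0
    for char_a, char_b in zip(reversed(word_a), reversed(word_b)):
        if char_a != char_b:
            break
        length += 1
    return length


def combined_similarity(similarities, weights):
    """Weighted linear sum of individual similarity values.

    The weights are illustrative assumptions, not values from the source.
    """
    if len(similarities) != len(weights):
        raise ValueError("similarities and weights must have the same length")
    return sum(w * s for w, s in zip(weights, similarities))
```

For instance, `backward_match_length("mining", "training")` counts the shared ending "ining", and `combined_similarity([first, second], [0.5, 0.5])` would average a co-occurrence-based similarity and a category-based similarity with equal weight.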