In recent years, with the rapid development of the Internet, a large amount of information has been continuously distributed worldwide through web pages, electronic bulletin boards, and blogs. Because so much information is distributed, the cost to an information user of finding information of interest on the Internet has increased, and an appropriate analysis technique is required.
Services are now offered that provide information related to a variety of keywords, such as a keyword that attracts attention or a keyword that is popular on web pages. For example, one such service, when a certain keyword is attracting attention, suggests a clue for discovering the reason why that keyword attracts attention (see Patent Document 1).
The service disclosed in Patent Document 1 uses a technique of detecting and suggesting information having correlativity with information that the user desires to know. Specifically, in Patent Document 1, a keyword is detected that co-occurs at a high frequency with a keyword attracting attention at a certain point in time and whose occurrence time is close to that of the keyword attracting attention. A co-occurrence graph showing the keyword attracting attention and the detected keyword is then created. By analyzing the co-occurrence graph, the user can discover the reason why the keyword attracting attention attracts attention.
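The kind of detection described above can be illustrated with a minimal sketch. This is not the implementation of Patent Document 1. It is a simplified illustration that assumes each document is a pair of a timestamp and a set of keywords, and the function name `cooccurring_keywords` and the parameters `window` and `min_count` are hypothetical names introduced only for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def cooccurring_keywords(documents, target, at, window, min_count):
    """Return keywords that co-occur with `target` at least `min_count` times
    in documents whose timestamp lies within `window` of the time `at`.

    Assumption: `documents` is an iterable of (timestamp, set_of_keywords).
    """
    counts = Counter()
    for timestamp, keywords in documents:
        if abs(timestamp - at) > window:
            continue  # occurrence time is not close to the designated time
        if target not in keywords:
            continue  # the keyword attracting attention does not appear
        counts.update(k for k in keywords if k != target)
    return {k: n for k, n in counts.items() if n >= min_count}


# Hypothetical sample data, for illustration only.
documents = [
    (datetime(2024, 1, 1, 9), {"tobacco", "health", "tax"}),
    (datetime(2024, 1, 1, 10), {"tobacco", "health"}),
    (datetime(2024, 1, 1, 11), {"tobacco", "weather"}),
    (datetime(2024, 3, 1, 9), {"tobacco", "weather"}),
]

related = cooccurring_keywords(
    documents,
    target="tobacco",
    at=datetime(2024, 1, 1, 10),
    window=timedelta(hours=2),
    min_count=2,
)

# Edges of a simple co-occurrence graph centered on the target keyword.
edges = [("tobacco", keyword) for keyword in sorted(related)]
```

In this sketch, a keyword is kept only when both conditions hold: it co-occurs with the target often enough, and it does so close to the designated point in time. Any keyword that happens to satisfy both conditions by chance would also be kept.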
However, when information having correlativity with information that the user desires to know is detected by using the technique disclosed in Patent Document 1, information that is recognized as having correlativity merely due to a coincidental cause may also be detected.
The reason is as follows. In the technique disclosed in Patent Document 1, the conditions for determining correlativity include not only co-occurrence at a high frequency with a keyword attracting attention at a designated point in time but also whether or not an appearance time is close to the designated point in time. The determination is therefore greatly influenced whenever an appearance time happens to be close to the designated point in time. Consequently, the technique disclosed in Patent Document 1 has a problem in that information recognized as having correlativity due to a coincidental cause cannot be excluded.
For example, on web pages on the Internet, a certain event may by chance cause a linguistic expression, such as a description or an opinion related to an important event, to be recalled and written frequently. In this case, information that intrinsically has no correlativity may be erroneously recognized as having correlativity.
Meanwhile, linguistic expressions having a semantically strong correlation tend to be used in close succession very often. However, it is typically difficult to determine whether a plurality of linguistic expressions appear at times close to each other because of strong correlativity or merely by coincidence.
In this disclosure, a word, including a keyword, as well as a description representing a noun, a topic, an opinion, or an event in a text, is referred to as a “linguistic expression.” The “linguistic expression” may be a character string itself that appears in a text, or a result obtained by analyzing a text by using an existing natural language processing technique such as morphological analysis, syntactic analysis, dependency analysis, or synonym processing.
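As a rough illustration of this definition, a linguistic expression could be modeled as either a single word or a dependency relation between two words. The sketch below is only one possible representation. The type names `Word` and `Dependency`, the field names, and the `render` helper are hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    """A linguistic expression that is a character string from the text."""
    text: str


@dataclass(frozen=True)
class Dependency:
    """A linguistic expression obtained from dependency analysis.

    Assumption: the relation is stored as a (source, target) pair of words.
    """
    source: str
    target: str


LinguisticExpression = Union[Word, Dependency]


def render(expression: LinguisticExpression) -> str:
    """Return a display string for a linguistic expression."""
    if isinstance(expression, Dependency):
        return f"{expression.source}→{expression.target}"
    return expression.text


# Hand-written examples; no parser is run here.
expressions = [Word("tobacco"), Word("health"), Dependency("tobacco", "harmful")]
labels = [render(e) for e in expressions]
```

Because both types are frozen dataclasses, their instances are hashable and can be used as dictionary keys or set members, for example when counting occurrences.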
Specifically, for example, “tobacco” and “health” are linguistic expressions each including one word. A dependency analysis result between words such as “tobacco→harmful” obtained by performing dependency analysis on a text “tobacco is very harmful to health” is also a linguistic expression representing one unitary meaning.
Patent Document 1: Japanese Unexamined Patent Application, First Publication No. 2006-164045