In recent years, a wide variety of data has been increasingly digitized, ranging from information on contents such as TV programs and books, to information on landmarks such as tourist spots and restaurants, to the reputations and stocks of commercial products. With this digitization, there is a growing need for information search tools that help a user find, among a large amount of digitized information, the information that interests the user.
The most common method for finding information of interest in a large amount of information is the keyword search method. In the keyword search method, a user inputs, as a search keyword, a word indicating the user's interest. A system then searches for information associated with the inputted search keyword, using indices generated beforehand based on degrees of association between keywords and documents, and presents the search result to the user.
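The keyword search flow described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the degree of association is taken to be a simple relative term frequency, and the names `build_index`, `search`, and the sample `documents` are hypothetical and do not come from any referenced system.

```python
from collections import defaultdict

def build_index(documents):
    """Build the index beforehand: word -> {doc_id: degree of association}.

    Assumption: the degree of association is the relative frequency of the
    word in the document. Real systems typically use more elaborate weights.
    """
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        words = text.lower().split()
        for word in set(words):
            index[word][doc_id] = words.count(word) / len(words)
    return index

def search(index, keyword):
    """Return document ids ranked by descending association with the keyword."""
    scores = index.get(keyword.lower(), {})
    # Ties are broken by document id so the ranking is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical sample data.
documents = {
    "d1": "quiet restaurant near the station",
    "d2": "restaurant guide restaurant reviews",
    "d3": "tourist spots and landmarks",
}
index = build_index(documents)
ranking = search(index, "restaurant")  # d2 ranks above d1
```

A keyword that appears in no document, such as "museum" here, yields an empty result, which reflects how this method depends on the user choosing a keyword that matches the indexed information.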
The keyword search method is effective when the user's interest is clear and can be clearly expressed with a search keyword. However, when the interest is vague, or when the user cannot think of an appropriate search keyword to express it, the inputted search keyword does not match the user's interest. This gives rise to the problem that the information the user really wants is not ranked high in the search result.
To cope with this problem, there are methods in which the information to be searched is divided into clusters, and keywords and indices that represent the information included in each cluster are presented to the user so that the user is informed of the contents of each cluster. One example of such an information search support method, which narrows down information while giving clues to a user with a vague search purpose, is the Scatter/Gather method (Non-patent Reference 1). In the Scatter/Gather method, when the user selects a cluster of interest, a system first gathers the information, such as documents and contents, included in the selected cluster, then performs clustering again on the gathered information, and presents the result of the clustering to the user. Recursively repeating this process narrows down the search targets and gradually clarifies the user's vague interest. Consequently, the user can easily find the information of interest.
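The scatter, select, gather, and re-cluster cycle described above can be sketched as follows. This is a simplified illustration and not the algorithm of Non-patent Reference 1: the clustering step is a naive seed-based grouping by Jaccard similarity of word sets, chosen only to keep the example short and deterministic, and all function names and sample documents are hypothetical.

```python
from collections import Counter

def jaccard(a, b):
    """Jaccard similarity between two word sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def scatter(docs, k=2):
    """Divide docs (id -> text) into at most k clusters of document ids."""
    ids = sorted(docs)
    if not ids:
        return []
    words = {d: set(docs[d].lower().split()) for d in ids}
    # Pick seeds: the first document, then repeatedly the document that is
    # least similar to the seeds chosen so far.
    seeds = [ids[0]]
    while len(seeds) < min(k, len(ids)):
        rest = [d for d in ids if d not in seeds]
        seeds.append(min(
            rest,
            key=lambda d: max(jaccard(words[d], words[s]) for s in seeds),
        ))
    # Assign every document to its most similar seed.
    clusters = {s: [] for s in seeds}
    for d in ids:
        best = max(seeds, key=lambda s: jaccard(words[d], words[s]))
        clusters[best].append(d)
    return list(clusters.values())

def summarize(docs, cluster, n=3):
    """Keywords presented to the user: the n most frequent words in a cluster."""
    counts = Counter(w for d in cluster for w in docs[d].lower().split())
    return [w for w, _ in counts.most_common(n)]

def gather(docs, selected_clusters):
    """Collect the documents of the clusters the user selected."""
    return {d: docs[d] for cluster in selected_clusters for d in cluster}

# Hypothetical sample data.
docs = {
    "a": "italian restaurant pasta",
    "b": "italian restaurant pizza",
    "c": "sushi restaurant tokyo",
    "d": "stock price chart",
    "e": "stock market news",
}
first = scatter(docs, k=2)           # restaurant documents vs. stock documents
narrowed = gather(docs, [first[0]])  # the user selects the first cluster
second = scatter(narrowed, k=2)      # re-clustering refines the selection
```

In this sketch, the documents in the unselected cluster are discarded by `gather`, so each round shrinks the set of search targets.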
However, although the user selects a cluster using the keywords and index that represent the cluster as clues, it is difficult to know all the information included in the cluster from the keywords and the index alone. Thus, a problem of “omission” arises when the user selects a cluster: information that matches the user's interest but is included in an unselected cluster is dropped from the search targets.
Patent Reference 1 discloses a technique that addresses this problem. Patent Reference 1 considers that assigning each piece of information to be searched to a single cluster causes the aforementioned problem, and attempts to solve it by calculating a degree of belonging of the information to be searched with respect to each of the clusters, presenting the result of the calculation as a bar graph or the like, and thereby suggesting other clusters to be selected.
Non-patent Reference 1: Scatter/Gather: A cluster-based approach to browsing large document collections. In Proceedings of SIGIR'92 (pp. 318-329), 1992
Patent Reference 1: Japanese Unexamined Patent Application Publication No. 2003-345810