FIG. 9 is an explanatory illustration showing a specific example of document clustering processing executed by a multipurpose active metric learning device. In this example, document clustering processing is performed on a vast amount of text data acquired by a crawler from various kinds of websites (mainly blogs, bulletin boards, and the like). FIG. 9 shows part of a hierarchical clustering result acquired from this data, covering only a small fraction of the great number of words contained in the document data. Based on data that is hierarchically clustered in this manner, data used for a search engine and the like can be built, for example.
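As a rough illustration of the kind of hierarchical clustering referred to above, the following minimal sketch groups words by their occurrence counts across documents. It is not the processing shown in FIG. 9: the single-linkage merge rule, the Euclidean distance, the function name `single_linkage`, and the toy word counts are all assumptions chosen only to keep the example small.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def single_linkage(vectors, distance=euclidean):
    """Agglomerative clustering with single linkage (an assumed merge rule).

    Returns the merge history as a list of (cluster_a, cluster_b, distance)
    tuples, where each cluster is a tuple of item indices.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(distance(vectors[p], vectors[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical toy data: one row per word, one column per document,
# each entry being how often the word occurs in that document.
words = ["soccer", "football", "recipe", "cooking"]
counts = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 6, 5],
    [0, 1, 5, 6],
]

history = single_linkage(counts)
for a, b, d in history:
    print([words[i] for i in a], "+", [words[i] for i in b], round(d, 3))
```

The merge history read from first to last gives the hierarchy from the leaves upward. Passing a different `distance` function is where a learned metric would enter such a sketch.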
The active metric learning device used for such document clustering processing comprises a user feedback input device, a metric optimization unit, a data analyzing device to which the optimized metric is applied, and the like, and it learns the metric based on information from the user. When the user performs data analysis such as document clustering, an analysis is first executed by using a metric that has not been optimized.
The user refers to the result and provides feedback on it to the device. The device then converts the feedback into a form that can be handled by the metric optimization unit and executes metric learning. In doing so, the device presents the user with important information that influences the result of the metric learning, which increases the efficiency of both creating the feedback and learning the metric. Specifically, when a Mahalanobis matrix is optimized, the feedback used includes feedback regarding the Mahalanobis distance (e.g., making the distance between data items farther or closer) and direct feedback regarding matrix elements (importance of attributes, relevancy between attributes).
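To make the role of the matrix elements concrete, the following minimal sketch computes the Mahalanobis distance sqrt((x - y)^T A (x - y)) under three matrices A. It only illustrates how the distance reacts to the matrix and does not perform any optimization. The function name `mahalanobis`, the two-attribute vectors, and the specific matrix values are assumptions made for illustration, not values taken from the device described here.

```python
import math


def mahalanobis(x, y, A):
    """Distance sqrt((x - y)^T A (x - y)).

    A is assumed to be a symmetric positive semidefinite matrix,
    given as a list of rows.
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    # Quadratic form d^T A d.
    q = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(q, 0.0))


# Two hypothetical documents described by two attributes (e.g. word counts).
x = [1.0, 0.0]
y = [0.0, 1.0]

# Metric that is not optimized: the identity matrix gives Euclidean distance.
identity = [[1.0, 0.0],
            [0.0, 1.0]]
d_before = mahalanobis(x, y, identity)

# Assumed feedback "attribute 0 is more important": larger diagonal element.
weighted = [[4.0, 0.0],
            [0.0, 1.0]]
d_weighted = mahalanobis(x, y, weighted)

# Assumed feedback "attributes 0 and 1 are related": positive off-diagonal
# elements, which pull documents differing only in those attributes closer.
related = [[1.0, 0.5],
           [0.5, 1.0]]
d_related = mahalanobis(x, y, related)

print(d_before, d_weighted, d_related)
```

In this sketch the diagonal elements play the part of attribute importance and the off-diagonal elements the part of relevancy between attributes, so raising a diagonal element pushes the two documents farther apart while a positive off-diagonal element brings them closer.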
In relation to this, Patent Document 1 discloses a method which structures attributes in clustering processing. Patent Document 2 discloses a technique which searches data by using a result acquired by calculating the distance between target feature vectors. Patent Document 3 discloses an example of a technique which specifically uses the calculation of the Mahalanobis distance. As will be described later, Non-Patent Documents 1 and 2 disclose specific methods of clustering, such as Information-Theoretic Co-clustering, Bregman Clustering, and spectral clustering.
Patent Document 1: Japanese Unexamined Patent Publication 2005-235099
Patent Document 2: Japanese Unexamined Patent Publication 2006-031460
Patent Document 3: Japanese Unexamined Patent Publication 2008-185399
Non-Patent Document 1: I. S. Dhillon, S. Mallela, and D. S. Modha, "Information-Theoretic Co-clustering," Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 89-98, August 2003.
Non-Patent Document 2: A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh, "Clustering with Bregman Divergences," The Journal of Machine Learning Research, Vol. 6, December 2005.