1. Field of the Invention
The present invention relates to an information retrieval method, apparatus, and program for retrieving information that matches a user preference from a large body of information, and more specifically to an information retrieval method, apparatus, and program that achieve accurate information retrieval in a short time by applying a clustering method.
2. Description of the Related Art
A technique for music retrieval based on a user preference is disclosed in Patent Document 1 and Non-Patent Document 1. In this technique, acoustic features of music are analyzed based on the music and on preference information (samples of preferred music) input by a user as a query, and music that matches the user preference is retrieved and presented to the user. Retrieval accuracy is also improved by utilizing matching feedback information from the user.
As an improvement on the above-described technique, Patent Document 2 and Non-Patent Document 2 disclose a method of improving the retrieval accuracy by clustering the retrieval target music and rebuilding the feature space by utilizing the clustering result.
    [Patent Document 1] Japanese Patent Application Laid-Open No. 2003-316818
    [Patent Document 2] Japanese Patent Application Laid-Open No. 2006-243887
    [Non-Patent Document 1] K. Hoashi et al.: Personalization of user profiles for content-based music retrieval based on user preferences, Proc. of ACM Multimedia 2003, pp. 110-119, 2003.
    [Non-Patent Document 2] K. Hoashi et al.: Feature space modification method for content-based music retrieval based on user preferences, Proc. of ICASSP 2006, Vol. V, pp. 517-520, 2006.
In all of the above-described conventional techniques, every piece of retrieval target music is compared with the query, and whether each piece matches the user preference is judged based on its similarity to the query. Consequently, the larger the number of pieces of retrieval target music, the longer the processing time of the information retrieval. When the number of pieces of retrieval target music is enormous, it may be difficult to build a practical system.
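The cost of such an exhaustive comparison can be illustrated with a minimal sketch. The cosine similarity measure, the function names, and the toy two-dimensional feature vectors are assumptions made for illustration only; the cited documents may use a different similarity measure and feature representation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_search(query, targets, top_k=2):
    """Compare the query with every target; return the best names and the comparison count."""
    comparisons = 0
    scored = []
    for name, vector in targets.items():
        scored.append((cosine(query, vector), name))
        comparisons += 1  # one similarity computation per retrieval target
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]], comparisons


# Hypothetical toy data: four retrieval targets with two-dimensional features.
targets = {
    "piece_a": [1.0, 0.0],
    "piece_b": [0.8, 0.6],
    "piece_c": [0.0, 1.0],
    "piece_d": [0.6, 0.8],
}
best, comparisons = exhaustive_search([1.0, 0.0], targets)
```

In this sketch the number of similarity computations always equals the number of retrieval targets, so the work per query grows linearly with the size of the collection.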
Furthermore, the above-described conventional techniques assume that samples of a plurality of pieces of music that the user prefers are input as the query. When the acoustic features of the pieces of music included in the query differ significantly from one another, the retrieval accuracy is highly likely to be negatively affected.
For example, when a piece of quiet music and a piece of lively music are input as the preference information, the above-described conventional techniques generate the query by summing the feature vectors of both pieces of music. The query therefore has a feature intermediate between the two pieces, that is, the feature of music that is neither quiet nor lively. Many pieces of music retrieved based on such a query have features dissimilar to the music input by the user, and as a result the retrieval accuracy for the user may deteriorate.
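The effect of summing dissimilar feature vectors can be illustrated with a minimal sketch. The two-dimensional "quiet" and "lively" axes, the cosine similarity measure, and all names below are assumptions chosen for illustration; they are not taken from the cited documents.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def summed_query(samples):
    """Build a single query by summing the feature vectors of all preferred samples."""
    return [sum(column) for column in zip(*samples)]


# Hypothetical features: first axis = "quietness", second axis = "liveliness".
quiet_sample = [1.0, 0.0]
lively_sample = [0.0, 1.0]
query = summed_query([quiet_sample, lively_sample])

# Hypothetical retrieval targets.
targets = {
    "quiet_like": [0.9, 0.1],
    "lively_like": [0.1, 0.9],
    "in_between": [0.5, 0.5],
}

# Rank the targets by similarity to the summed query, best first.
ranked = sorted(targets, key=lambda name: cosine(query, targets[name]), reverse=True)
```

With these toy values the summed query points midway between the two samples, so the target that is neither quiet nor lively ranks first, ahead of the targets that actually resemble what the user supplied.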