1. Field of the Invention
The present invention relates to an information processing apparatus, an information processing method, a program for implementing an information processing method, an information processing system, and a method for an information processing system, and more particularly, to an information processing apparatus, an information processing method, a program for implementing an information processing method, an information processing system, and a method for an information processing system, in which information indicating preference of a user is used in calculation of similarity to properly select a content to be recommended to the user.
2. Description of the Related Art
It is known to select a content, such as a television or radio broadcast program, that matches the preference of a user on the basis of content information such as an EPG (Electronic Program Guide), and to recommend the selected content to the user. Hereinafter, a television or radio broadcast program will be referred to simply as a program unless it could be confused with a computer program. Various methods are known for acquiring information indicating the preference of a user, and a content is recommended to the user in various manners depending on the method used to acquire that information. For example, television broadcast programs viewed by a user are logged, and a program is recommended based on the log data.
In this technique, each time a program is viewed by a user, meta data associated with the viewed program is stored. When the amount of stored meta data has reached a particular level, a weight is assigned to each item of meta data (or to data indicating an attribute that is common to a plurality of items of program meta data) depending on the frequency of occurrence or by means of the tf/idf method. A vector is then produced for each program such that the elements of the vector are given by the respective weights (hereinafter, such a vector will be referred to as a feature vector). Furthermore, a vector indicating the preference of the user (hereinafter referred to as a user preference vector) is produced based on one or more feature vectors. In the conventional technique based on program viewing log data, the user preference vector is used as information indicating the preference of the user. The similarity of a content meta vector associated with a candidate program (a vector whose elements are given by weights assigned to program meta data of the candidate program) relative to the user preference vector is then calculated. If it is determined that the similarity is high, the candidate program is recommended to the user.
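The flow described above can be illustrated with a minimal sketch. The description does not fix the details, so several choices here are assumptions made only for illustration: a smoothed tf/idf weight, a user preference vector taken as the mean of the feature vectors, cosine similarity as the similarity measure, and hypothetical meta data terms and program names.

```python
import math
from collections import Counter

def build_idf(viewed):
    """Smoothed inverse document frequency for every term in the viewing log (assumed variant)."""
    n = len(viewed)
    df = Counter(term for terms in viewed for term in set(terms))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def feature_vector(terms, vocab, idf):
    """Vector whose elements are tf * idf weights, one element per vocabulary term."""
    tf = Counter(terms)
    # Terms outside the vocabulary are ignored; missing terms get weight 0.
    return [tf[term] * idf[term] for term in vocab]

def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical viewing log: meta data terms of each viewed program.
viewed = [
    ["drama", "romance", "actor_a"],
    ["drama", "mystery", "actor_a"],
    ["news", "politics"],
]
idf = build_idf(viewed)
vocab = sorted(idf)
features = [feature_vector(terms, vocab, idf) for terms in viewed]

# Assumption: the user preference vector is the element-wise mean of the feature vectors.
user_preference = [sum(column) / len(features) for column in zip(*features)]

# Hypothetical candidate programs and their content meta vectors.
candidates = {
    "drama_special": ["drama", "actor_a", "romance"],
    "cooking_show": ["cooking", "variety"],
}
scores = {
    name: cosine(feature_vector(terms, vocab, idf), user_preference)
    for name, terms in candidates.items()
}
```

In this sketch a candidate whose meta data shares no term with the viewing log scores zero, while a candidate that overlaps with frequently viewed programs scores higher and would be the one recommended.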
In this conventional technique based on the program viewing log data, the number of dimensions (the number of elements) of the user preference vector increases, and thus the complexity of operation increases, as the number of types of contents (TV programs), that is, the number of items of program meta data, increases. To solve this problem, various techniques are known to reduce the number of dimensions.
For example, it is known to project all vectors onto optimum bases (axes) by means of singular value decomposition (principal component analysis) thereby reducing the number of dimensions (for example, refer to (1) Japanese Unexamined Patent Application Publication No. 2001-155063, (2) “Computer information retrieval using latent semantic structure”, U.S. Pat. No. 4,839,853, Jun. 13, 1989, (3) “Computerized cross-language document retrieval using latent semantic indexing”, U.S. Pat. No. 5,301,109, Apr. 5, 1994).
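As a rough illustration of this kind of dimension reduction, the sketch below projects feature vectors onto the leading singular directions of a small matrix. It is not the method of the cited publications: the power iteration with deflation, the function names, and the sample data are assumptions chosen so that the example runs with the standard library alone, and the data is not mean-centered. In practice a library routine such as `numpy.linalg.svd` would normally be used instead.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_axes(rows, k, iterations=200):
    """Approximate the k leading right singular vectors of `rows`
    by power iteration with deflation on the Gram matrix A^T A."""
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    axes = []
    for idx in range(k):
        # Deterministic start vector; assumed not orthogonal to the target axis.
        v = [1.0 / (i + idx + 1) for i in range(d)]
        for _ in range(iterations):
            w = [dot(row, v) for row in gram]
            # Deflation: remove the components along axes already found.
            for axis in axes:
                p = dot(w, axis)
                w = [x - p * y for x, y in zip(w, axis)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                raise ValueError("matrix rank is lower than k")
            v = [x / norm for x in w]
        axes.append(v)
    return axes

def project(rows, axes):
    """Coordinates of each row in the reduced space spanned by `axes`."""
    return [[dot(row, axis) for axis in axes] for row in rows]

# Hypothetical 4-dimensional feature vectors forming two groups of programs.
rows = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 3.0, 1.0],
]
axes = top_axes(rows, k=2)
reduced = project(rows, axes)  # each program is now described by 2 numbers instead of 4
```

Here each of the two retained axes corresponds to one group of programs, so similarity can be calculated on two-element vectors rather than on the full four-element vectors.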
However, in the conventional technique based on the program viewing log data, the user preference vector (used as a reference in calculation of similarity of a candidate program) does not necessarily indicate the correct preference of the user, and thus a content (TV program) recommended based on the user preference vector is often rejected by the user.