In a retrieval system, the ranking of search result documents is important for finding a target document quickly. A generally known ranking technique ranks documents higher when they have been favorably evaluated by a large number of evaluators.
A problem with this ranking technique is that documents that have rarely or never been evaluated, such as recently created documents, will be unfairly ranked lower (or higher). In view of this, a technique is known in which the evaluation value of a document is estimated from feature values, such as the author or the creation date/time of the document, through learning that uses a log of user evaluations of documents, and the estimated evaluation values are used in ranking.
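As a minimal illustration only, the estimation of evaluation values from feature values can be sketched as follows. The sketch assumes a single feature (the author) and uses the mean logged rating per author as the learned estimate, with the overall mean as a fallback for unseen authors; the function names, the data layout, and the averaging rule are hypothetical and are not taken from the known technique itself.

```python
from collections import defaultdict

# Illustrative sketch only: a per-author mean stands in for the unspecified
# learning algorithm, and all names and data below are hypothetical.

def learn_from_log(evaluation_log):
    """Learn a mean rating per author, plus an overall mean, from (author, rating) pairs."""
    ratings_by_author = defaultdict(list)
    for author, rating in evaluation_log:
        ratings_by_author[author].append(rating)
    author_means = {a: sum(r) / len(r) for a, r in ratings_by_author.items()}
    overall_mean = sum(r for _, r in evaluation_log) / len(evaluation_log)
    return author_means, overall_mean

def estimate_evaluation(author, author_means, overall_mean):
    """Estimate the evaluation value of a document from its author feature."""
    return author_means.get(author, overall_mean)

def rank_documents(documents, author_means, overall_mean):
    """Sort documents by estimated evaluation value, highest first."""
    return sorted(
        documents,
        key=lambda d: estimate_evaluation(d["author"], author_means, overall_mean),
        reverse=True,
    )

# Hypothetical evaluation log and newly created, unevaluated documents.
evaluation_log = [("alice", 5), ("alice", 4), ("bob", 2)]
new_documents = [
    {"id": "d1", "author": "alice"},
    {"id": "d2", "author": "bob"},
    {"id": "d3", "author": "carol"},  # author absent from the log
]

author_means, overall_mean = learn_from_log(evaluation_log)
ranked_ids = [d["id"] for d in rank_documents(new_documents, author_means, overall_mean)]
```

In this sketch, a document that has never been evaluated still receives a score derived from its feature values, so it is not pushed to the bottom of the ranking merely for lack of evaluations.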
However, these feature values may be missing due to factors such as recording omissions. When feature values are missing in this way, learning is not possible with an ordinary learning algorithm. In response to this problem, Patent Document 1 describes an example of a learning system that handles training data in which feature values are missing.
As shown in FIG. 14, a learning system 10 described in Patent Document 1 is constituted by a missing value supplementing unit 11 and a prediction model learning unit 12, and operates as follows. First, the missing value supplementing unit 11 receives input of training data containing both sample documents in which feature values are missing and sample documents in which no feature values are missing. Then, using the sample documents in which no feature values are missing as inputs, the missing value supplementing unit 11 learns a function for estimating the missing feature values from the other feature values.
Next, the missing value supplementing unit 11 supplements the missing feature values using the learned function, and outputs the set of sample documents in which the missing values have been supplemented to the prediction model learning unit 12. The prediction model learning unit 12 then learns a function for estimating a target variable based on feature values, using training data containing the sample documents in which the missing values have been supplemented. In this way, when the training data includes sample documents in which feature values are missing, the learning system 10 described in Patent Document 1 learns the function for estimating the target variable after supplementing the missing feature values.
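The two-stage operation described above can be sketched as follows, purely for illustration. The sketch assumes two numeric feature values per sample document, of which only the second may be missing (represented as None), and it uses linear least squares for both the supplementing function and the prediction model. Patent Document 1 is not stated here to use these particular algorithms; the function names and the data are likewise hypothetical.

```python
# Illustrative sketch only: linear least squares stands in for the unspecified
# learning algorithms, and all names and data below are hypothetical.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; feature values are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_linear(X, y):
    """Least-squares fit; returns the feature weights followed by the intercept."""
    Xb = [list(row) + [1.0] for row in X]
    d = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(d)] for i in range(d)]
    Xty = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(d)]
    return solve(XtX, Xty)

def predict(w, x):
    """Apply a model returned by fit_linear to one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]

def supplement_missing(samples):
    """Role of unit 11: learn x2 from x1 on complete samples, then fill in missing x2."""
    complete = [s for s in samples if s[1] is not None]
    w = fit_linear([[s[0]] for s in complete], [s[1] for s in complete])
    return [(x1, x2 if x2 is not None else predict(w, [x1]), y)
            for x1, x2, y in samples]

def learn_prediction_model(samples):
    """Role of unit 12: learn the target variable from the supplemented feature values."""
    return fit_linear([[x1, x2] for x1, x2, _ in samples],
                      [y for _, _, y in samples])

# Each sample is (x1, x2, target); x2 is missing in the last two samples.
training_data = [
    (1.0, 3.0, 2.0),
    (2.0, 4.0, 3.0),
    (3.0, 8.0, 6.0),
    (4.0, 9.0, 7.0),
    (5.0, None, 9.0),
    (6.0, None, 10.0),
]

supplemented = supplement_missing(training_data)
model = learn_prediction_model(supplemented)
estimate = predict(model, [3.5, 8.5])
```

Note that in this arrangement the supplementing function is learned without reference to the target variable, and the prediction model treats supplemented values exactly like observed ones.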