According to Abe et al., active learning is a form of learning in which a learner can actively select its learning data ([1] Naoki Abe, Hiroshi Mamitsuka, “Nodou Gakushu to Hakken Kagaku (Active Learning and Discovery Science),” in “Hakken Kagaku to Deta Mainingu (Discovery Science and Data Mining),” edited by Shinichi Morishita, Satoru Miyano, Kyoritsu Shuppan, Jun. 2001, ISBN 4-320-12018-3, pp. 64-71). It is generally known that performing learning actively can improve learning efficiency in terms of both the count of data and the amount of computation. A system which performs active learning is called an active learning system. Consider, for example, a learning system which statistically analyzes collected data and predicts results for data having unknown label values from tendencies in past data. Active learning can be applied to such a learning system. In the following, a general description will be given of this type of active learning system.
Assume that there exist data having unknown label values and data having known label values. Learning is performed with the data having known label values, and the result of the learning is applied to the data having unknown label values. At this time, the learning system selects, from the data having unknown label values, data with which the learning can be efficiently performed, and delivers the selected data. The delivered data is subjected to an experiment or an investigation to derive its results, that is, its label values. The results are entered and merged with the data having known label values, and the learning is then performed again in a similar manner. The data from which the results have been derived are deleted from the set of data having unknown label values. The active learning system repeatedly performs these operations.
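The cycle described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names (`active_learning_loop`, `train_threshold`, `select_nearest`), the one-dimensional threshold learner, and the toy data are all hypothetical and do not come from reference [1] or from any particular system.

```python
def active_learning_loop(known, unknown, train, select, oracle, rounds):
    """Repeat the learn / select / investigate / merge cycle.

    known   -- list of (datum, label) pairs with known label values
    unknown -- list of data with unknown label values
    train   -- hypothetical callable: known -> model
    select  -- hypothetical callable: (model, unknown) -> datum to investigate
    oracle  -- stands in for the experiment or investigation: datum -> label
    """
    known, unknown = list(known), list(unknown)
    for _ in range(rounds):
        if not unknown:
            break
        model = train(known)                # learn from data with known labels
        candidate = select(model, unknown)  # choose the datum to deliver
        label = oracle(candidate)           # experiment/investigation result
        known.append((candidate, label))    # merge with the known data
        unknown.remove(candidate)           # delete from the unknown set
    return known, unknown


# Toy usage (assumption): learn a threshold on a single numeric attribute.
def train_threshold(known):
    positives = [x for x, y in known if y]
    negatives = [x for x, y in known if not y]
    return (min(positives) + max(negatives)) / 2


def select_nearest(threshold, unknown):
    # Query the datum closest to the current decision boundary.
    return min(unknown, key=lambda x: abs(x - threshold))


known, unknown = active_learning_loop(
    known=[(0, False), (100, True)],
    unknown=[10, 20, 30, 40, 50, 65, 80, 90],
    train=train_threshold,
    select=select_nearest,
    oracle=lambda x: x >= 60,
    rounds=3,
)
```

In this toy run the three queried data are 50, 80, and 65, in that order, so the loop concentrates its queries around the true boundary rather than sampling the unknown set uniformly.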
Data is expressed in the following manner. Each data item is described with a plurality of attributes and a so-called label. For example, “golf” is a well-known evaluation data set. It determines whether or not golf should be played, and each data item is described by four items: weather, temperature, humidity, and wind. The weather takes a value “fair,” “cloudy,” or “rainy,” while the wind takes a value “present” or “absent.” The temperature and humidity are real values. For example, one data item is described as: weather: fair, temperature: 15° C., humidity: 40%, wind: absent, play: done. In this data item, the four items, weather, temperature, humidity, and wind, are called attributes. The result, play done or not done, is called a label. In this description, when the possible values of the label are discrete, the value is particularly called a “class.”
Now, a variety of terms will be defined.
Suppose that the label is binary. Of the two values, the label of interest is called a positive instance, while the other is called a negative instance. With a multi-value label, the one label of interest is a positive instance, while all the others are negative instances. When a label can take a continuous value, a label value located near a value of interest is called a positive instance, while one located elsewhere is called a negative instance.
Indexes for measuring the accuracy of learning include an ROC (receiver operating characteristic) curve, a hit rate, a transition in correct answer rate and the like. In the following description, these three indexes are used to make evaluations.
The ROC curve is defined in the following manner:
Horizontal Axis: (Count of Data Determined to be Positive Instances within Negative Instances)/(Total Count of Negative Instances),
Vertical Axis: (Count of Data Determined to be Positive Instances within Positive Instances)/(Total Count of Positive Instances).
When a random prediction is made, the ROC curve is the diagonal which connects the origin with (1, 1).
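The two axes defined above can be computed by sweeping a decision threshold over prediction scores. The following is a minimal sketch; the function name `roc_points` and the assumption that each datum carries a numeric score (higher meaning "more likely positive") are illustrative and are not taken from the source.

```python
def roc_points(scores, labels):
    """Return ROC points as (horizontal, vertical) pairs.

    horizontal = negatives determined to be positive / total negatives
    vertical   = positives determined to be positive / total positives
    Assumes at least one positive and one negative instance.
    """
    total_pos = sum(1 for y in labels if y)
    total_neg = len(labels) - total_pos
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for i, (score, is_positive) in enumerate(ranked):
        if is_positive:
            true_pos += 1
        else:
            false_pos += 1
        # Emit a point only after all data sharing this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((false_pos / total_neg, true_pos / total_pos))
    return points


curve = roc_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
# curve == [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```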
The hit rate is defined in the following manner:
Horizontal Axis: (Count of Data Having Known Label Values)/{(Count of Data Having Unknown Label Values)+(Count of Data Having Known Label Values)},
Vertical Axis: (Count of Positive Instances within Data Having Known Label Values)/(Total Count of Positive Instances).
When a random prediction is made, the hit rate is the diagonal which connects the origin with (1, 1). The limit, which is reached when every selected data item is a positive instance, is the line which connects the origin with ((Count of Positive Instances)/{(Count of Data Having Unknown Label Values)+(Count of Data Having Known Label Values)}, 1).
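As an illustration of the hit rate, the sketch below traces the curve from the order in which data items became known. The function name `hit_rate_curve` and the input convention (one true label per data item, listed in selection order and covering the whole data set) are assumptions made for this example.

```python
def hit_rate_curve(labels_in_selection_order):
    """Return hit-rate points as (horizontal, vertical) pairs.

    horizontal = known data / (unknown data + known data)
    vertical   = positives among the known data / total positives
    The input lists the true label of every datum, in the order in which
    the data were selected and labeled (an assumption of this sketch).
    """
    total = len(labels_in_selection_order)
    total_pos = sum(1 for y in labels_in_selection_order if y)

    points = [(0.0, 0.0)]
    found = 0
    for known_count, is_positive in enumerate(labels_in_selection_order, start=1):
        if is_positive:
            found += 1
        points.append((known_count / total, found / total_pos))
    return points


# Best case: both positives are selected first, so the curve reaches 1
# at horizontal position 2/4 = 0.5, which is the limit line.
best = hit_rate_curve([True, True, False, False])
# best == [(0.0, 0.0), (0.25, 0.5), (0.5, 1.0), (0.75, 1.0), (1.0, 1.0)]
```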
The transition in correct answer rate is defined in the following manner:
Horizontal Axis: Count of Data Having Known Label Values,
Vertical Axis: (Count of Correctly Determined Data)/(Count of Data Having Known Label Values).
In “Best Mode for Carrying out the Invention” later described, an active learning system according to the present invention is evaluated using these indexes (see FIGS. 3A to 3C, 5, 7, 9, 11, 13A, 13B, 15A, 15B, and 18).
Entropy is defined in the following manner. Assume that each P_i indicates the probability of being i. Then:
Entropy = −(P_1*log(P_1) + P_2*log(P_2) + . . . + P_n*log(P_n))
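The entropy definition can be evaluated directly, as in the minimal sketch below. The source does not state the base of the logarithm, so the `base` parameter (defaulting to the natural logarithm) is an assumption, as is the convention of treating terms with zero probability as contributing zero.

```python
import math


def entropy(probabilities, base=math.e):
    """Entropy = -(P_1*log(P_1) + ... + P_n*log(P_n)).

    Terms with P_i == 0 are skipped, following the usual convention
    that 0*log(0) is taken as 0 (an assumption of this sketch).
    """
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


uniform_bits = entropy([0.5, 0.5], base=2)  # two equally likely values
certain = entropy([1.0])                    # no uncertainty at all
```

Entropy is largest when all values are equally probable and is zero when one value is certain.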
A conventional active learning system is disclosed in JP-A-11-316754 [2]. In order to improve learning accuracy, the active learning system disclosed in this publication is characterized by performing: a learning step of causing a lower-level algorithm to perform learning; a boosting step of improving the learning accuracy through boosting; a step of predicting function values for a plurality of candidate input points; and an input point specifying step of selecting the input point which presents the smallest difference between the weighted sum of output values having the largest sum total of weights and the weighted sum of output values having the next largest sum total of weights.
Abe et al. [1] further disclose an approach using a system which comprises a plurality of learning machines. Each learning machine randomly samples data and learns from its sample. The respective learning machines then make predictions for the data having unknown label values, and the point at which the variance of the predictions is maximized is delivered as the point which should be learned next.
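The variance-based selection described above might look like the following sketch. It is not the implementation of reference [1]: the function name `committee_query`, the `train` callable, sampling without replacement, and the use of population variance over numeric predictions are all assumptions made for illustration.

```python
import random
from statistics import pvariance


def committee_query(known, unknown, train, n_machines, sample_size, rng):
    """Pick the unknown datum on which the learning machines disagree most.

    known  -- list of (datum, label) pairs with known label values
    train  -- hypothetical callable: sample of known data -> predictor,
              where predictor(datum) returns a numeric prediction
    rng    -- a random.Random instance used for the random sampling
    """
    # Each learning machine learns from its own random sample.
    committee = [
        train(rng.sample(known, sample_size)) for _ in range(n_machines)
    ]

    def prediction_variance(datum):
        return pvariance([predict(datum) for predict in committee])

    # Deliver the point at which the variance of predictions is largest.
    return max(unknown, key=prediction_variance)


# Toy usage (assumption): each machine predicts the label of the nearest
# datum in its own sample.
def train_nearest(sample):
    return lambda x: min(sample, key=lambda pair: abs(pair[0] - x))[1]


query = committee_query(
    known=[(0, 0), (1, 0), (2, 0), (8, 1), (9, 1), (10, 1)],
    unknown=[0.5, 5.0, 9.5],
    train=train_nearest,
    n_machines=5,
    sample_size=3,
    rng=random.Random(0),
)
```

Which point is returned depends on the random samples drawn, so no particular output is claimed for the toy usage.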