1. Field of the Invention
The present invention relates to an inference apparatus and method for predicting a class of a new instance whose class is unknown, or for generating a rule which is common to an instance set.
2. Description of the Related Art
In recent years, inference systems that output a solution to a user's question have been created in various technological areas. For example, in the medical field, an "expert system" is used instead of a doctor to diagnose a patient's disease. A doctor inputs the condition of the patient to the expert system, and the expert system predicts the name of the disease according to medical knowledge. In this case, the expert system stores in advance rules based on that knowledge, such as if-then rules, and arrives at the solution by using the rules.
To form the rules, the user inputs an instance set (plural instances) to the expert system in advance, and the expert system generates rules from the instance set. In short, it is important for the inference system to generate correct rules from the instance set and to correctly predict the answer to a question according to those rules.
As one method of predicting the answer to a question by using the instance set, statistical multiple regression analysis is well known. In multiple regression analysis, a regression equation which represents the tendency of the instance set is determined, and the values of the new instance (question) are assigned to the variables of the regression equation to predict the answer for the new instance. FIG. 1 shows one example of an instance set used to generate a regression equation. In FIG. 1, one instance is comprised of attributes (a person's stature and weight) and a class (health condition). According to the relation between stature and weight, class "O" is marked if the person is in normal health condition and class "X" is marked if the person is in abnormal health condition. FIG. 2 shows a graph of weight versus stature on which are plotted the points relating stature and weight, together with the regression equation which represents the tendency of the group of points. If the values of the stature and weight of a patient are assigned to the variables of the regression equation "f(X) = ax + by + c", it is decided whether or not the patient is in normal health condition. However, in multiple regression analysis, it is difficult to generate rules from the regression equation. Furthermore, attribute values are restricted to continuous values, although discrete values are also necessary for some attributes. Multiple regression analysis is therefore insufficient as a means of acquiring knowledge for the inference system.
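The regression approach described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the sample values are hypothetical and are not the data of FIG. 1, the fit is simplified to a single line (weight as a function of stature) rather than the two-variable equation, and the tolerance used to separate "O" from "X" is an invented threshold, since the text does not state how the decision is made.

```python
# Hypothetical (stature_cm, weight_kg) instances; NOT the data of FIG. 1.
samples = [(150.0, 50.0), (160.0, 60.0), (170.0, 70.0), (180.0, 80.0)]


def fit_line(points):
    """Ordinary least-squares fit of weight = slope * stature + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_class(stature, weight, slope, intercept, tolerance=10.0):
    """Return "O" if the weight lies within `tolerance` kg of the fitted line,
    otherwise "X". The tolerance is an assumed threshold, not from the source."""
    residual = weight - (slope * stature + intercept)
    return "O" if abs(residual) <= tolerance else "X"


slope, intercept = fit_line(samples)
print(predict_class(170.0, 72.0, slope, intercept))
print(predict_class(170.0, 95.0, slope, intercept))
```

The fitted coefficients summarize the tendency of the instances but are not themselves if-then rules, which is the limitation noted above.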
As a method of classifying a new instance, discriminant analysis is well known. FIG. 3 shows a graph on which normal points "O" and abnormal points "X", representing relations between stature and weight, are plotted, together with the decision standard which discriminates "O" from "X" on the graph. However, in discriminant analysis, a specialist must determine the decision standard by observing the plotted points on the graph. Therefore, the decision standard reflects the specialist's subjectivity and is sometimes incorrect.
On the other hand, in the field of machine learning, a decision tree is determined from the instance set and the answer for a new instance (question) is predicted according to the decision tree ("Induction of Decision Trees", Machine Learning 1, pp. 81-106, 1986). FIG. 4 shows an instance set used to determine the decision tree. In FIG. 4, one instance is comprised of attributes (the weather, wind and humidity of each day) and a class (for example, whether or not the day is suitable for fishing). FIG. 5 shows the decision tree which represents the relation between the attributes and the class shown in FIG. 4. In FIG. 5, if the weather is rain, the day (today) is not suitable for fishing. If the weather is fine and the wind is strong, the day is not suitable for fishing. If the weather is fine and the wind is weak, the day is suitable for fishing. If the weather is cloudy and the humidity is low, the day is suitable for fishing. If the weather is cloudy, the humidity is high and the wind is weak, the day is suitable for fishing. If the weather is cloudy, the humidity is high and the wind is strong, the day is not suitable for fishing.
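As an illustration, the decision tree described for FIG. 5 can be written down as a nested structure and traversed to classify a new instance. This is a minimal sketch: the branch values come from the description above, while the data layout and the names TREE, classify and extract_rules are hypothetical choices made here, not part of the cited method.

```python
# Each internal node is (attribute, {value: subtree}); each leaf is a class label.
TREE = ("weather", {
    "rain": "not suitable",
    "fine": ("wind", {"strong": "not suitable", "weak": "suitable"}),
    "cloudy": ("humidity", {
        "low": "suitable",
        "high": ("wind", {"weak": "suitable", "strong": "not suitable"}),
    }),
})


def classify(tree, instance):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    node = tree
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[instance[attribute]]
    return node


def extract_rules(tree, conditions=()):
    """List every root-to-leaf path as a (conditions, class) pair."""
    if isinstance(tree, str):
        return [(conditions, tree)]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules


today = {"weather": "cloudy", "humidity": "high", "wind": "strong"}
print(classify(TREE, today))
for conditions, label in extract_rules(TREE):
    print(" and ".join(f"{a} is {v}" for a, v in conditions), "->", label)
```

Each root-to-leaf path corresponds to one of the six if-then statements listed above. Note that every test in this sketch compares a discrete attribute value.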
If the number of attributes is large, the shape of the decision tree becomes more complicated. Therefore, it is difficult for a user to extract the relations between attributes from the decision tree. Furthermore, attribute values are restricted to discrete values, although continuous values may also be necessary for some attributes. In short, it is difficult for the inference system to generate correct rules as knowledge from the decision tree.