1. Field of the Invention
The present invention relates to a fuzzy inference system applied to the fields of information processing and control, and more particularly to a fuzzy rule acquisition method and apparatus for a fuzzy inference system, and a fuzzy inference system using the apparatus.
2. Description of the Related Art
Recently, a growing number of systems using fuzzy inference have been developed, mainly in the control field, such as subway automatic operation systems and tunnel ventilation control systems. Fuzzy inference is a method of estimating an output for an inference input by using fuzzy rules and membership functions derived from human experience and knowledge. The details of fuzzy inference are described, for example, in "Introduction to Applied Fuzzy System", by Toshiro TERAO, Kayoji ASAI, and Michio SUGANO, Ohm Ltd., pp. 36-48. A general fuzzy inference method is defined below with reference to pp. 36-39 of that book. FIG. 1 is a flow chart explaining the operation of fuzzy inference, and FIG. 2 is a schematic diagram explaining the principle of fuzzy inference. In the following description, the two rules given below are used.
Rule 1: IF [(x1 is small) AND (x2 is medium)] THEN (y1 is medium).
Rule 2: IF [(x1 is large) AND (x2 is small)] THEN (y1 is large).
Ri,0: IF (e is Ei) AND (Δe is ΔE0) THEN (Output is Ai), and
R0,j: IF (e is E0) AND (Δe is ΔEj) THEN (Output is Bj),
i, j=(0, 1, -1),
ΔE0: "0", ΔE1: "positive", ΔE-1: "negative",
E0: "0", E1: "positive", E-1: "negative",
A0: "0", A1: "positive and large", A-1: "negative and large", and
B0: "0", B1: "positive and small", B-1: "negative and small".
Ri,j: IF (e is Ei) AND (Δe is ΔEj) THEN (Output is Ai+Bj).
IF (e is positive) AND (Δe is positive) THEN (Output is positive and small+positive and large).
(a) performing a fuzzy inference for at least one input value by using the fuzzy rules of the fuzzy knowledge, and obtaining a result of the execution of the fuzzy inference;
(b) comparing the result of the execution of the fuzzy inference with a teaching value, and obtaining an inference error;
(c) obtaining the errors of the fuzzy rules by using the inference error;
(d) judging whether each of the fuzzy rules is contradictory, based on the errors of corresponding rules of the fuzzy rules; and
(e) modifying a fuzzy rule judged as contradictory to dissolve the contradiction.
a step of judging each pair of fuzzy rules as contradictory fuzzy rules if a ratio between the errors of each pair of fuzzy rules is within a predetermined range and one of the errors is negative and the other of the errors is positive.
a step of modifying the IF part of the fuzzy rule judged as having a larger area by the comparison result.
a step of judging that there is a missing fuzzy rule if the step (d) judges that there is no contradiction for all of the fuzzy rules;
a step of obtaining ones of the errors and grades of the respective propositions based on the inference error; and
a step of, if it is judged that there is a missing fuzzy rule, generating the missing fuzzy rule based on ones of the errors and grades of the respective propositions.
obtaining ones of the errors and grades of the respective propositions based on the inference error;
if it is judged that there is a missing rule, selecting IF parts based on ones of the errors and grades of the propositions of the respective IF parts of the fuzzy rules; and
coupling the selected IF parts to generate the IF part of the missing fuzzy rule.
obtaining the errors of the respective propositions based on the inference error;
if it is judged that there is a missing rule, selecting a THEN part based on the errors of the propositions of the respective THEN parts of the fuzzy rules; and
determining the selected THEN part as the THEN part of the missing fuzzy rule.
a step of judging as a redundant fuzzy rule a fuzzy rule among the fuzzy rules not used for the fuzzy inference, and deleting the fuzzy rule judged as the redundant fuzzy rule.
acquiring a fuzzy rule by using a procedure of tuning at least one of the membership functions and the fuzzy rule acquisition method, wherein the procedure of tuning at least one of the membership functions includes the steps of:
(A) performing a fuzzy inference for at least one input value by using the fuzzy rules of the fuzzy knowledge, and obtaining a result of the execution of the fuzzy inference;
(B) comparing the result of the execution of the fuzzy inference with a teaching value, and obtaining an inference error;
(C) obtaining the errors of the propositions by using the inference error;
(D) obtaining a correction amount of a shape parameter of at least one of the membership functions based on the errors of the propositions; and
(E) tuning the shape of the at least one of the membership functions based on the correction amount.
For the fuzzy inference, first at Step 301 each condition part (IF part) proposition grade is computed for the inference inputs x1 and x2 (corresponding to the process by a block 401 shown in FIG. 2). Then, at Step 302 a MIN operation of the respective condition part proposition grades for each rule is performed to compute a condition part grade (corresponding to a block 402 of FIG. 2). At Step 303 the condition part grade is multiplied by an output proposition to obtain each rule grade (corresponding to a block 403 of FIG. 2). At Step 304 a MAX operation of inference output variables is performed at each section to obtain a total grade (corresponding to a block 404 of FIG. 2). An operation for the center of gravity is performed for the respective total grades to obtain a final inference output (corresponding to a block 405 of FIG. 2).
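The flow of Steps 301 to 304 and the final center-of-gravity operation can be sketched in Python as follows. This is a minimal illustration only, not the implementation of the cited reference: the triangular membership shapes, the [0, 1] universe for x1, x2, and y1, the sampling resolution, and all names (tri, MF, RULES, infer) are assumptions introduced here.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical membership functions on the universe [0, 1].
MF = {
    "small": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "large": (0.5, 1.0, 1.5),
}

# Rule 1 and Rule 2 of the description: (IF part propositions, THEN label).
RULES = [
    ({"x1": "small", "x2": "medium"}, "medium"),
    ({"x1": "large", "x2": "small"}, "large"),
]

def infer(inputs, n=101):
    """Return the inference output y1, or None if no rule fires."""
    if_grades = []
    for cond, _ in RULES:
        # Step 301: grade of each condition part proposition.
        grades = [tri(inputs[var], *MF[label]) for var, label in cond.items()]
        # Step 302: MIN operation gives the condition part grade.
        if_grades.append(min(grades))

    ys = [k / (n - 1) for k in range(n)]  # sampled output universe
    total = []
    for y in ys:
        # Step 303: condition part grade times the output proposition.
        # Step 304: MAX operation over the rules at this point.
        total.append(max(w * tri(y, *MF[then])
                         for w, (_, then) in zip(if_grades, RULES)))

    area = sum(total)
    if area == 0.0:
        return None
    # Center of gravity of the total grade.
    return sum(y * t for y, t in zip(ys, total)) / area
```

With these assumed shapes, the inputs x1=0 and x2=0.5 fire only Rule 1 at full grade, so the center of gravity lies at the peak of "medium".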
A fuzzy rule is simply called a rule hereinafter. Typical techniques of acquiring a rule are described in JP-A-3-88001 "Fuzzy PI Apparatus" (hereinafter called first conventional technique), "ARTIFICIAL_NEURAL_NETWORK_DRIVEN-FUZZY REASONING", by Hideyuki TAKAGI, Isao HAYASHI, Proceedings of International Conference on Fuzzy Logic & Neural Networks IIZUKA '88, pp. 217-218, 1988 (hereinafter called second conventional technique), JP-A-4-127239 "Method of Automatically Tuning Fuzzy Inference Parameter and Method of Displaying Training Conditions" (hereinafter called third conventional technique), and JP-A-5-100859 "Fuzzy Inference Apparatus with Inference Control Mechanism and Training Method" (hereinafter called fourth conventional technique).
According to the first conventional technique, a rule is defined for an area without a rule, by using already prepared rules and membership functions. For example, it is assumed that the following five rules have been defined for fuzzy inference input variables e and Δe.
IF (e is positive) AND (Δe is 0) THEN (Output is positive and large).
IF (e is 0) AND (Δe is 0) THEN (Output is 0).
IF (e is negative) AND (Δe is 0) THEN (Output is negative and large).
IF (e is 0) AND (Δe is negative) THEN (Output is negative and small).
IF (e is 0) AND (Δe is positive) THEN (Output is positive and small).
In the rule description, "0", "positive", and "negative" correspond to membership functions, and "positive and large", "negative and large", "positive and small", and "negative and small" correspond to real numbers. The details of the membership functions and real numbers are omitted herein. According to the definition of fuzzy inference, after the respective total grades are obtained by the MAX operation, an algebraic sum for respective rules is computed to obtain an inference output.
Areas corresponding to these rules are shown in FIG. 3. As shown, there are some areas without a rule.
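The known rules Ri,0 and R0,j are assumed to be given by:
Ri,0: IF (e is Ei) AND (Δe is ΔE0) THEN (Output is Ai), and
R0,j: IF (e is E0) AND (Δe is ΔEj) THEN (Output is Bj),
where
i, j=(0, 1, -1),
ΔE0: "0", ΔE1: "positive", ΔE-1: "negative",
E0: "0", E1: "positive", E-1: "negative",
A0: "0", A1: "positive and large", A-1: "negative and large", and
B0: "0", B1: "positive and small", B-1: "negative and small".
A rule at an empty area is extended as in the following:
Ri,j: IF (e is Ei) AND (Δe is ΔEj) THEN (Output is Ai+Bj).
For example, if (i, j)=(1, 1), the rule is extended to:
IF (e is positive) AND (Δe is positive) THEN (Output is positive and small+positive and large).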
In this manner, even for an area without a defined rule, a new rule can be extended by using already prepared membership functions and rules, and a proper inference output can be obtained which is equivalent to the performance of a general PI control (proportional plus integral control).
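The extension of the known rules Ri,0 and R0,j to every area (i, j) can be sketched as follows. The source omits the real numbers behind "positive and large", "positive and small", and so on, so the values in A and B below are purely hypothetical, as are the names A, B, and extend_rules.

```python
# Hypothetical THEN-part real numbers; the source does not give them.
A = {-1: -1.0, 0: 0.0, 1: 1.0}   # Ai: output of rule Ri,0 ("large")
B = {-1: -0.3, 0: 0.0, 1: 0.3}   # Bj: output of rule R0,j ("small")

def extend_rules(a, b):
    """Build the THEN part of Ri,j as Ai + Bj for every pair (i, j).

    The five known rules are reproduced because A0 = B0 = 0, and the
    four empty corner areas receive a sum of one A term and one B term.
    """
    return {(i, j): a[i] + b[j] for i in a for j in b}

table = extend_rules(A, B)
# Area (1, 1): e is positive AND delta-e is positive.
corner = table[(1, 1)]
```

Because each output is the sum of a term depending only on e and a term depending only on Δe, the completed table has the additive form that the description compares to PI control.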
The second conventional technique concerns a method of acquiring a rule by using the training ability of an artificial neural network (hereinafter simply called a neural network). The structure of the neural network of the second conventional technique is shown in FIG. 4. Reference numeral 607 represents inference inputs, reference numeral 602 represents a neural network for computing the IF part grade (membership function) of each rule by using the inference inputs, and reference numerals 603 to 605 represent neural networks for computing the output value of the THEN part of each rule by using the inference inputs.
A flow chart of acquiring a rule according to the second conventional technique is shown in FIG. 5. For the rule acquisition, at Step 701 an inference output Yi (0≤i≤number of outputs) and inference input Xi (0≤i≤number of inputs) are selected and assigned to each input/output of the neural networks. At Step 702 there is prepared a training data set including inference input data and teaching data which is a desired output.
At Step 703 the prepared training data is divided into clusters corresponding to respective rules by using a known clustering method. Assuming that the training data is divided into r clusters A1, A2, . . . , Ar, the number of rules is r.
At Step 704 the neural network 602 shown in FIG. 4 performs a learning operation for computing the IF part grade of each rule. The neural network 602 has all membership functions of the rules. The training data is given to the input/output of this neural network to perform the learning operation. The neural network training operation is performed by the back propagation method detailed, for example, in "Parallel Distributed Processing", by D. E. Rumelhart, MIT Press, pp. 318-362. Representing the inference input data by Xi and the teaching data by Y*i for the i-th training data, the inference input Xi is applied to the input of the neural network, and the teaching data Wij of the neural network is: ##EQU1## where i represents the i-th training data, and j represents data corresponding to the rule j (cluster Aj).
At Step 705 the neural networks 603 to 605 shown in FIG. 4 perform learning operations for computing the THEN part output values of respective rules. The inference input data and teaching data of each cluster are given to the input and output of the corresponding neural network.
At Steps 704 and 705 the learning operations are completed at the neural network which receives the inference input and outputs the IF part grade of each rule and at the neural networks which output the THEN part of each rule. For the fuzzy inference, a fuzzy input is supplied to the neural network 602 which computes the grade Wi of each IF part, and the neural networks 603 to 605 compute the output Oi of each THEN part. The final inference output is given by the following equation:
Output = ΣWiOi / ΣWi
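The way the final output combines the IF part grades Wi with the THEN part outputs Oi can be sketched as follows. The trained neural networks 602 and 603 to 605 are replaced here by plain Python callables, which is an assumption made only for illustration; the names combine, if_net, and then_nets are hypothetical.

```python
def combine(if_grades, then_outputs):
    """Grade-weighted average: sum(Wi * Oi) / sum(Wi)."""
    total = sum(if_grades)
    if total == 0:
        raise ValueError("no rule has a non-zero IF part grade")
    return sum(w * o for w, o in zip(if_grades, then_outputs)) / total

def infer(x, if_net, then_nets):
    """Run one inference with stand-ins for the trained networks.

    if_net(x) returns the list of grades Wi (one per rule), and each
    entry of then_nets maps x to the THEN part output Oi of its rule.
    """
    grades = if_net(x)
    outputs = [net(x) for net in then_nets]
    return combine(grades, outputs)

# Stand-in "networks" for two rules.
example = infer(
    1.0,
    if_net=lambda x: [0.25, 0.75],
    then_nets=[lambda x: 2.0 * x, lambda x: 4.0 * x],
)
```

In this example the second rule dominates because its grade is three times that of the first, so the output lies closer to 4.0 than to 2.0.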
According to the third conventional technique, the fuzzy inference is expressed by a computation network called FLIP-net (Fuzzy Logic Inference Procedure Network), and a neural network learning method is applied to this network to automatically tune the shape of a membership function.
A FLIP-net for the fuzzy inference using the following two rules is shown in FIG. 6.
Rule 1: IF (x1 is small) AND (x2 is medium) THEN (y1 is medium).
Rule 2: IF (x1 is large) AND (x2 is small) THEN (y1 is large).
Each link indicates the description of a proposition or rule and the flow of inference, forming a graph directed from left to right, and each node corresponds to one of the following five fuzzy inference operations.
1. Computing the grade (membership function) of a proposition.
2. Computing the grade of an IF part.
3. Computing the grade of a rule.
4. Computing the total grade.
5. Computing the center of gravity (output).
Fuzzy inference is performed by traversing the FLIP-net from left to right.
The extended back propagation method (extended BP method) is the back propagation method, a neural network learning method, applied to a FLIP-net. The back propagation method is detailed, for example, in "Parallel Distributed Processing", by D. E. Rumelhart, MIT Press, pp. 318-362. With the extended BP method, after the execution of fuzzy inference, the output value and the teaching value are compared to calculate an output error, and the shape parameter of a membership function is modified so as to reduce the output error. The correction amount of each shape parameter is the partial differential coefficient of the fuzzy inference output with respect to the shape parameter, multiplied by the output error.
If the rule for differentiating a composite function (the chain rule) is used, the partial differential coefficient of the inference output with respect to a shape parameter finally becomes a product of the partial differential coefficients at the respective nodes along the inference path on the FLIP-net. Specifically, in obtaining the correction amount of a shape parameter, the output error obtained by fuzzy inference is propagated back along the path from the output node to the membership function node. In back propagating an output error, the output error is multiplied sequentially by the partial differential coefficients at the respective nodes. The final value propagated back to a membership function is the correction amount of the membership function. It is possible to automatically tune a membership function at a high speed by computing the correction amount on a FLIP-net by the extended BP method.
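The back propagation of an output error to a shape parameter can be sketched on a deliberately reduced network. The sketch is not the FLIP-net of the cited publication: it assumes one input, symmetric triangular membership functions whose only tunable shape parameter is the center, singleton THEN parts combined by a grade-weighted average instead of a center of gravity, and a learning rate named rate. All function names are hypothetical.

```python
def tri(x, center, width):
    """Symmetric triangular membership grade."""
    return max(0.0, 1.0 - abs(x - center) / width)

def dtri_dcenter(x, center, width):
    """Partial derivative of the grade with respect to the center."""
    if x == center or abs(x - center) >= width:
        return 0.0
    return (1.0 if x > center else -1.0) / width

def forward(x, centers, width, consequents):
    """Inference pass: membership nodes, then the output node."""
    grades = [tri(x, c, width) for c in centers]
    total = sum(grades)
    if total == 0.0:
        raise ValueError("input is outside every membership function")
    y = sum(g * o for g, o in zip(grades, consequents)) / total
    return y, total

def corrections(x, target, centers, width, consequents, rate=0.1):
    """Correction amount of each center: error times the product of
    the partial derivatives along the path output -> membership."""
    y, total = forward(x, centers, width, consequents)
    error = target - y
    result = []
    for c, o in zip(centers, consequents):
        dy_dgrade = (o - y) / total               # at the output node
        dgrade_dc = dtri_dcenter(x, c, width)     # at the membership node
        result.append(rate * error * dy_dgrade * dgrade_dc)
    return result

centers, width, consequents = [0.0, 1.0], 1.0, [0.0, 1.0]
x, target = 0.25, 0.5
before, _ = forward(x, centers, width, consequents)
delta = corrections(x, target, centers, width, consequents)
tuned = [c + d for c, d in zip(centers, delta)]
after, _ = forward(x, tuned, width, consequents)
```

Adding the correction amounts moves both centers so that the output for this input shifts toward the teaching value; repeating the step over a training set is what tunes the membership functions.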
An example of the structure of the fourth conventional technique is shown in FIG. 7. This fourth technique is characterized in that while a fuzzy inference is performed, an inference control unit 4405 adaptively changes the membership function shape, the rule weight representing the importance degree of each rule, and the like, in accordance with the input proposition grade. In the example shown in FIG. 7, a rule weight stored in a rule weight storage unit 4407 is changed in accordance with an input proposition grade computed by an input proposition grade computation unit 4403. The relationship between the input proposition grade and the rule weight is learnt by using, as the inference control mechanism, a multi-layer perceptron or single-layer perceptron, each of which is one type of neural network. Examples of the multi-layer perceptron and single-layer perceptron are shown in FIGS. 8 and 9.
In actual learning, after the execution of fuzzy inference, the correction amount of a membership function shape parameter and the correction amount of a fuzzy rule weight are computed by using a FLIP-net and the extended BP method described with the third conventional technique. Learning is performed by giving input proposition grades to input nodes 4601 of the multi-layer perceptron of the inference control unit and by giving the correction amounts to output neurons 4604. If a three-layer neural network is used, learning is performed by the back propagation method.
The neural network and the learning method for the neural network are detailed, for example, in "Parallel Distributed Processing", by D. E. Rumelhart, MIT Press, pp. 318-362.
If fuzzy inference is executed after the neural network learning, the rule weight and the like can be adaptively changed in accordance with an input proposition grade. Fuzzy inference is therefore performed by using a fuzzy rule weight suitable for an input proposition, providing precise inference.
If a single-layer perceptron is used as the neural network and the correlative learning is used as the neural network learning, the weight 4502 of a link of the neural network after the learning indicates the intensity of correlation between an input and output. Therefore, if the input proposition grades of fuzzy inference are applied to the inputs of the neural network and the correction amounts of fuzzy rule weights are applied to the outputs, the weights of links of the neural network indicate the intensities of correlation between input propositions and rules. Accordingly, if input propositions and rules having a high correlation are selected and the input propositions are applied to the IF parts, a new rule can be generated.
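How link weights come to indicate the correlation between input propositions and rules can be sketched as follows. The source does not give the update formula of the correlative learning, so the sketch assumes a plain Hebbian-style update in which each link weight accumulates the product of its input (an input proposition grade) and its output (a rule weight correction amount). The sample data and all names are hypothetical.

```python
def correlation_learning(samples, n_props, n_rules, rate=1.0):
    """Accumulate weights[r][p] += rate * correction[r] * grade[p].

    Each sample is (input proposition grades, rule weight corrections).
    """
    weights = [[0.0] * n_props for _ in range(n_rules)]
    for grades, corrections in samples:
        for r in range(n_rules):
            for p in range(n_props):
                weights[r][p] += rate * corrections[r] * grades[p]
    return weights

def strongest_propositions(weights, k=1):
    """For each rule, the k input propositions with the largest weights."""
    return [
        sorted(range(len(row)), key=lambda p: row[p], reverse=True)[:k]
        for row in weights
    ]

# Hypothetical training samples: 3 input propositions, 2 rules.
samples = [
    ([1.0, 0.0, 0.2], [0.9, 0.0]),
    ([0.9, 0.1, 0.0], [0.8, 0.1]),
    ([0.0, 1.0, 0.8], [0.0, 0.7]),
]
weights = correlation_learning(samples, n_props=3, n_rules=2)
candidates = strongest_propositions(weights, k=2)
```

The propositions listed for a rule are those most strongly correlated with corrections of that rule's weight, and are therefore the candidates to be applied to the IF part of a newly generated rule.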