The present invention relates to a method of architecting a multiple artificial neural network and to an architecting system therefor, which are particularly suitable as means for architecting an expert system.
A multiple neural network for realizing an expert system will be explained hereinbelow by way of example. The method of architecting an expert system most widely adopted at present utilizes inference based upon rules between literals, each represented by a character string (Expert System Architecting Method by Tanaka et al., Published by Personal Media Co., Ltd., 1987). In this method, a great amount of expert knowledge is gathered as rules in the form of "if (premise), then (conclusion)" so as to construct a rule base. Here, the premise and the conclusion are each a logical expression of literals represented by character strings. In forward inference by this system, several facts are first inputted as logical expressions of literals represented by character strings, and these logical expressions are compared with the if (premise) portion of each rule. The conclusion of a rule whose if-portion matches one of the facts is then taken as a newly known fact. The above procedure is repeated until conclusions useful for the user are known as facts or until all applicable rules have been used up.
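The forward inference procedure described above can be sketched as follows. This is a minimal illustration only: the rule representation (a set of premise strings paired with one conclusion string), the function name forward_inference, and the simplification that a premise is a plain conjunction of literals are all assumptions made for this example and are not taken from the cited reference. The sample literals are borrowed from the animal example used later in this description.

```python
# Minimal sketch of rule-based forward inference with strict string matching.
# Assumption: a rule is (premise, conclusion), where the premise is a set of
# literal strings that must ALL be known facts (a plain conjunction).

def forward_inference(rules, facts):
    """Fire rules repeatedly until no unused rule matches the known facts."""
    known = set(facts)
    unused = list(rules)
    progress = True
    while progress and unused:
        progress = False
        for rule in list(unused):
            premise, conclusion = rule
            # Strict pattern matching: each premise string must be identical
            # to a known fact, character for character.
            if premise <= known:
                known.add(conclusion)   # the conclusion becomes a new fact
                unused.remove(rule)     # each rule is used at most once
                progress = True
    return known


rules = [
    (frozenset({"young is raised by milk"}), "is a mammal"),
    (frozenset({"is a mammal", "eats meat"}), "is a carnivore"),
]

# Correctly spelled facts: both rules fire in turn.
derived = forward_inference(rules, {"young is raised by milk", "eats meat"})

# One fact is misspelled ("maet"): the second rule can no longer match.
misspelled = forward_inference(rules, {"young is raised by milk", "eats maet"})
```

With the misspelled fact, the second rule never fires, which illustrates how sensitive strict character-string matching is to small input errors.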
The above-mentioned method involves the following two drawbacks. First, it is extremely difficult to transform expert knowledge into the form of "if (premise), then (conclusion)" and further to gather a necessary and sufficient amount of such knowledge. Secondly, since the comparison between a fact and a rule premise depends upon strict pattern matching of character strings, the matching fails if even a single character of the character string to be compared is erroneous. Further, when vague facts are to be utilized, rather complicated procedures such as calculation of certainty factors or application of fuzzy concepts are inevitably required.
With these facts in mind, it has recently been proposed to utilize an artificial neural network to architect an expert system (S. I. Gallant: Connectionist Expert Systems, Communications of the ACM, vol. 31, No. 2, pp. 152-169, February 1988). In this proposal, a unit neural network is a network formed by mutually connecting an input section composed of a plurality of input neurons (neuron cells are referred to as cells, hereinafter), an output section composed of a plurality of output cells, and a hidden section composed of a plurality of cells, by signal lines referred to as axons (Aso: Neural Network Information Processing, Published by Sangyo Tosho, 1988). By use of such a unit neural network, it is possible to realize an inference system, for example, as follows:
Let A and B each denote any given list of literals:

    A = (A1, A2, ..., An), B = (B1, B2, ..., Bm),
where Ai (i=1, 2, ..., n) and Bj (j=1, 2, ..., m) represent literals. Let us assume that A and B have no common element; that is, a literal that is an element of B is not an element of A. Each literal takes one of the logical values 1 (true), -1 (false), or 0 (ambiguous). The list of the values of A1, A2, ..., An is referred to as a concrete instance of A, and the list of the values of B1, B2, ..., Bm is referred to as a concrete instance of B. A concrete instance corresponds to a conjunction stating the truth or falsity of each literal. For example, one of the A's concrete instances
(1, -1, 0, ..., 1) represents the conjunction

    (A1 is true) ∧ (A2 is false) ∧ (A3 is ambiguous) ∧ ... ∧ (An is true),

where ∧ denotes a logical product. A pair a → b of a concrete instance a of A and the corresponding concrete instance b of B is referred to as a case, which represents the fact that if a, then b. When a plurality of such cases are available, it is possible to realize a logical system by allowing a unit neural network provided with n input cells and m output cells to learn the concrete instances of A as input signals and the corresponding concrete instances of B as desired output signals (referred to as teacher signals). Namely, the logical values of the literals are the variables, and lists of these values are the input and output signals of the unit neural network. The unit neural network serving as an inference system can execute forward inference as follows. When any given concrete instance a of A is inputted to the input section, if a concrete instance b of B has been learned as the corresponding teacher signal, b is outputted to the output section; that is, given a, b is concluded. On the other hand, if no concrete instance of B has been learned as the teacher signal corresponding to a, an appropriate concrete instance b' of B is constructed on the basis of the learned cases and outputted to the output section. It is thus possible to obtain an appropriate conclusion b' for a case not directly learned. Further, if the concrete instance of A is ambiguous, that is, if the logical values of A's elements are analog values such as 0.5 (rather than one of 1, 0, -1), appropriate analog values of B's element literals are calculated on the basis of the learned cases to construct a concrete instance b', which is outputted to the output section. Here, if the value of a literal is V and the possibility P that the literal is true is defined as

    P = 0.5 × (V + 1) × 100 (%),

the possibility P is 75% at V = 0.5.

In the conventional method, by contrast, where an inference system is formed from the above-mentioned cases on the basis of rules expressed by character strings, all the available cases are expressed as rules of the following form:

    if (a literal conjunction corresponding to a concrete instance of A)
    then (a literal conjunction corresponding to a concrete instance of B)

These rules are gathered to prepare the rule base, and the afore-mentioned inference is made by use of the rule base.

In the above-mentioned conventional method, however, since concrete instances not included in the premises of the rule base cannot be pattern-matched, it is impossible to obtain a conclusion for concrete instances of A not included in the available cases, unlike the case where a neural network is used. To overcome this problem, more complicated work is required: either all the necessary cases must be incorporated in the rule base, or more general rules must be discovered to reduce the number of rules. Further, when the concrete instances of A are ambiguous, an inference system based upon rules expressed by character strings cannot infer at all, again unlike the case where a neural network is adopted. To overcome these problems, complicated procedures such as calculation of certainty factors or application of fuzzy concepts are required, as already explained.

As described above, it is greatly advantageous to adopt a neural network as an inference system. When such an inference system is utilized in ordinary industrial fields, however, a single unit neural network is insufficient in capacity, so that it is indispensable to use a multiple neural network as explained hereinbelow.

When an inference system is to be architected, the cases available at first are generally in the form

    if (A1=s1) ∧ (A2=s2) ∧ ... ∧ (An=sn), then (Am=sm),

where A=s denotes that the logical value of a literal A is s. In a set of such cases, it is unavoidable that literals appearing on the right side of some cases (on the right side of "then") also appear on the left side of other cases (on the left side of "then"). In this situation, it is necessary to architect a dual neural network in which a unit neural network having the logical values of those literals as output signals and another unit neural network having the same as input signals are connected by the necessary axons. Further, when such dependence relationships between literals extend over many stages, a multiple neural network must be architected.

With reference to FIG. 6, an example of how to architect a simple multiple neural network will be explained. Here, the literals are

    A1 = The young is raised by milk
    A2 = It is a mammal
    A3 = It eats meat
    A4 = It is a carnivore
    A5 = Its skin is material for a samisen (Japanese musical instrument)
    A6 = It is a cat

and the available cases are

    Case 1: if (A1=1), then (A2=1),
    Case 2: if (A2=1) ∧ (A3=1), then (A4=1),
    Case 3: if (A4=1) ∧ (A5=1), then (A6=1).
In the above example, the literal A2 ("It is a mammal") appears on the right side of case 1 and on the left side of case 2, and the literal A4 ("It is a carnivore") appears on the right side of case 2 and on the left side of case 3. Three unit neural networks are therefore required: a unit neural network 1 for learning case 1, which receives the logical value of A1 as an input signal and outputs the logical value of A2 as an output signal; a unit neural network 2 for learning case 2, which receives the logical values of A2 and A3 as input signals and outputs the logical value of A4 as an output signal; and a unit neural network 3 for learning case 3, which receives the logical values of A4 and A5 as input signals and outputs the logical value of A6 as an output signal. Here, the output cell of the unit neural network 1 that outputs the logical value of A2 is used in common as the input cell of the unit neural network 2 that receives the logical value of A2, and the output cell of the unit neural network 2 that outputs the logical value of A4 is used in common as the input cell of the unit neural network 3 that receives the logical value of A4. The entire system is thus a multiple neural network composed of three unit neural networks.
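The sharing of cells in this three-case example can be checked with a short bookkeeping sketch. The sketch assumes one unit neural network per case, as in the example above; the dictionary layout and the function names shared_cells and external_inputs are hypothetical and chosen only for illustration. It merely tabulates the hand-built structure of the example and is not a description of any architecting method.

```python
# Sketch: tabulate which literals act as cells shared between unit networks.
# Assumption: one unit network per case; each case is written as
# (input literals, output literal), keyed by the unit network number.

cases = {
    1: (("A1",), "A2"),
    2: (("A2", "A3"), "A4"),
    3: (("A4", "A5"), "A6"),
}

def shared_cells(cases):
    """Map each literal that is the output of one unit network and an input
    of another to (producing network, list of consuming networks)."""
    producer = {out: net for net, (_, out) in cases.items()}
    shared = {}
    for net, (inputs, _) in cases.items():
        for lit in inputs:
            if lit in producer and producer[lit] != net:
                shared.setdefault(lit, (producer[lit], []))[1].append(net)
    return shared

def external_inputs(cases):
    """Literals that no unit network outputs, so they must be given as facts."""
    outputs = {out for _, out in cases.values()}
    all_inputs = {lit for inputs, _ in cases.values() for lit in inputs}
    return sorted(all_inputs - outputs)

links = shared_cells(cases)      # A2 links network 1 to 2, A4 links 2 to 3
inputs = external_inputs(cases)  # literals supplied from outside the system
```

Even for three cases the dependence has to be tracked literal by literal, which hints at how quickly this bookkeeping grows when it is done by hand.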
Multiple neural networks such as the above have so far been architected by hand, relying on human intuition. However, the architecting work becomes more and more complicated as the number of cases, the number of literals included in each case, and the complexity of the dependence relationships between the cases and the literals increase. Many hours of expert work are therefore required when the structure of the multiple neural network must be modified markedly in order to add cases to a system already in operation. Further, when the number of cases and the complexity exceed certain limits, it may become impossible to architect the multiple neural network by the prior-art method described above.
With these problems in mind, therefore, the object of the present invention is to provide a method of architecting a multiple neural network, and a system therefor, by which the network can be constructed easily irrespective of the quantity of cases and the complexity of the dependence relationships between cases.