The present invention relates generally to computer-implemented artificial intelligence systems. More particularly, the present invention relates to computer-implemented neural networks and expert systems.
Despite their success in diverse application domains for difficult classification tasks, feedforward networks are still viewed as black boxes by many users. This is because the internal representation learned by a feedforward network for a classification task is not transparent, i.e., the knowledge embodied in the network's numerical weights is not easily comprehensible. According to Minsky, the opacity of a trained network's internal representation is a crucial problem, as many real-world applications require multiple representation schemes for acquired knowledge (Minsky, M. "Logical versus analogical or symbolic versus connectionist or neat versus scruffy," AI Magazine, Vol 12, pp 34-51, 1991). Furthermore, users of neural networks cannot gain a better understanding of a classification task learned by a network when the knowledge acquired through training remains incomprehensible.
To alleviate the problem of opacity, several schemes in the literature have been suggested. These schemes fall into one of two categories. The first category consists of schemes as discussed for example by Fu and by Gallant that attempt to explain the knowledge stored in connection weights of a trained network by a set of rules for explaining the output in terms of inputs (see, Fu, L. "Rule generation from neural networks," IEEE Trans. Systems, Man, and Cybernetics, Vol 24, pp 1114-1124, 1994; and Gallant, S. I. "Connectionist Expert Systems," Comm. ACM, Vol 31, pp 152-169, 1988). While these schemes make the transference of knowledge embodied in a network into another representation possible, they fail to bring out the internal representation learned by hidden units by stressing only the input-output behavior.
The second category of schemes is more focused on hidden units and tries to bring out the features developed by internal units as a result of training. The Hinton diagram, which relies on a pictorial representation of connection weights to represent the concepts learned by internal neurons, is an example of this category (see, Hinton, G. E. "Connectionist learning procedures," Artificial Intelligence, Vol 40, pp 185-234, 1989).
The methods described by Saito and Nakano, as well as by Fu, are based on a heuristic search which is conducted separately for positive and negative weights (see, Saito, K. and Nakano, R. "Medical diagnostic expert system based on PDP model," Proc. IEEE Int'l Conf. Neural Networks, Vol I, pp 255-262, 1988; and Fu, L. "Rule generation from neural networks," IEEE Trans. Systems, Man, and Cybernetics, Vol 24, pp 1114-1124, 1994). Moreover, these methods require a set of training examples, in addition to the numerical weights, in order to uncover the knowledge stored in a network.
The present invention overcomes these and other disadvantages found in previous approaches. The present invention extends the symbolic mapping to a multiple-valued logic (MVL) representation of a neuron. This extension to MVL representation is desirable due to several factors. First, multivalued inputs, e.g., an eye color attribute (brown, black, and blue) or a blood pressure status (low, normal, and high), are natural in many applications, and thus MVL representation is the most appropriate representation. Second, MVL representation can be easily used to deal with continuous inputs by multilevel quantization. Third, multivalued logic provides a compact representation even for neurons with binary inputs by grouping several binary features into a single multivalued feature. Moreover, the symbolic mapping process of the present invention uncovers the algorithm learned by a network without requiring the set of training examples that might have been used.
In accordance with the teachings of the present invention, a computer-implemented apparatus and method is provided for generating a rule-based expert system from a trained neural network which is expressed as network data stored in a computer-readable medium. The rule-based expert system represents an interconnected network of neurons with associated weights data and threshold data. A network configuration extractor is provided for accessing the network data and for ascertaining the interconnection structure of the trained neural network by examining the network data.
A transformation system is utilized to alter the algebraic sign of at least a portion of the weights data to eliminate differences in the algebraic sign among the weights data while selectively adjusting the threshold data to preserve the logical relationships defined by the neural network. A symbolic representation generator applies a sum-of-products search upon each neuron in the network to generate a multivalued logic representation for each neuron. A propagation mechanism combines the multivalued logic representation of each neuron through network propagation to yield a final logical expression corresponding to a rule-based expert system of the trained neural network.
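The sign-elimination step can be illustrated with a minimal sketch. It assumes a threshold neuron with binary inputs that outputs 1 when the weighted sum reaches the threshold. The function names `make_weights_nonnegative` and `fires`, and the choice to complement each negatively weighted input, are illustrative assumptions rather than details given in this summary.

```python
from itertools import product

def make_weights_nonnegative(weights, threshold):
    """Return (new_weights, new_threshold, flipped) for an equivalent neuron.

    Assumption: binary inputs, output 1 when sum(w_i * x_i) >= threshold.
    A negative weight w_i on x_i is replaced by |w_i| on the complemented
    input (1 - x_i); the threshold is raised by |w_i| to compensate.
    """
    new_weights, flipped = [], []
    new_threshold = threshold
    for w in weights:
        if w < 0:
            new_weights.append(-w)
            new_threshold += -w
            flipped.append(True)
        else:
            new_weights.append(w)
            flipped.append(False)
    return new_weights, new_threshold, flipped

def fires(weights, threshold, inputs):
    """Evaluate the assumed threshold neuron on one input vector."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

# Hypothetical neuron with mixed-sign weights.
weights, threshold = [2, -3, 1], 1
new_w, new_t, flipped = make_weights_nonnegative(weights, threshold)

# The transformed neuron, fed complemented inputs where flagged,
# agrees with the original on every binary input vector.
for x in product([0, 1], repeat=3):
    x_adj = [1 - xi if f else xi for xi, f in zip(x, flipped)]
    assert fires(weights, threshold, x) == fires(new_w, new_t, x_adj)
```

With all weights non-negative, a single search over one kind of weight suffices, which is presumably why the transformation precedes the sum-of-products search.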
The present invention utilizes the following expressions in order to accomplish the aforementioned operations. If $x_i$ is a multivalued variable which takes any value in the set $P_i = \{0, 1, \ldots, p_i - 1\}$, then for any subset $S_i \subseteq P_i$, $x_i^{S_i}$ is a literal of $x_i$ representing the function such that: $$x_i^{S_i} = \begin{cases} 1 & \text{if } x_i \in S_i \\ 0 & \text{otherwise} \end{cases}$$
When $S_i \equiv P_i$, the value of the literal is always 1, and when $S_i \equiv \emptyset$, the value of the literal is always 0. The complement of the literal, denoted as $\overline{x}_i^{S_i}$, is defined as $$\overline{x}_i^{S_i} = x_i^{P_i - S_i}$$
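The literal and its complement defined above can be sketched in a few lines of Python. The function names `literal` and `complement_literal` and the example sets are hypothetical, chosen only for illustration.

```python
def literal(value, subset):
    """Literal of a multivalued variable: 1 if value lies in subset, else 0."""
    return 1 if value in subset else 0

def complement_literal(value, subset, domain):
    """Complement of a literal: the literal over the set difference domain - subset."""
    return literal(value, set(domain) - set(subset))

# Hypothetical three-valued variable with P = {0, 1, 2} and S = {1, 2}.
P = {0, 1, 2}
S = {1, 2}
for x in sorted(P):
    print(x, literal(x, S), complement_literal(x, S, P))
```

For every value in the domain, the complement evaluates to one minus the literal, and the literal over the full domain or over the empty set is constant, as the text states.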
Additionally, a product of literals $x_1^{S_1} x_2^{S_2} \cdots x_n^{S_n}$ is termed within the present invention a product. A multiple-valued input, binary-valued output function, or simply a Boolean function, is a mapping according to the following expression: $$f : P_1 \times P_2 \times \cdots \times P_n \rightarrow B$$
where $P_i = \{0, 1, \ldots, p_i - 1\}$ and $B = \{0, 1\}$. A Boolean function expressed as a sum of products is said to be in sum-of-products (SOP) form, also called disjunctive normal form (DNF).
The following expression for $f$ is an example of a DNF representation with three variables $x_1$, $x_2$, and $x_3$, with $P_1 = \{0, 1\}$, $P_2 = \{0, 1, 2\}$, and $P_3 = \{0, 1, 2\}$. $$f = x_1^{\{0\}} x_2^{\{1,2\}} \vee x_1^{\{1\}} x_2^{\{0\}} x_3^{\{0,1\}}$$
Equivalently, $f$ can be written using the complement of the second literal in the second product term. $$f = x_1^{\{0\}} x_2^{\{1,2\}} \vee x_1^{\{1\}} \overline{x}_2^{\{1,2\}} x_3^{\{0,1\}}$$
The truth table for function .function. is shown in Table 1.
TABLE 1
$x_1$   0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
$x_2$   0 0 0 1 1 1 2 2 2 0 0 0 1 1 1 2 2 2
$x_3$   0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
$f$     0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
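As a check on Table 1, the short sketch below evaluates the example function over all 18 input combinations. The helper `literal` and the variable names are illustrative assumptions; only the two product terms come from the text.

```python
from itertools import product

def literal(value, subset):
    """Literal of a multivalued variable: 1 if value lies in subset, else 0."""
    return 1 if value in subset else 0

def f(x1, x2, x3):
    """Example DNF: (x1 in {0} and x2 in {1,2}) or (x1 in {1} and x2 in {0} and x3 in {0,1})."""
    term1 = literal(x1, {0}) & literal(x2, {1, 2})
    term2 = literal(x1, {1}) & literal(x2, {0}) & literal(x3, {0, 1})
    return term1 | term2

# Enumerate inputs in the column order of Table 1: x1 slowest, x3 fastest.
rows = list(product(range(2), range(3), range(3)))
values = [f(*row) for row in rows]
print(" ".join(str(v) for v in values))
```

The printed row can be compared directly with the last row of Table 1.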
A multiple-valued input, multiple-valued output function is a mapping $$f : P_1 \times P_2 \times \cdots \times P_n \rightarrow R$$
where $P_i = \{0, 1, \ldots, p_i - 1\}$ and $R = \{0, 1, \ldots, r - 1\}$. A multiple-valued input, multiple-valued output function is expressed through $r$ multiple-valued input, binary output functions, one function for each output state.
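The decomposition into one binary-output function per output state can be sketched as follows. The example function `g`, its domains, and the helper `decompose` are hypothetical and serve only to illustrate the idea.

```python
from itertools import product

def decompose(g, domains, r):
    """Return r truth tables; the k-th maps each input tuple to 1 iff g(input) == k."""
    inputs = list(product(*domains))
    return [{x: int(g(*x) == k) for x in inputs} for k in range(r)]

def g(x1, x2):
    """Hypothetical three-valued function of two three-valued inputs."""
    return (x1 + x2) % 3

domains = [range(3), range(3)]
indicators = decompose(g, domains, 3)

# Exactly one of the r binary functions is 1 for any given input.
for x in product(*domains):
    assert sum(ind[x] for ind in indicators) == 1
```

Each of the resulting binary-output functions can then be written in sum-of-products form in the same way as a Boolean function.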
Multiple-valued logic provides a compact notation even for applications that essentially deal with binary logic. This feature of multivalued logic is used in designing programmable logic arrays with decoders. For example, consider the following function of three binary variables $x_1$, $x_2$, and $x_3$: $$f = x_1 x_2 \vee x_2 x_3 \vee x_3 x_1$$
In this example, a multivalued variable $y$ is defined which takes any value in the set $\{0, 1, 2, 3\}$ in such a way that the value $r$ means that exactly $r$ of the variables $x_1$, $x_2$, and $x_3$ are true. The above function $f$ can then be expressed in MVL as: $$f = y^{\{2,3\}}$$
which is a compact representation in comparison with the binary logic representation. Furthermore, this representation brings out the condition that $f$ is true if two or three of the inputs are true more clearly than the binary logic representation.
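The equivalence of the two representations can be confirmed by exhaustive enumeration. In this sketch the function names `f_binary` and `f_mvl` are illustrative assumptions; the expressions themselves are taken from the example above.

```python
from itertools import product

def f_binary(x1, x2, x3):
    """Binary sum-of-products form: x1 x2 OR x2 x3 OR x3 x1."""
    return (x1 & x2) | (x2 & x3) | (x3 & x1)

def f_mvl(x1, x2, x3):
    """MVL form: y counts the true inputs; f is the literal of y over {2, 3}."""
    y = x1 + x2 + x3
    return 1 if y in {2, 3} else 0

# Compare both forms on all eight binary input combinations.
agree = all(f_binary(*x) == f_mvl(*x) for x in product([0, 1], repeat=3))
print(agree)
```

Three two-literal products collapse into a single literal once the inputs are grouped into one four-valued variable.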
For a more complete understanding of the present invention, its objects and advantages, reference may be had to the following specification and to the accompanying drawings.