1. Field of the Invention
The present invention relates to a trained neural network analysis method and to an apparatus for executing the analysis method.
2. Description of Related Art
(1) Neural Networks
A neural network is an artificial neural circuit network that uses a computer to emulate the operation of the neural circuit network of a human. In a neural network, there are one or more intermediate layers between a data input part and a data output part, each of these layers being made up of a plurality of units, with network-like connections being made between the input/output parts and the intermediate layers by means of the input/output systems. Because this neural network has non-linear components, it is capable of performing extremely complex approximations with respect to a variety of data types. Because of this, neural networks are currently used in many industries, including manufacturing and service industries. In these applications, various types of data are input to the neural network for training, and the trained network is used for such purposes as character recognition, image recognition, and prediction.
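By way of illustration only, the layered structure described above can be sketched as follows. The sketch assumes sigmoid units and a single intermediate layer; the weighting coefficients and biases are hypothetical values set by hand, not values from any trained network, and the names `layer_forward` and `network_forward` are introduced here for the example.

```python
import math

def sigmoid(x):
    """Non-linear component applied by each unit."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer; weights[k] holds the coefficients of unit k."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Pass the input data through each layer in turn."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical hand-set parameters (assumption): two intermediate units that
# behave roughly like OR and AND, and one output unit computing "OR and not AND".
HIDDEN_LAYER = ([[6.0, 6.0], [6.0, 6.0]], [-3.0, -9.0])
OUTPUT_LAYER = ([[8.0, -8.0]], [-4.0])
LAYERS = [HIDDEN_LAYER, OUTPUT_LAYER]

inputs_list = [(0, 0), (0, 1), (1, 0), (1, 1)]
rounded_outputs = [round(network_forward(p, LAYERS)[0]) for p in inputs_list]
```

With these assumed parameters the network reproduces exclusive-or, a mapping that no single unit of this kind can represent, which illustrates why the intermediate layer and the non-linear components matter.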
When a neural network is trained, weighting coefficients and biases are initially assigned at random for each of the units that accept data. As data is input under these conditions, judgments are made with regard to the correctness of the output calculated according to these weighting coefficients and biases. Whether or not the output results are correct is fed back using a learning method such as back-propagation, the originally set weighting coefficients and biases are corrected, and the data is input again. By repeating this process of input and correction of weighting coefficients and biases a large number of times, the weighting coefficients and biases that will obtain an appropriate output for a prescribed data input are established. In this case, each unit of the trained neural network is expressed by a function, such as a sigmoid function, of its weighting coefficients and bias. By installing such a trained neural network into a character recognition, image processing or other system that is implemented by a computer, the neural network can be put into practical use.
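The training cycle described above (random initial values, input, comparison of the output with the correct answer, correction, and repetition) can be sketched for a single sigmoid unit as follows. This is a minimal sketch under stated assumptions: the correction rule is the gradient of a cross-entropy error, to which back-propagation reduces for one unit, and the learning rate, number of repetitions, random seed and training data (logical AND) are all hypothetical choices made for the example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_unit(samples, epochs=2000, lr=0.5, seed=0):
    """Train one sigmoid unit; returns its weighting coefficients and bias."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    # Weighting coefficients and bias start from random values.
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    bias = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        for inputs, target in samples:
            output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = output - target  # feedback on the correctness of the output
            # Correct the coefficients and bias in the direction that reduces the error.
            weights = [w - lr * error * x for w, x in zip(weights, inputs)]
            bias -= lr * error
    return weights, bias

# Hypothetical learning data (assumption): the logical AND of two inputs.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_unit(AND_DATA)
predictions = [
    round(sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias))
    for inputs, _ in AND_DATA
]
```

After training, the unit is fully described by `weights` and `bias`, which is exactly the form that is hard for a human to interpret directly.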
(2) Problems Associated With Trained Neural Networks
In designing a neural network such as noted above, the arrangement of the units making up the neural network, that is, the units of the input layer, the intermediate layer and the output layer, is decided empirically, based on the input data and the output data that are to be used. For this reason, whether or not any unit is redundant, and whether or not there are too few units, must be predicted empirically from the accuracy of the learning results and the learning efficiency. Consequently, even if there was a redundancy, because it was not possible to judge whether or not there was one, the network had to be installed in the system as it was, as long as the learning efficiency was high, thereby leading to an increase in the required computer memory capacity and to a decrease in processing speed. In the case of a neural network with poor learning efficiency, there was a need to re-design the overall neural network according to empirical rules, thereby preventing the use of the learning results.
In the previous art, however, even if it was known that an output would be obtained in response to learning results by inputting prescribed data to a trained neural network, it was not possible by looking at just the sigmoid function after training to judge how the neural network itself was operating, and to judge what role each of the units was playing in the neural network. Essentially, to make the operation of the neural network understandable to a human, it is necessary to express the behavior of the individual units as propositions that are close to natural language. In the above-noted trained neural network, however, the sigmoid function which represents the individual units of the neural network is expressed in terms of weighting coefficients and bias values, and it is not possible to distinguish just what these mean in the neural network.
The present invention was made to solve the above-described drawbacks of the previous art, and has as an object the provision of a neural network analysis method and apparatus which approximate the multilinear function (such as a sigmoid function) that represents each of the units of a neural network with the nearest Boolean function, thereby analyzing the meaning of each unit of a trained neural network and expressing it as a proposition that can be understood by a human.
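As an illustration of approximating a unit with the nearest Boolean function, the following sketch assumes that "nearest" is measured by Euclidean distance over the vertices of the domain {0,1}^n. Under that assumption the nearest Boolean function takes the value 1 at exactly those vertices where the unit's output is 0.5 or more. The weighting coefficients and bias are hypothetical, and the function names are introduced here for the example only.

```python
import itertools
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def nearest_boolean_function(weights, bias):
    """Truth table of the Boolean function closest to the unit on {0,1}^n.

    Assumption: distance is Euclidean over the vertices, so each vertex is
    rounded independently at the threshold 0.5.
    """
    table = {}
    for vertex in itertools.product((0, 1), repeat=len(weights)):
        value = sigmoid(sum(w * x for w, x in zip(weights, vertex)) + bias)
        table[vertex] = 1 if value >= 0.5 else 0
    return table

def to_proposition(table):
    """Write the truth table as a logical sum of terms readable by a human."""
    terms = []
    for vertex, value in table.items():
        if value == 1:
            literals = [
                f"x{i + 1}" if bit else f"NOT x{i + 1}"
                for i, bit in enumerate(vertex)
            ]
            terms.append(" AND ".join(literals))
    return " OR ".join(terms) if terms else "FALSE"

# Hypothetical trained unit (assumption): coefficients 2.0 and 2.0, bias -3.0.
table = nearest_boolean_function([2.0, 2.0], -3.0)
proposition = to_proposition(table)
```

For this assumed unit the proposition obtained is `x1 AND x2`, which states the role of the unit far more plainly than the three numbers 2.0, 2.0 and -3.0 do.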
The present invention has as another object the provision of a neural network analysis method and apparatus which, when approximating the multilinear functions that represent the units of a neural network, perform the approximation over only the learning data domain, thereby not only obtaining high-accuracy Boolean functions but also reducing the amount of calculation and shortening the processing time.
To achieve the above-noted objects, the present invention is a neural network analysis apparatus comprising: input means for inputting a multilinear function that represents each hidden unit and an output unit of a trained neural network to be analyzed; a function-extracting apparatus, which approximates the multilinear function that represents each unit with a Boolean function, said function-extracting apparatus being provided for each hidden unit and output unit of said neural network; and a Boolean function-synthesizing apparatus, which synthesizes the Boolean functions obtained by each function-extracting apparatus, said function-extracting apparatus comprising a term generator that generates each term of said Boolean function, and a term-linking apparatus that links the terms generated by said term generator using a logical sum, and said term generator having a data-limiting apparatus that limits, to the domain corresponding to a term, the learning data used to judge whether said term exists in the Boolean function after approximation.
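The cooperation of the term generator, the data-limiting apparatus and the term-linking apparatus can be sketched as follows. This is one possible reading, not the claimed implementation: it assumes that a term is judged to exist when at least one learning data point lies in the term's domain and the unit outputs 0.5 or more at every such point. The unit parameters, the learning data and all function names are hypothetical.

```python
import itertools
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(weights, bias, point):
    return sigmoid(sum(w * x for w, x in zip(weights, point)) + bias)

def limit_data(term, data):
    """Data-limiting step: keep only learning points lying in the term's domain."""
    return [p for p in data if all(p[i] == v for i, v in term.items())]

def term_exists(term, weights, bias, data):
    """Assumed criterion: the unit is >= 0.5 on every learning point of the domain."""
    points = limit_data(term, data)
    return bool(points) and all(unit_output(weights, bias, p) >= 0.5 for p in points)

def generate_terms(weights, bias, data):
    """Term generator: try candidate terms from the lowest order upward."""
    n = len(weights)
    accepted = []
    for order in range(1, n + 1):
        for indices in itertools.combinations(range(n), order):
            for values in itertools.product((1, 0), repeat=order):
                term = dict(zip(indices, values))  # variable index -> required value
                # Skip a term already implied by an accepted lower-order term.
                if any(all(term.get(i) == v for i, v in t.items()) for t in accepted):
                    continue
                if term_exists(term, weights, bias, data):
                    accepted.append(term)
    return accepted

def link_terms(terms):
    """Term-linking step: join the accepted terms with a logical sum (OR)."""
    if not terms:
        return "FALSE"
    def literal(i, v):
        return f"x{i + 1}" if v else f"NOT x{i + 1}"
    return " OR ".join(
        "(" + " AND ".join(literal(i, v) for i, v in sorted(t.items())) + ")"
        for t in terms
    )

# Hypothetical unit and learning data (assumptions made for this example).
WEIGHTS, BIAS = [3.0, 3.0, -1.0], -1.5
LEARNING_DATA = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]
terms = generate_terms(WEIGHTS, BIAS, LEARNING_DATA)
formula = link_terms(terms)
```

In this sketch only the five learning data points are evaluated when judging each term, rather than all eight vertices of the domain; the saving grows quickly as the number of inputs increases.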
In this aspect of the present invention, because the units of a neural network are represented by Boolean functions, which are classical logic propositions abstracted from natural language, the resulting propositions are easy for a human to understand, thereby providing an understanding of which unit has learned which proposition or concept. Furthermore, because the Boolean functions used in the approximation are the closest in the learning data domain, predicted values are not included, resulting in highly accurate Boolean functions.
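How the Boolean functions of the individual units might be synthesized into a proposition for the whole network can be sketched as follows. The sketch assumes that synthesis means substituting the Boolean functions of the hidden units into the Boolean function of the output unit; the three Boolean functions used are hypothetical examples, not results extracted from an actual network.

```python
import itertools

def synthesize(hidden_functions, output_function, n_inputs):
    """Truth table of the network obtained by substituting the hidden units'
    Boolean functions into the output unit's Boolean function."""
    table = {}
    for point in itertools.product((0, 1), repeat=n_inputs):
        hidden_values = tuple(int(h(point)) for h in hidden_functions)
        table[point] = int(output_function(hidden_values))
    return table

# Hypothetical extracted propositions (assumptions for this example):
#   hidden unit 1: x1 OR x2
#   hidden unit 2: x1 AND x2
#   output unit:   h1 AND NOT h2
hidden_functions = [
    lambda x: x[0] or x[1],
    lambda x: x[0] and x[1],
]
output_function = lambda h: h[0] and not h[1]

network_table = synthesize(hidden_functions, output_function, 2)
```

Here each unit's proposition can be read on its own, and the synthesized table shows that the network as a whole has learned exclusive-or.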