The invention described herein may be manufactured and used by or for the Government of the United States for all governmental purposes without the payment of any royalty.
The field of the invention is neural networks and more particularly multi-layer artificial neural networks.
Generally, a neural network consists of interconnected units, called nodes, with weighting factors applied to each interconnection. Each node receives a series of inputs, an input being a single sample of data from a larger data sample, and each input to a node is weighted. The weighted inputs are summed, a bias term is added, and a transformation, called an activation function, is applied to the weighted sum in order to obtain a representative value of a minimized dimension. The activation function applied will typically be a squashing function, such as a sigmoid or the hyperbolic tangent, or can be a linear function. Intermediate layers of nodes are implemented to allow non-linear representations/mappings of the data. Often, the weights are determined by an optimization procedure called backpropagation, which is applied during the training of the artificial neural network. Training the artificial neural network causes the network to "learn" how to manipulate the data and is needed in order that a selected set of data may be similarly manipulated to produce an output. Backpropagation is a systematic method for training multi-layer artificial neural networks. The backpropagation algorithm, a deterministic training method, employs a type of gradient descent; that is, it follows the slope of the error surface downward, constantly adjusting the weights toward a minimum error. The error surface can be a highly convoluted surface, with many hills and valleys, and a sequence of weights can get trapped in a local minimum, a shallow valley, when there is a much deeper minimum nearby.
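By way of illustration only, the computation performed at a single node, as described above, may be sketched as follows. The function names node_output and sigmoid and the numerical values of the inputs, weights and bias are assumptions chosen for the example and do not appear in the original description.

```python
import numpy as np

def sigmoid(z):
    """Logistic squashing function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def node_output(inputs, weights, bias, activation=np.tanh):
    """Sum the weighted inputs, add the bias term, then apply the activation."""
    weighted_sum = np.dot(weights, inputs) + bias
    return activation(weighted_sum)

# Hypothetical values for one node with three inputs.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
b = 0.05

y_tanh = node_output(x, w, b)                      # hyperbolic tangent activation
y_sig = node_output(x, w, b, activation=sigmoid)   # sigmoid activation
```

Here the weighted sum plus bias is approximately -0.7, so the hyperbolic tangent output lies in (-1, 0) and the sigmoid output lies in (0, 0.5).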
The most common error function used in artificial neural networks is the global measure of network energy called mean squared error. Mean squared error is defined as the mean of the squared differences between the actual network outputs and the desired outputs for the network, and can be expressed as a function of a set of network parameters. The network parameters, or variables, are the network weights.
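A minimal sketch of the mean squared error measure described above is given below. The function name mean_squared_error and the sample output values are assumptions made for the example.

```python
import numpy as np

def mean_squared_error(actual, desired):
    """Mean of the squared differences between actual and desired outputs."""
    actual = np.asarray(actual, dtype=float)
    desired = np.asarray(desired, dtype=float)
    return np.mean((actual - desired) ** 2)

# Hypothetical network outputs and the corresponding desired outputs.
mse = mean_squared_error([0.9, 0.2, 0.4], [1.0, 0.0, 0.5])
```

For these values the squared differences are 0.01, 0.04 and 0.01, giving a mean of approximately 0.02.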
Neural networks, as compared to more conventional methods of function approximation, exploit to advantage an adaptive set of basis functions to approximate a mapping between variables. This mapping can be autoassociative, a variable mapping to itself, or it can be predictive or heteroassociative, a variable mapped to a different variable. Autoassociative neural networks are in general an intermediate step within a larger data communication system.
Neural networks are limited by the "robustness" of the data set over which the mappings are accomplished. The robustness can be best described in a statistical context, wherein a sampling statistic is robust if the sampling distribution is representative of the population distribution. In the case of neural networks and other function approximation methods, the prediction of an output based on a new input is termed "robustness" or "generalization." Robustness or generalization is based upon the number of points which are close to, and which hopefully encompass, the point to be predicted based upon the mapping. If there are points which are close and which encompass the point to be predicted, then robustness is regarded, in a qualitative sense, to be high.
Autoassociative neural networks (AANNs) are feedforward artificial neural networks generated from multi-layer perceptrons. An AANN is trained with the input and target data identical. In other words, an AANN maps the training data back to itself, through a specific network architecture. If the number of nodes in the AANN is smaller than the number of features in each of the training data samples, then the AANN accomplishes data reduction. A typical prior art AANN software architecture is shown in FIG. 1. In FIG. 1, a three hidden layer architecture is used for data compression, an input hidden layer at 100 and an output hidden layer at 101. The compression of layers is represented at 102. The FIG. 1 arrangement shows a three hidden layer architecture but any number of hidden layers may be used.
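By way of illustration only, an autoassociative mapping of the kind described above may be sketched as follows. This is not the architecture of FIG. 1: it assumes a single compressed hidden layer of two nodes, a linear output layer, synthetic four-feature data, and plain gradient descent on the mean squared error. All variable names, layer sizes and the learning rate are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data (assumption): 200 samples of 4 features that
# depend on only 2 underlying factors, so a 2-node layer can compress them.
latent = rng.uniform(-1.0, 1.0, size=(200, 2))
mixing = rng.normal(size=(2, 4))
X = 0.5 * latent @ mixing

n_in, n_hidden = X.shape[1], 2
W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b2 = np.zeros(n_in)

def forward(data):
    """Return the compressed hidden activations and the reconstruction."""
    hidden = np.tanh(data @ W1 + b1)
    return hidden, hidden @ W2 + b2

def mse(a, b):
    return np.mean((a - b) ** 2)

initial_error = mse(forward(X)[1], X)

lr = 0.1
for _ in range(2000):
    H, Y = forward(X)
    dY = 2.0 * (Y - X) / X.size          # gradient of MSE; target equals input
    dW2, db2 = H.T @ dY, dY.sum(axis=0)
    dZ = (dY @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
    dW1, db1 = X.T @ dZ, dZ.sum(axis=0)
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

final_error = mse(forward(X)[1], X)
codes = forward(X)[0]                    # 2-value reduced representation per sample
```

Because the input and target data are identical, the only quantity minimized is the reconstruction error, and the hidden activations in codes serve as the reduced representation of each four-feature sample.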
AANNs have been extended from processing one data set to processing two data sets, simultaneously. These are referred to as heteroassociative neural networks (HANNs) and map one set of data on the input, to a target data set that is different than the input. The HANN can effectively generate a second data set from the first set of data. However, the relationship between the two data sets is not observable and the robustness of the network is not quantifiable. The network incorporates all the interrelationships and the correlation of the features within the network and is not directly interpretable because the target data is different than the input data and there is no way to determine the accuracy of the target data. FIG. 2 is a typical prior art HANN software architecture, and is the same architecture as the AANN in FIG. 1. The HANN, as shown in FIG. 2 varies from the AANN only in that the output is different than the input. FIG. 2 shows the input at 200 with compressed hidden layers at 201 and the output at 202.
Another type of HANN is a network with both data sets as input and output. This type of HANN, a joint data HANN, provides for a mutual mapping of the two data sets. This mutual mapping provides the interrelationships that exist between the two different data sets. The joint data HANN is a mutual mapping function and therefore cannot provide a precise prediction capability. Both of the data sets must be input, to generate an accurate prediction on one or the other. FIG. 3 shows a prior art software architecture for this type of HANN. Any number of hidden layers can be applied, based on the problem application.
Conventional neural network methods currently used do not provide an ability to synthesize data from a data set in a comprehensive and verifiable manner, that is, in a manner in which the robustness of the network is verifiable. The present invention solves problems in the art of autoassociative networks, which only map training data back to itself, and of heteroassociative networks, which can effectively generate a data set from a first set of data but cannot quantify the network's robustness and in which the relationship between the two data sets is not observable.
The present invention may be used as a neurocomputing tool in, for example, process discovery as applied to thin film growth and new materials design. This invention provides a means to simulate a data set, given another data set. The predictor portion of this neural network provides an ability to simulate new material processes; this simulation provides the means to map the materials processes into a design space. In this design space, feedback and refinement to the process is possible, as is the creation of a 'self-improving' design environment. The accuracy of this neural network is deterministic due to the incorporation of the autoassociative portion; it is this autoassociative portion that provides a statistical means to validate the performance of the Autoassociative-Heteroassociative Neural Network of the invention.
The present invention provides an efficient neurocomputing technique to analyze and predict or synthesize one data set from another, with a method to assess the generalization or robustness of the neural network operation. The method and device of the invention involve integration of autoassociative and heteroassociative neural network mappings, the autoassociative neural network mapping enabling a quality metric for assessing the generalization or prediction accuracy of a heteroassociative neural network mapping.
It is therefore an object of the invention to provide a method for efficient neurocomputing to analyze and predict one data set from another with quantifiable accuracy.
It is another object of the invention to provide a method for assessing general robustness for nonlinear mappings in neural networks.
It is another object of the invention to provide an integration of autoassociative and heteroassociative neural network mappings.
These and other objects of the invention are described in the description, claims and accompanying drawings and are achieved by a robustness quantifiable neural network capable of synthesizing two sets of output signal data from a single input signal data set, said network comprising:
an encoding subnetwork comprising:
a plurality of input signal receiving layers and nodes communicating said input signals to a projection space of said neural network, said input signals being from a source external to said neural network, a plurality of encoding nodes within said neural network input signal receiving layers forming one representative input signal;
a decoding subnetwork connected to said projection space comprising:
a plurality of output signal transmitting layers communicating output signal data from said projection space of said neural network to an output;
a plurality of decoding nodes within said output signal transmitting layers jointly transforming said input signal data set to a first predicted data set and a second data set replicating said input signal data set;
a mean square error backpropagation neural network training algorithm;
a source of training data connected to said encoding subnetwork and applied to said mean square error backpropagation neural network training algorithm as a single set on said encoding subnetwork and generating two data sets from said decoding network; and
an input signal data set and said second data set from said decoding subnetwork comparator block, said comparator block comparing accuracy of replication of said second data set from said decoding subnetwork to said input signal data set indicating robustness of said neural network.
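By way of illustration only, the arrangement recited above may be sketched as a single network that receives one input data set and is trained by mean square error backpropagation to produce two output data sets jointly: a predicted data set and a replica of the input. The sketch is not the claimed implementation. The synthetic data, the single tanh projection layer of six nodes, the learning rate, and the names forward and comparator are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: input set X (3 features) and a second set Y (2 features).
X = rng.uniform(-1.0, 1.0, size=(300, 3))
Y = np.column_stack([np.sin(X[:, 0]) + 0.5 * X[:, 1], X[:, 1] * X[:, 2]])
T = np.hstack([Y, X])        # joint target: [predicted set, replica of input]
n_y = Y.shape[1]

n_in, n_proj, n_out = X.shape[1], 6, T.shape[1]
W_enc = rng.normal(scale=0.5, size=(n_in, n_proj))
b_enc = np.zeros(n_proj)
W_dec = rng.normal(scale=0.5, size=(n_proj, n_out))
b_dec = np.zeros(n_out)

def forward(data):
    """Encode into the projection space, then decode both output sets."""
    proj = np.tanh(data @ W_enc + b_enc)
    return proj, proj @ W_dec + b_dec

def comparator(x_in, x_rep):
    """Mean squared replication error between the input and its replica."""
    return np.mean((x_in - x_rep) ** 2)

error_before = comparator(X, forward(X)[1][:, n_y:])

lr = 0.2
for _ in range(3000):
    P, O = forward(X)
    dO = 2.0 * (O - T) / T.size              # gradient of joint MSE
    dW_dec, db_dec = P.T @ dO, dO.sum(axis=0)
    dZ = (dO @ W_dec.T) * (1.0 - P ** 2)     # backpropagate through tanh
    dW_enc, db_enc = X.T @ dZ, dZ.sum(axis=0)
    W_enc -= lr * dW_enc
    b_enc -= lr * db_enc
    W_dec -= lr * dW_dec
    b_dec -= lr * db_dec

_, O = forward(X)
Y_pred, X_rep = O[:, :n_y], O[:, n_y:]       # first and second output data sets
replication_error = comparator(X, X_rep)
prediction_error = np.mean((Y - Y_pred) ** 2)
```

In this sketch the comparator value is the observable quantity: the replica can always be checked against the known input, including for new inputs where no desired value of the predicted data set is available, which is the role the description assigns to the autoassociative portion.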