1. Technical Field
The present invention relates to computer-implemented artificial neural networks and, more particularly, to computer-implemented approaches for nonlinear modeling and for constructing artificial neural networks.
2. Description of the Related Art
Neural networks are predictive models that are generally used to model nonlinear processes. Most current approaches to neural network modeling begin with a large set of input variables and a large set of trials. The traditional approach is therefore confronted with the problem of parameter overdetermination, and can end up searching a space with too many dimensions. Furthermore, the variables of the input data can be highly collinear and can generate numerical estimation problems, because the resulting calculations yield underdetermined approximations and rank deficient Hessian matrices describing the search directions during the optimization process. These search directions are used to optimize the performance index of the neural network. A rank deficient Hessian matrix corresponding to these search directions generally defines a state space where an objective function (or any other type of performance index) does not appreciably change with small, discrete changes to the weights and biases of the neural network. Because the objective function remains essentially constant within this long, flat state space, the training cycle can end prematurely at a local optimum point. Furthermore, because these points are only local optimum points, the result of training the neural network may be sensitive to the starting point.
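By way of illustration only, the effect of collinear input variables on the Hessian matrix can be sketched with a small numerical example. The sketch below is not taken from any particular neural network implementation. It assumes a linear least-squares model as a stand-in for a network, for which the Hessian is simply X^T X, and the names least_squares_hessian, numerical_rank, and loss are hypothetical helpers introduced here. Adding an input column that is an exact linear combination of two others leaves the rank of the Hessian unchanged, and the objective function does not change along the corresponding direction in weight space.

```python
import numpy as np

def least_squares_hessian(X):
    """Hessian of 0.5 * ||X w - y||^2 with respect to w (assumed stand-in model)."""
    return X.T @ X

def numerical_rank(H, rel_tol=1e-9):
    """Count eigenvalues of a symmetric matrix above a relative threshold."""
    eigenvalues = np.linalg.eigvalsh(H)
    return int(np.sum(eigenvalues > rel_tol * eigenvalues.max()))

def loss(w, X, y):
    """Sum-of-squares objective function for the stand-in linear model."""
    return 0.5 * np.sum((X @ w - y) ** 2)

rng = np.random.default_rng(0)
n_trials = 200
x1 = rng.normal(size=n_trials)
x2 = rng.normal(size=n_trials)
y = x1 + 0.5 * x2

# Two independent input variables: the Hessian has full rank.
X_independent = np.column_stack([x1, x2])
rank_independent = numerical_rank(least_squares_hessian(X_independent))

# A third input variable that is an exact linear combination of the first two.
X_collinear = np.column_stack([x1, x2, 2.0 * x1 - x2])
rank_collinear = numerical_rank(least_squares_hessian(X_collinear))

# Moving the weights along this direction leaves X @ w unchanged,
# so the objective function is flat along it.
flat_direction = np.array([2.0, -1.0, -1.0])
w0 = np.zeros(3)
loss_start = loss(w0, X_collinear, y)
loss_moved = loss(w0 + 5.0 * flat_direction, X_collinear, y)

print("rank with independent inputs:", rank_independent, "of", X_independent.shape[1])
print("rank with collinear inputs:  ", rank_collinear, "of", X_collinear.shape[1])
print("objective unchanged along flat direction:", np.isclose(loss_start, loss_moved))
```

In this sketch the three-variable Hessian has only two significant eigenvalues, so an optimizer receives no curvature information along the flat direction, and where it stops along that direction depends on where it started.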
Large trial sets and large input sets also increase the required training time for a neural network. The calculation time for the neural network depends on the number of iterations, the input data size, and whether the Hessian matrix is of full rank. Because the input size is a function of the number of trials and the number of input variables, training becomes a tradeoff between introducing more input variables and trials and the time spent on training. Since each iteration takes at least one run through the entire data set, the computer time needed for solving the estimation problem depends upon where the data set is stored: in core memory (RAM) or on file (hard drive). For large data sets, the traditional neural network algorithms are forced to keep the data on file, which means slow read access during each run through the data. Furthermore, neural networks are generally not tested across different network structures and different activation functions, because changing the structure or the activation functions generally requires retraining the entire neural network. The large input size makes testing such alternatives time consuming.
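As a rough illustration of how the amount of data read grows with the number of iterations, trials, and input variables, the following sketch counts the values read during full-batch training. It is a minimal sketch under stated assumptions: a linear model trained by plain gradient descent stands in for a neural network, and the function train_full_batch and its parameters are hypothetical names introduced here, not part of any existing algorithm described above.

```python
import numpy as np

def train_full_batch(X, y, n_iterations, learning_rate=0.01):
    """Full-batch gradient descent on an assumed linear stand-in model.

    Returns the fitted weights and the number of input values read.
    """
    n_trials, n_inputs = X.shape
    w = np.zeros(n_inputs)
    values_read = 0
    for _ in range(n_iterations):
        # Each iteration makes one run through every trial in the data set.
        residual = X @ w - y
        values_read += n_trials * n_inputs
        w -= learning_rate * (X.T @ residual) / n_trials
    return w, values_read

rng = np.random.default_rng(1)
X_small = rng.normal(size=(1_000, 10))
X_large = rng.normal(size=(2_000, 20))  # twice the trials, twice the input variables
y_small = X_small @ np.ones(10)
y_large = X_large @ np.ones(20)

_, reads_small = train_full_batch(X_small, y_small, n_iterations=50)
_, reads_large = train_full_batch(X_large, y_large, n_iterations=50)

print("values read, small data set:", reads_small)
print("values read, large data set:", reads_large)
```

Doubling both the number of trials and the number of input variables quadruples the values read per iteration in this sketch. When the data set resides on file rather than in core memory, each of those reads is correspondingly slower.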