Recent developments in computational technology and techniques have applied computers to accumulate and recognize patterns as well as to respond to random or quasi-random events.
Computers have been implemented for their pattern recognition capabilities to increase the efficiency of response oriented service environments such as inventory systems, telemarketing campaigns, financial management, and service sector operation; i.e. environments which respond to changing situations. Some systems are amenable to predictive techniques implementing Markov processes and queueing theory, in which a current state of the system determines a successive state of the system, independent of the history of the system. Thus, the history of the system is not employed and so is not retained for the purposes of predicting a successive state.
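The Markov property described above can be illustrated with a minimal sketch. The two states (`idle`, `busy`) and their transition probabilities are hypothetical values chosen only for illustration; they do not come from any system described here. The point is that each step consumes only the current state distribution and retains no history.

```python
# Hypothetical two-state model of a service desk. The state names and
# probabilities are illustrative assumptions, not values from the text.
TRANSITIONS = {
    "idle": {"idle": 0.7, "busy": 0.3},
    "busy": {"idle": 0.4, "busy": 0.6},
}


def step(distribution, transitions=TRANSITIONS):
    """Advance a probability distribution over states by one step.

    Only the current distribution is used; no earlier history is needed.
    """
    nxt = {state: 0.0 for state in transitions}
    for state, p in distribution.items():
        for target, q in transitions[state].items():
            nxt[target] += p * q
    return nxt


def predict(distribution, steps, transitions=TRANSITIONS):
    """Apply `step` repeatedly to predict the distribution `steps` ahead."""
    for _ in range(steps):
        distribution = step(distribution, transitions)
    return distribution
```

Calling `predict({"idle": 1.0, "busy": 0.0}, 2)` propagates the starting state two steps forward using nothing but the transition table.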
Other systems behave in a cyclical manner, i.e. regular cyclical patterns are observable in the history of the system, which may then be employed to predict subsequent states of the system.
Many cyclical systems are non-deterministic despite their regular cyclical trends; e.g. weather patterns and temperatures exhibit random variations. Such non-deterministic cyclical systems are therefore difficult to predict. Since service providers are affected in part by weather and temperature in a given region, a predictive system that approximates the response time of the service provider to customer service requests would increase the efficiency of service responses and thus improve the performance of the service providers.
Traditional prediction systems rely on explicitly stated rules which attempt to indirectly explain or describe the behavior of data. These rules are often implemented into the prediction system by a programmer and applied to input data to generate an output using these rules. However, data may have subtle and/or unknown relationships not adaptable to explicitly stated rules. In addition, since data is often noisy, distorted, or incomplete, explicitly stated rules may fail to correctly operate on patterns broadly similar to the data from which the rules were drawn. Also, some complex problems are non-linear, so they cannot be easily handled by mathematically simple rules.
The implementation of computational systems known as neural networks to non-linear and/or non-deterministic environments allows patterns in data from such environments to be recognized and successive states to be predicted without the above limitations of traditional prediction systems.
Neural networks do not rely on explicitly stated rules since such neural networks process input data and learn their own rules from the input data to generate accurate outputs. Therefore, neural networks are able to find unknown data relationships, to generalize to broad patterns with incomplete or noisy input data, and to handle complex non-linear problems. Many of the basic characteristics of neural networks known in the art are described in "Working with Neural Networks", D. Hammerstrom, IEEE SPECTRUM, July 1993, pp. 46-53, which is incorporated herein by reference in its entirety.
In general, a neural network comprises a set of processing elements (PE) or nodes which are modelled to perform as a neuron behaves in a brain. As shown in FIG. 1, a neuron 2 comprises a soma 4 having a plurality of dendrites 6 as inputs. An axon 8 extends from an axon hillock 10 and branches to form a plurality of parallel outputs operatively coupled by synaptic junctions 14 to the dendrites of other neurons. Once the input electrical signals, conveyed by ionic concentrations through the dendrites 6 of neuron 2, attain a threshold level, the soma 4 fires to output an electrical signal over its axon 8. The nodes or processing elements of the neural network similarly output a signal once the sum of their inputs attains a threshold value. Hence, the term `neural network` is applied to such processing elements.
As illustrated in FIG. 2, an artificial neuron-like node 16, artificial neuron, or processing element has at least one input 18 and at least one output 20. The output is determined from the inputs by weighting each input by multiplying the corresponding input with weight 22, using an adder 24 to sum the weighted inputs with a bias 26 or threshold input, and generating at least one output from a transfer function 28 of the weighted sum. The weights 22 may be dynamically altered, as described below. A non-linear transfer function 28 may be used to smooth the raw sums within fixed limits. A popular transfer function is the sigmoid function
$$y = \left(1 + e^{-Q(x)}\right)^{-1}$$
shown in FIG. 3, where Q is a function of x. Other functions such as the hyperbolic tangent function, scaled and translated as shown in FIG. 4, may be used as transfer functions.
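The processing element of FIG. 2 can be sketched in a few lines. This is a minimal illustration that assumes Q(x) = x in the sigmoid; the function names `sigmoid` and `neuron_output` are hypothetical and not taken from the source.

```python
import math


def sigmoid(x):
    """Sigmoid transfer function, assuming Q(x) = x."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, transfer=sigmoid):
    """Weight each input, add the bias, and apply the transfer function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(weighted_sum)
```

Passing `transfer=math.tanh` swaps in the (unscaled) hyperbolic tangent; the scaling and translation of FIG. 4 are not reproduced here.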
The nodes may be interconnected in a basic neural configuration as shown in FIG. 5, having a set of input nodes called an input layer 30, a set of output nodes called an output layer 32, and a set of intermediate nodes called a hidden layer 34 connecting the input layer 30 to the output layer 32. The input nodes may be passive, i.e. they pass input data unchanged through to the hidden layer 34, while the nodes of the hidden layer 34 and of the output layer 32 are generally active in modifying and processing data.
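A forward pass through the three-layer arrangement of FIG. 5 might look like the following sketch. It assumes fully connected layers, sigmoid transfer functions on every active node, and a passive input layer; the function names and the list-of-rows weight layout are illustrative assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one active layer.

    `weights` holds one row per node; each row has one weight per input.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    passed = list(inputs)  # passive input nodes pass data through unchanged
    hidden = layer_forward(passed, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)
```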
Lacking explicitly implemented rules, the nodes of the neural network are assigned a predetermined bias and initial weights. The neural network is then reconfigured by adjusting the weights to each node by training the neural network. Training a neural network by supervised learning involves inputting a known set of inputs, processing the inputs through the hidden layer, obtaining the resulting set of outputs from the neural network, comparing the resulting outputs with a known set of outputs corresponding to the known set of inputs, adjusting the weights of each node based on a comparison of the resulting and known outputs, and repeating the training until the neural network obtains weights which would generate the known outputs within a required degree of error. The neural network thereby learns to generate the known outputs from the known inputs, and then may be used for generating outputs from unknown inputs in use in the field. In this manner, neural networks are adaptive since they are reconfigured during training and during actual use to learn new rules or to find new patterns in new data.
One of the more popular configurations of neural networks (NN) is the back propagation (BP) model 36 shown in the block diagram in FIG. 6. In some BP neural networks, the outputs of the basic neural network 38 are connected to a root mean squared (RMS) error generator 40 which calculates the root mean squared error from the respective neural network's outputs. The root mean squared error is then fed back to the weights 22 of each node, where a weighted fraction of the fed back root mean squared error is determined to find the indirect contribution of each node to the root mean squared error. The diagonal arrow 42 in FIG. 6 symbolizes that the error signal is fed back to each weight 22 of each node throughout the neural network 38. The weighted fraction of errors is used to adjust the weights 22 of each node, and subsequent reconfiguration of the weights 22 during the training period minimizes the root mean squared error.
As shown in FIG. 7, in an idealized depiction of a surface 44 of error values, the weights are adjusted by gradient descent to reduce the error toward a minimum 46. As depicted in FIG. 8, the training is repeated for many iterations until the error is reduced below a predetermined error tolerance limit 48. It is common for hundreds or even thousands of iterations over the known input and output data to be performed before the neural network functions within the tolerance limit 48 and is considered adequately trained.
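The training procedure can be sketched as a small back propagation loop. This is an illustrative sketch under stated assumptions, not the configuration of FIG. 6: it uses one hidden layer, a single sigmoid output, per-sample gradient descent on the squared error, and the RMS error over all samples only as the stopping criterion. The names `train`, `rate`, `tolerance`, and `max_epochs`, and their default values, are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_h, b_h, w_o, b_o):
    """Return the hidden activations and the single network output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_h, b_h)]
    out = sigmoid(sum(w * h for w, h in zip(w_o, hidden)) + b_o)
    return hidden, out


def rms_error(samples, w_h, b_h, w_o, b_o):
    """Root mean squared error over all (inputs, target) samples."""
    total = sum((t - forward(x, w_h, b_h, w_o, b_o)[1]) ** 2
                for x, t in samples)
    return math.sqrt(total / len(samples))


def train(samples, n_hidden=2, rate=0.5, tolerance=0.05,
          max_epochs=5000, seed=0):
    """Adjust weights by gradient descent until the RMS error < tolerance."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
           for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    history = [rms_error(samples, w_h, b_h, w_o, b_o)]
    for _ in range(max_epochs):
        if history[-1] < tolerance:
            break
        for x, t in samples:
            hidden, out = forward(x, w_h, b_h, w_o, b_o)
            # Output error scaled by the sigmoid derivative out * (1 - out).
            delta_o = (t - out) * out * (1.0 - out)
            # Each hidden node's share of the error, weighted by the
            # connection that carries its output to the output node.
            delta_h = [delta_o * w_o[j] * hidden[j] * (1.0 - hidden[j])
                       for j in range(n_hidden)]
            for j in range(n_hidden):
                w_o[j] += rate * delta_o * hidden[j]
                for i in range(n_in):
                    w_h[j][i] += rate * delta_h[j] * x[i]
                b_h[j] += rate * delta_h[j]
            b_o += rate * delta_o
        history.append(rms_error(samples, w_h, b_h, w_o, b_o))
    return (w_h, b_h, w_o, b_o), history
```

The returned `history` list records the RMS error after each pass over the data, which corresponds loosely to the error-versus-iteration curve of FIG. 8.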
Each node may be embodied as a storage register in memory for storing individual node information, including the weights, bias, and node identification data. Software packages are presently available for accepting input data in preselected formats to implement neural networks. NeuralWorks™ Professional II/PLUS from NeuralWare Inc., Pittsburgh, Pa. is a menu and window driven system for neural network applications. For example, using a variety of windows of a graphical interface such as shown in FIG. 9, the transfer function 50, the number 52 of nodes or processing elements on each of the input, hidden, and output layers, the learning rule 54 to be used, etc. may be set up.
Alternatively, each node may be a specialized processor with memory for storing the individual node information. Since neural networks may perform under a parallel processing architecture, massively parallel processing systems having such specialized processors connected in parallel are well suited for neural network applications.
The field of fuzzy logic is related to neural networks in that both can handle non-linearities in environments and allow for interpolative reasoning. Fuzzy logic deals with imprecision by expressing degrees of inclusion of an object or function in a set by a membership function ranging from 0 to 1. Linguistic rules may thus be implemented in fuzzy logic to express height as `short` or `tall`, both by representing these terms as fuzzy sets and by manipulating the fuzzy sets. A crisp or definite non-fuzzy result is obtained by the Center of Gravity (COG) method to find the centroid or center of gravity of the fuzzy sets, as described in Handbook of Intelligent Control, D. White and D. Sofge, Ed., Multiscience Press, New York, 1992, which is incorporated by reference in its entirety.
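Membership functions and COG defuzzification can be sketched as follows. The linear ramps for `short` and `tall`, the centimetre units, and the 150 cm and 180 cm breakpoints are assumptions made purely for illustration, and the centroid is approximated by a weighted average over sampled points rather than by an integral.

```python
def short(height_cm):
    """Degree of membership in `short`: 1 at or below 150 cm, 0 at or
    above 180 cm (breakpoints are illustrative assumptions)."""
    return min(1.0, max(0.0, (180.0 - height_cm) / 30.0))


def tall(height_cm):
    """Degree of membership in `tall`: 0 at or below 150 cm, 1 at or
    above 180 cm (breakpoints are illustrative assumptions)."""
    return min(1.0, max(0.0, (height_cm - 150.0) / 30.0))


def center_of_gravity(xs, memberships):
    """Discrete centroid: membership-weighted average of sample points."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("fuzzy set is empty on the sampled domain")
    return sum(x * m for x, m in zip(xs, memberships)) / total
```

A height of 165 cm belongs to both sets to degree 0.5, and `center_of_gravity` collapses any sampled fuzzy set to a single crisp value.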
Hybrid systems employing both neural networks and fuzzy logic allow fuzzy control systems implementing human understandable expressions to be adaptable during performance using the learning capabilities of a neural network.
As described in Handbook of Intelligent Control above, a hybrid system 56 as shown in FIG. 10, called the Approximate Reasoning based Intelligent Control (ARIC) architecture, integrates a fuzzy logic controller 58 having a fuzzifier 60, a rule base 62, and a defuzzifier 64 with a neural network 66 to apply unsupervised learning to the neural network 66. In ARIC, the output layer of the neural network employs reinforcement learning to predict, for example, a failure in a plant 68.
In service environments, the response time to service requests is affected by the geographic area served as well as by quasi-cyclical events such as the weather and the time of year. Other random factors such as manpower available and the number of requests also affect the response times of the service provider.
For customers to be served efficiently and for maintaining reliable service in general, it would be advantageous for the service provider to reliably predict response times for a given day of any month. As a service request is called in by a customer, the service operator of the service provider may then provide an accurate time to the customer when service personnel should respond to the request by, for example, maintenance or other services as needed.
Generally, bad weather due to changes in climate and the occurrence of peak periods should increase the number of service requests. Similarly, geographic regions with larger populations would be expected to have more requests for service compounded upon the effects of weather.
Greater manpower available at a given time would be expected to decrease response times. The requests for service also depend on the nature of the service request, e.g. emergencies in service and services of lower priority are addressed in different manners by service providers. However, since all of the above factors have an inherent randomness despite cyclical trends, the requests for service may behave in a non-linear fashion.
A need exists for a system and method for accurately predicting response times of a service provider to such non-linear factors affecting service responses. Since artificial neural networks implemented by a computer are capable of handling these non-linearities, it would be beneficial to apply an artificial neural network to the service environment for predicting service response times.
In a given month, the weather conditions and other factors may vary from year to year. A current month such as December 1993 may overall be `colder` or `warmer` than December 1992, so December 1993 may be characterized to be `similar` to January 1994 or November 1993, respectively. Predictions for response times for December 1993 based on December 1992 would thus not be as accurate as predictions based on a more `similar` month.
A need exists for a response time predictor to be adaptable to different conditions as well as to be able to predict response times using such fuzzy characteristics as `colder` and `warmer`.
The present invention is a hybrid system incorporating a fuzzy logic classifier and a neural network predictor trained by historical service records of a service provider to predict response times for days and times of the month, e.g., peak periods. The present invention is adaptable to learn from new conditions of weather, manpower and response times, and also takes the `similarness` of conditions in previous years into account.