1. Field of the Invention
This invention relates generally to artificial neural networks and, more particularly, to a digital artificial neural network that uses less memory. The memory reduction is achieved by storing a sample transfer function in a memory and mapping the sample transfer function to a network node to achieve an output value for the node.
2. Description of the Related Art
Artificial Neural Networks ("ANN") process information in a manner similar to the human brain. An ANN typically consists of input nodes, output nodes, and, in most cases, hidden nodes. A node is simply a data processing device capable of receiving multiple inputs and generating a single output based on those inputs. In a typical ANN this means the input nodes receive one input, hidden nodes receive several inputs, and output nodes receive several inputs. The hidden nodes derive their name from the fact that they do not receive any input signals from sources outside the ANN, nor do they output signals to any devices outside the ANN and, thus, they are hidden from the universe existing outside the ANN.
FIG. 1 shows a basic ANN 100 consisting of input nodes 2, hidden nodes 4, and output nodes 6. Arrows 8 connecting the different nodes represent the direction of information flow through ANN 100. FIG. 1 shows information flowing from input nodes 2 to hidden nodes 4 and from hidden nodes 4 to output nodes 6. As can be seen, all the information in ANN 100 flows from input nodes 2 to output nodes 6 and no information flows in the opposite direction. This is generally known as a feedforward network. As represented by the dashed arrow 10, some feedforward networks exist where input node 2 may be connected directly to output node 6. Other types of ANNs include feedback loops. For example, ANN 200 shown in FIG. 2 contains multiple layers of hidden nodes 4. As shown by arrow 12, hidden node 24, in the second hidden layer, feeds information back to the input of hidden node 22, in the first hidden layer, forming a feedback loop in the network.
FIG. 3A shows an individual node 32 of an ANN. Node 32 receives a plurality of inputs $34_1$ to $34_n$ and generates a single output 36. The process for generating an output signal from node 32 uses a transfer function. Equation (1) represents a typical transfer function:
$$F_{output} = f(34_1, 34_2, 34_3, \ldots, 34_n) \tag{1}$$
In other words, output 36 is a function of all inputs $34_1$ to $34_n$. Nodes in lower levels provide inputs to nodes in higher levels. Each node in a higher level, however, does not necessarily receive an input from all the nodes in lower levels. For example, node 32 may only receive an input from every even node of the preceding level. Such a node may be represented by the following equation:
$$F_{output} = f(34_2, 34_4, 34_6, \ldots, 34_n) \tag{1a}$$
While several different variations on transfer functions are possible, a common transfer function is known as a sigmoid function. FIG. 4 is a graphical representation of a sigmoid function 42 defined by equation (2).
$$F_{output} = \frac{1}{1 + e^{y}}, \quad \text{where } y = \sum_i W_i X_i \ \text{ or } \ y = \sum_i (W_i - X_i)^2 \tag{2}$$
As can be seen from FIG. 4, sigmoid function 42 is a nonlinear function where $F_{output}$ approaches a constant value, ±saturation, as the sum of the inputs y approaches ±∞. $F_{output}$ never actually reaches either ±saturation; however, at the corresponding ±threshold value the difference between $F_{output}$ and ±saturation is below an acceptable tolerance, such that it is beneficial to define $F_{output}$ as equal to ±saturation.
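By way of illustration only, the saturation behavior described above may be sketched as follows. The sketch assumes the conventional logistic form with a negative exponent, $1/(1 + e^{-y})$, so that the output rises from 0 toward 1 as y increases; the exponent sign, the saturation values 0 and 1, the threshold of 6.0, and the function names are all assumptions made for the example and are not taken from FIG. 4.

```python
import math


def sigmoid(y: float) -> float:
    """Conventional logistic sigmoid (assumed form): rises from 0 to 1 with y."""
    return 1.0 / (1.0 + math.exp(-y))


def saturating_sigmoid(y: float, threshold: float = 6.0) -> float:
    """Define the output as equal to saturation once |y| reaches the threshold.

    The threshold value is hypothetical; in practice it would be chosen so
    that the gap to saturation is below the acceptable tolerance.
    """
    if y >= threshold:
        return 1.0  # upper saturation (assumed value)
    if y <= -threshold:
        return 0.0  # lower saturation (assumed value)
    return sigmoid(y)
```

With the assumed threshold of 6.0, the unclamped function already lies within roughly 0.0025 of its saturation value, which is the kind of tolerance argument made above.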
FIG. 3B represents how node 32 divides equation (2) into two parts. Subnode 32a performs the first part and subnode 32b performs the second part. Subnode 32a receives inputs $34_1$ to $34_n$ from the previous layer's nodes (not shown). A summer in subnode 32a then multiplies each input $34_i$ by its associated weight factor $W_i$ and adds the weighted inputs $34_i$ together to generate an intermediate value y or weighted input y. Intermediate value y is then transferred to a processor in subnode 32b. Subnode 32b processes the intermediate value y according to equation (2) and generates output signal 36. Every node 32 in an ANN has its own transfer function; however, the individual transfer functions are commonly designed to be of the same general category.
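The two-part division of node 32 may be illustrated with the following minimal sketch. The function names `subnode_a`, `subnode_b`, and `node_output` are hypothetical labels for the summer, the transfer stage, and the complete node, and the transfer stage again assumes the conventional logistic form with a negative exponent.

```python
import math
from typing import Sequence


def subnode_a(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Summer: multiply each input by its weight factor and add the products."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight factor")
    return sum(w * x for w, x in zip(weights, inputs))


def subnode_b(y: float) -> float:
    """Transfer stage: map the intermediate value y to the node output."""
    return 1.0 / (1.0 + math.exp(-y))  # assumed logistic form


def node_output(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Complete node: summer followed by the transfer stage."""
    return subnode_b(subnode_a(inputs, weights))
```

For example, inputs of 1.0 and 2.0 with weight factors 0.5 and -0.25 give an intermediate value y of 0.0, which the assumed transfer stage maps to an output of 0.5.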
Weight factor $W_i$ corresponds to the importance of a particular input $34_i$ (for i=1 to n) in determining output 36 of node 32. ANN 100 learns by changing weight factor $W_i$ of inputs $34_i$. One method of changing weight factor $W_i$ is through a supervised learning algorithm. A supervised learning algorithm learns by supplying external test signals to ANN 100. These external test signals have known outputs. If the outputs are not achieved, ANN 100 corrects the weight factors $W_i$ until the known output is achieved within a certain error or tolerance.
Several methods of adjusting weight factors $W_i$ exist. One such method is generally known as gradient-descent or backpropagation. Backpropagation works by inputting known quantities into ANN 100 to achieve output 36. As illustrated in FIG. 5, output 36 is input into a comparator 52. Comparator 52 also receives a desired output 54. Desired output 54 is the value output 36 should have based upon known input quantities. Comparator 52 compares outputs 36 and 54 and generates a weight factor adjust signal 56 that is used by the summer in subnode 32a in an iterated process to adjust weight factors $W_i$ until the error between output 36 and desired output 54 is minimized. For a thorough discussion of Artificial Neural Networks see Simon Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Co., 1994.
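The iterated adjustment loop of FIG. 5 may be sketched, for a single node only, as follows. This is a simplified gradient-descent update for one sigmoid node and one known input pattern, not a full multi-layer backpropagation; the learning rate, tolerance, iteration limit, logistic transfer form, and all function names are assumptions made for the example.

```python
import math
from typing import List, Sequence


def node_output(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum followed by an assumed logistic transfer function."""
    y = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-y))


def comparator(output: float, desired: float) -> float:
    """Compare actual and desired outputs; the difference drives the adjustment."""
    return desired - output


def train_node(inputs: Sequence[float], weights: Sequence[float], desired: float,
               rate: float = 0.5, tolerance: float = 1e-4,
               max_iters: int = 10_000) -> List[float]:
    """Adjust the weight factors until the error falls within the tolerance."""
    weights = list(weights)  # work on a copy; the caller's list is untouched
    for _ in range(max_iters):
        output = node_output(inputs, weights)
        error = comparator(output, desired)
        if abs(error) < tolerance:
            break
        slope = output * (1.0 - output)  # derivative of the logistic function
        for i, x in enumerate(inputs):
            weights[i] += rate * error * slope * x  # gradient-descent step
    return weights
```

A call such as `train_node([1.0, 0.5], [0.0, 0.0], desired=0.8)` starts from an output of 0.5 and moves the weight factors until the node output is within the assumed tolerance of 0.8.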
The above-described ANN 100 can be implemented using either analog or digital techniques. An exemplary digital ANN 102 is illustrated in FIG. 6A. As shown, information processors 62a-c, 64a-c, and 66a-c represent various nodes of digital ANN 102. Processors 62a, 62b, and 62c represent input nodes. Processors 64a, 64b, and 64c represent hidden nodes. Processors 66a, 66b, and 66c represent output nodes. In this example, the transfer function, i.e., equation (1), is stored in a central transfer function memory 68 accessible by each of the processors. Alternatively, the transfer function may be resident in a memory (not shown) of each processor. FIG. 7 illustrates a graphical representation of a digital transfer function 70. Digital transfer function 70 consists of a series of sample points X of analog transfer function 42 stored in memory 68. One digital transfer function 70 is stored in memory 68 for each information processor 62a-c, 64a-c, and 66a-c of digital ANN 102.
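A digital transfer function of the kind shown in FIG. 7 may be pictured as a table of sample points that a processor consults instead of evaluating the analog function. The following sketch is illustrative only: the number of samples (256), the sampled range (-8.0 to 8.0), the nearest-sample lookup, the logistic form of the sampled function, and the function names are all assumptions and are not taken from FIG. 7.

```python
import math
from typing import List


def build_table(num_samples: int = 256, threshold: float = 8.0) -> List[float]:
    """Sample the assumed logistic function at evenly spaced points."""
    step = 2.0 * threshold / (num_samples - 1)
    return [1.0 / (1.0 + math.exp(-(-threshold + k * step)))
            for k in range(num_samples)]


def lookup(table: List[float], y: float, threshold: float = 8.0) -> float:
    """Return the stored sample nearest to y, clamping y to the sampled range."""
    y = max(-threshold, min(threshold, y))
    step = 2.0 * threshold / (len(table) - 1)
    index = round((y + threshold) / step)  # nearest stored sample point
    return table[index]
```

Under these assumptions, the lookup error is bounded by the steepest slope of the function (0.25) times half the sample spacing, or roughly 0.008 for 256 samples; a finer table reduces the error at the cost of more memory.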
FIG. 6B is a block diagram of a typical processor 82 for digital ANN 102, such as would be provided for any of processors 62a-c, 64a-c, and 66a-c. Processor 82 receives inputs $84_1$ to $84_n$, accesses transfer function memory 68, and determines output 86. For a complete description of Digital ANNs see U.S. Pat. No. 5,204,938, issued to David M. Skapura and Gary J. McIntire, "Method of Implementing a Neural Network on a Digital Computer." In current digital implementations, however, a digital transfer function is stored for each processor. The capacity of transfer function memory 68, therefore, must be large enough to store one transfer function for each processor of digital ANN 102. Alternatively, each processor must store a particular transfer function in a resident memory. This requires significant memory resources for large scale digital ANNs, which in turn increases the cost of implementing large scale digital ANNs. Thus, it would be beneficial to provide a method that reduces the storage capacity required by digital ANNs and still allows effective and efficient implementation of the digital ANN.
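The storage burden described above grows linearly with the number of processors, as the following arithmetic sketch suggests. The nine processors correspond to processors 62a-c, 64a-c, and 66a-c of FIG. 6A; the table size of 256 samples, the 2 bytes per sample, and the function name are hypothetical figures chosen only to make the arithmetic concrete.

```python
def per_processor_table_bytes(num_processors: int, samples_per_table: int,
                              bytes_per_sample: int) -> int:
    """Memory needed when one digital transfer function is stored per processor."""
    return num_processors * samples_per_table * bytes_per_sample


# Nine processors as in FIG. 6A; sample count and sample width are assumed.
small_network = per_processor_table_bytes(9, 256, 2)      # 4,608 bytes
# A hypothetical large scale network with 1,000 processors.
large_network = per_processor_table_bytes(1000, 256, 2)   # 512,000 bytes
```

Under these assumed figures, the nine-processor network needs 4,608 bytes of transfer function storage, while a 1,000-processor network needs 512,000 bytes for the same table resolution.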