Neural networks have been used for many years to solve certain types of computational problems. Especially amenable to solution by neural networks are problems in which patterns in data or information must be discovered or modeled. Neural networks and their applications are thus well known and are described by D. E. Rumelhart and J. L. McClelland in Parallel Distributed Processing, MIT Press, 1986.
Essentially, a neural network is a highly parallel array of interconnected processors that can learn and discover patterns in data presented at its input. When the form of the input data has no meaning of its own, such as data contrived simply for the academic purpose of observing the network perform, the network operates well. When the form of the input data does have meaning of its own, however, solving a problem with a neural network also requires the network to discover the meaning of that form. The problem of presenting data to the input of a neural network in a form that minimizes the time and difficulty associated with the network's discovering the meaning of the form of the input data has been referred to as "the encoding problem."
For numeric data, many encoding techniques have been used. Simple binary encoding has proven especially inefficient and difficult. With a binary-encoded input, up to 90 percent of the neural network's computational time is spent learning the rules of binary number theory, e.g., that bit 1 is adjacent to bit 0 and has a value exactly twice as great. Nonetheless, binary is used because of the optimal density with which it can represent information. For example, only 16 bits are needed to represent 65,536 different input values.
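The density of binary encoding is easy to see in a short sketch. The function below (its name is illustrative, not from the source) encodes a non-negative integer as a fixed-width list of bits, most significant bit first; 16 bits suffice for all 65,536 values from 0 through 65,535:

```python
def binary_encode(value, bits=16):
    """Encode a non-negative integer as a list of 0/1 bits, MSB first.
    With bits=16, every value in range(65536) has a distinct codeword."""
    return [(value >> i) & 1 for i in reversed(range(bits))]

# Example: the value 3 in 4 bits is 0011.
codeword = binary_encode(3, bits=4)
```

Note that adjacent bits differ in significance by a factor of two, which is exactly the relationship the text says the network must spend time learning.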
Temperature scale encoding has been used to overcome problems associated with binary encoding. With temperature scale encoding, successively higher input values are represented by causing successively higher bits to change state. For example, 0 is represented as 000000, 1 is represented as 000001, 2 is represented as 000011, 3 is represented as 000111, and so on. The meaning of the form of temperature scale-encoded input data is more easily recognizable by the neural network than that of a binary-encoded input, but temperature scale encoding has several drawbacks. One significant drawback is that 65,536 bits are needed to represent the same number of input values that can be represented by 16 binary bits. Thus temperature scale encoding has a low density of information content, and such a low-density code necessitates an inordinately large neural network to process a given problem. In addition, the temperature scale code provides no information regarding bit adjacency, so the neural network must spend a significant amount of time learning that bit 7 is next to bit 6, for example. Finally, since a temperature scale code results in more bits being logic ones for higher input values, higher input values result in a greater energy level applied to the input of the network. The neural network must nonetheless learn that all possible input values are equally important, whether high or low.
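The temperature scale (thermometer) code described above can be sketched as follows, with an illustrative function name; a value of n simply sets the n lowest bits:

```python
def thermometer_encode(value, width):
    """Temperature scale code: value n sets the n lowest bits.
    For width=6: 0 -> 000000, 1 -> 000001, 2 -> 000011, 3 -> 000111, ...
    A width-bit code can represent only width + 1 distinct values."""
    return [1 if i < value else 0 for i in reversed(range(width))]

# Example: 2 in a 6-bit thermometer code is 000011.
codeword = thermometer_encode(2, width=6)
```

The sketch makes the energy drawback visible directly: the number of ones in the codeword equals the input value itself, so higher values apply more energy to the network's input.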
Some of the problems associated with the temperature scale code are obviated by using a grey scale encoding technique. With a grey scale code, successively higher bits alone represent successively higher input values. For example, an input value of 0 is represented as 000000, 1 is represented as 000001, 2 is represented as 000010, 3 is represented as 000100, and so on. Like a temperature scale code, the meaning of a grey scale code is more easily recognizable by the neural network than is a binary code. Furthermore, the same energy is presented to the network no matter what the input value is, so the network does not have to learn that each input value is as important as any other. However, grey scale code density is just as low as that of temperature scale encoding, necessitating a very large neural network to solve the simplest of problems. In addition, like temperature scale encoding, a grey scale code provides no information regarding bit adjacency.
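The grey scale code described above, in which each successively higher nonzero value sets a single, successively higher bit, can be sketched as follows (the function name is illustrative). Every nonzero codeword carries exactly one bit that is on, which is the constant-energy property the text describes:

```python
def grey_scale_encode(value, width):
    """Grey scale code as described in the text: 0 -> all zeros;
    value n > 0 sets only bit n-1.
    For width=6: 0 -> 000000, 1 -> 000001, 2 -> 000010, 3 -> 000100, ...
    A width-bit code represents only width + 1 distinct values."""
    bits = [0] * width
    if value > 0:
        bits[width - value] = 1  # MSB-first list, so bit n-1 is at index width-n
    return bits

# Example: 3 in a 6-bit grey scale code is 000100.
codeword = grey_scale_encode(3, width=6)
```

As with the thermometer code, a width-n code covers only n + 1 values, which is the low density the text identifies.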
A balanced binary code has been used to take advantage of the density of a binary code, while providing a constant energy input regardless of the input value. For instance, a 10-bit balanced binary code sequences like binary, but there are always 5 bits turned on and 5 bits turned off. Balanced binary code has the same drawbacks as the simple binary code referred to above.
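The source does not specify how a balanced binary code is constructed, but one plausible construction is to count upward in ordinary binary and keep only the constant-weight patterns. Under that assumption, a 10-bit code with exactly 5 bits on yields C(10, 5) = 252 usable codewords, each presenting the same input energy:

```python
from math import comb

def balanced_binary_table(width=10):
    """One plausible balanced binary construction (an assumption, not the
    source's exact mapping): enumerate width-bit patterns in ascending
    binary order, keeping only those with exactly width // 2 ones, so
    every codeword has constant weight (constant input energy) while the
    kept patterns still sequence like ordinary binary."""
    return [n for n in range(1 << width) if bin(n).count("1") == width // 2]

# Input value v is encoded as the v-th constant-weight pattern.
table = balanced_binary_table(10)  # 252 codewords, each with 5 ones
```

This preserves much of binary's density (252 values in 10 bits, versus 11 values for a 10-bit thermometer code), which is the trade-off the text describes.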
Typically, any code, such as those discussed above or others, is chosen on an ad hoc basis depending upon the nature of the problem to be solved. No code has proven optimal for all problems, and the selection of a code or the development of an encoding process has proven troublesome, time-consuming, and wasteful of computing resources. In addition, the magnitude of the encoding problem is increased because the output of a neural network must be appropriately decoded as well. The decoding processes that have been used, being merely the inverses of the encoding processes discussed above, suffer from the same inefficiencies and further limit network performance.