The present invention relates, generally, to neural networks and, more particularly, to a method and circuit for implementing automatic learning using the k nearest neighbor (KNN) mode (or algorithm) in artificial neural networks.
Artificial neural networks (ANNs) are used with increasing frequency in applications where no mathematical algorithm can describe the problem to be solved. Moreover, they have proven to be highly successful in classifying and recognizing objects. ANNs give excellent results because they learn by example and are able to generalize in order to respond to an input vector that was never presented before. Thus far, most ANNs have been implemented in software and only a few in hardware. When implemented in software, no automatic learning is possible. This is one of the reasons why the tendency to date is to implement ANNs in hardware, typically in semiconductor chips. In this case, hardware ANNs are generally based on an algorithm known in the art as Region of Influence (ROI). The ROI algorithm gives good results if the input vector presented to the ANN can be separated into classes of objects well segregated from each other. When an input vector is recognized by neurons belonging to two different classes (or categories), the ANN will respond with an uncertainty. By way of example, FIG. 1 shows two prototypes A and B with their respective actual influence fields (AIF) and categories ‘a’ and ‘b’ in a two-dimensional feature space. As apparent in FIG. 1, an input vector V falling in the hatched zone cannot be classified according to the ROI approach during the recognition phase because it is recognized by two prototypes that belong to different classes. By contrast, with the K Nearest Neighbor (KNN) approach, an input vector V closer to prototype A will be assigned class ‘a’. When operating in a KNN mode, the uncertainty is limited to a line, as depicted in FIG. 1, instead of a surface, represented by the hatched zone.
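By way of illustration only, the difference between the ROI and KNN approaches in the situation of FIG. 1 may be sketched in software as follows. This is a minimal sketch and not the circuit of the invention: the prototype coordinates, the AIF values, the use of the Manhattan distance, and the rule that a neuron fires when the distance is strictly less than its AIF are all assumptions chosen for the example, as are the function names.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical prototypes A and B: (center, AIF, category).
# Coordinates and AIF values are invented for this illustration.
PROTOTYPES = [((2, 2), 4, "a"), ((6, 2), 4, "b")]

def classify_roi(vector):
    """ROI: collect the categories of every prototype whose influence field
    contains the vector (assumed firing rule: distance < AIF)."""
    fired = {cat for center, aif, cat in PROTOTYPES
             if manhattan(vector, center) < aif}
    if not fired:
        return "unrecognized"
    return fired.pop() if len(fired) == 1 else "uncertain"

def classify_knn(vector):
    """KNN with K=1: take the category of the closest prototype; an
    uncertainty remains only where two categories are equidistant."""
    distances = [(manhattan(vector, center), cat)
                 for center, _, cat in PROTOTYPES]
    dmin = min(d for d, _ in distances)
    nearest = {cat for d, cat in distances if d == dmin}
    return nearest.pop() if len(nearest) == 1 else "uncertain"
```

With these assumed values, a vector such as (3, 2) lies in the overlap of both influence fields, so the ROI sketch reports an uncertainty while the KNN sketch assigns category ‘a’; only vectors equidistant from both prototypes, such as (4, 2), remain uncertain in the KNN sketch.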
Several neuron and artificial neural network architectures implemented in semiconductor chips are described in the following related patents:
U.S. Pat. No. 5,621,863 “Neuron Circuit”, issued on Apr. 15, 1997 to Boulet et al.;
U.S. Pat. No. 5,701,397 “Circuit for Pre-charging a Free Neuron Circuit”, issued on Dec. 23, 1997 to Steimle et al.;
U.S. Pat. No. 5,710,869 “Daisy Chain Circuit for Serial Connection of Neuron Circuits”, issued on Jan. 20, 1998 to Godefroy et al.;
U.S. Pat. No. 5,717,832 “Neural Semiconductor Chip and Neural Networks Incorporated Therein”, issued on Feb. 10, 1998 to Steimle et al.; and
U.S. Pat. No. 5,740,326 “Circuit for Searching/Sorting Data in Neural Networks”, issued on Apr. 14, 1998 to Boulet et al.;
all of which are incorporated herein by reference.
The ROI learning mode can be advantageously implemented in chips known as ZISC chips (ZISC is an IBM Corporation Trademark), because they incorporate a specific circuit, i.e., “Dmin determination circuit”, also referred to as a “minimum circuit”. Normally, the minimum circuit is designed to compute the minimum distance between the input vector and the prototypes stored in the neurons. Moreover, it is also adapted to identify which neuron computes the minimum distance.
The following description will be made in the light of the aforementioned U.S. patents, wherein the same terms and names of circuits will be kept whenever possible.
Several ZISC chips can be connected in parallel in order to reach the number of neurons needed for a given application defined by the user. All the neurons of the ANN compute the distance (e.g., the Manhattan distance) between the input vector to be recognized or learned and the prototypes stored in a Read/Write memory, typically a local RAM (Random Access Memory), implemented in each neuron.
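For illustration, the distance evaluation performed by each neuron and the role of the minimum circuit may be modeled in software as shown below. This is only a behavioral sketch under stated assumptions: the function names are hypothetical, prototypes are modeled as plain lists of integer components, and the sequential loop merely stands in for what the hardware does in parallel.

```python
def manhattan_distance(input_vector, prototype):
    """Sum of absolute component differences (Manhattan distance)."""
    return sum(abs(x - p) for x, p in zip(input_vector, prototype))

def find_minimum(input_vector, prototypes):
    """Return (Dmin, indices of the neurons that hold Dmin).

    Behavioral stand-in for the minimum circuit: it reports both the
    minimum distance and which neuron(s) computed it.
    """
    distances = [manhattan_distance(input_vector, p) for p in prototypes]
    dmin = min(distances)
    winners = [i for i, d in enumerate(distances) if d == dmin]
    return dmin, winners
```

For example, with the hypothetical prototypes [0, 0, 0], [5, 5, 5] and [1, 2, 1] and the input vector [1, 1, 1], the three distances are 3, 12 and 1, so the sketch returns a minimum distance of 1 held by the third neuron (index 2).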
FIG. 2 schematically shows a few neurons as part of an ANN, referenced 10, and illustrates the essence of a conventional ZISC chip architecture. Referring more specifically to FIG. 2, neurons 11-1 to 11-N are fed in parallel by way of input bus 12 to enable communication with each other and with the external world. This is made possible through communication bus 13. The latter terminates at the chip boundary, namely, at open drain drivers to make it possible, by dotting all chips, to extend the neural network from the chip to a card. Let it be assumed that neurons 11-1 and 11-2 are the last two active (i.e., engaged) neurons of the ANN, and 11-3, the third neuron, is the first inactive (i.e., free) neuron thereof, i.e., not yet engaged by learning. As apparent from FIG. 2, the four main components of the neuron, e.g., neuron 11-1, are a local RAM 14-1, which stores the components of the prototype; a distance evaluator circuit 15-1, which computes the distance (e.g., the Manhattan distance) between the input vector and the prototype; a minimum circuit 16-1, which is required for ROI learning, as will be explained in more detail hereinafter; and, finally, a daisy chain circuit 17-1, which is serially connected to two adjacent neurons, chaining the neurons of the ANN. The daisy chain circuit is essential for determining the neuron state, i.e., whether it is free or engaged.
FIG. 3 shows the circuit of FIG. 2 limited to neurons 11-1 and 11-2, wherein only the elements that are dedicated to ROI learning (and recognition as well) are represented. Focusing now more particularly on neuron 11-1, register 18-1 (which is integral to the RAM 14-1 of FIG. 2) is dedicated to storing the category CAT. Comparator 19-1 compares the category stored in register 18-1 with the incoming category on input bus 12 in the learning phase or, during the recognition phase, with the category obtained by ORing all the categories of the neurons which have fired, which appears initially on communication bus 13 and subsequently on input bus 12. Comparator 20-1 compares the global context CXT to the neuron (or local) context CXT stored in register 21-1. Comparator 20-1 generates an output signal which enables the selection of the minimum circuit of the neurons whose local context matches the global context. More details regarding minimum circuit 16-1 and daisy chain circuit 17-1 may be found in U.S. Pat. No. 5,717,832 to Steimle et al. (e.g., box 500 in FIGS. 5 and 8 and their related description) and U.S. Pat. No. 5,710,869 to Godefroy et al. (e.g., box 150 in FIGS. 5 and 16 and their related description), respectively. Signal ST is applied to all the daisy chain circuits of the ANN. The first free neuron is the one which has DCI and DCO signals in a complementary state. This complementary state is detected by an exclusive-OR circuit.
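The detection of the first free neuron by an exclusive-OR of the DCI and DCO signals may be illustrated by the following behavioral sketch. The signal polarities are assumptions made for the example and are not taken from the referenced patents: the DCI of the first neuron is assumed tied to logic 1, an engaged neuron is assumed to drive DCO=1, a free neuron DCO=0, and engaged neurons are assumed to be contiguous from the start of the chain. The function name is hypothetical.

```python
def first_free_neuron(engaged):
    """Return the index of the first free neuron, or None if all are engaged.

    `engaged` is a list of booleans, one per neuron in chain order.
    Each neuron's DCI is the DCO of the preceding neuron.
    """
    dci = 1  # assumption: DCI of the first neuron in the chain is tied to 1
    for index, is_engaged in enumerate(engaged):
        dco = 1 if is_engaged else 0  # assumption: DCO=1 once engaged
        if dci ^ dco:  # complementary DCI/DCO marks the first free neuron
            return index
        dci = dco  # DCO feeds the DCI of the next neuron
    return None
```

Under these assumptions, in a chain where neurons 11-1 and 11-2 are engaged and the following neurons are free, only the third neuron sees DCI=1 and DCO=0, so it is the one identified; free neurons further down the chain see DCI=0 and DCO=0 and are not selected.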
As apparent from FIG. 3, there is shown a further circuit, bearing numeral 22, which corresponds to the identification circuit referenced 400 in FIGS. 5 and 25 of the aforementioned U.S. Pat. No. 5,717,832. The function of logic block 22 is to generate a signal UNC (UNC stands for UNCertain) which is activated when an input vector cannot be classified with certainty. Still considering neuron 11-1, the signal which is outputted from comparator 19-1 is applied to the first input terminal of 2-way XOR circuit 23-1 which receives the L signal (L stands for Learning) at the second input terminal. The output of the XOR circuit 23-1 is connected to a first input terminal of a 2-way AND gate 24-1 which receives the F signal (F stands for Fire) on its second input terminal. As apparent from FIG. 3, all the outputs of AND gates 24-1, 24-2, . . . of neurons 11-1, 11-2, . . . are connected to an N-way OR gate 25. The signal UNC mentioned above is generated by OR gate 25 and is available to the external logic circuit 22 on pad 26.
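The gate-level generation of signal UNC described above (comparator 19, XOR circuit 23, AND gate 24 and OR gate 25) may be modeled in software as follows. This is a behavioral sketch only, with several assumptions: categories are modeled as small integers, the comparator output is taken to be 1 when the stored category differs from the value on the bus, L is taken to be 0 during recognition, and the function names are hypothetical.

```python
from functools import reduce

def or_categories(categories, fired):
    """Bitwise OR of the categories of all neurons that fired, modeling
    the value that appears on the communication bus during recognition."""
    return reduce(lambda acc, c: acc | c,
                  (c for c, f in zip(categories, fired) if f), 0)

def unc_signal(categories, bus_category, fired, learning=0):
    """Model of UNC: per neuron, (comparator XOR L) AND F, then an N-way OR.

    Assumption: the comparator outputs 1 when the stored category differs
    from the category on the bus, and 0 when they are equal.
    """
    terms = []
    for category, f in zip(categories, fired):
        mismatch = int(category != bus_category)  # comparator 19
        terms.append((mismatch ^ learning) & f)   # XOR 23, then AND 24
    return int(any(terms))                        # OR gate 25
```

As a usage example under these assumptions, with stored categories 1, 1 and 2, if only the first two neurons fire, the bus carries 1 and UNC is 0; if the first and third neurons fire, the bus carries 3, which differs from the category stored in each fired neuron, and UNC is 1.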
Still considering FIG. 3, the circuit operation during the recognition of an input vector when in the ROI mode will now be described. Two cases must be considered:
1) If only neurons belonging to the same category have fired (for these neurons 11, the fire signal F is F=1; for all others, F=0), the ORing operation (through their respective minimum circuits 16) between the values stored in their respective category registers 18 will output the appropriate category on communication bus 13. Comparators 19, which compare the categories CAT held in registers 18 to the value appearing on bus 12 (previously present on bus 13), will generate a 0; the outputs of XOR circuits 23 and the UNC signal will both be at 0. Therefore, when UNC=0 and F=1, the input vector is recognized.
2) If neurons belonging to different categories have fired, the ORing operation of these categories results in a different value for UNC. For these neurons, the outputs of comparator 19 will be at 1 and, consequently, the corresponding XOR circuits 23, AND gates 24 and signal UNC will likewise be at 1. The condition UNC=1 and F=1 implies that the input vector was not recognized and that an uncertainty exists on the classification. To remove the uncertainty, a learning phase is required, i.e., a Write category operation must be performed by the user to engage the first free neuron. This operation, which is time consuming, prevents automatic learning when operating in an ROI mode. As a matter of fact, the learning/recognition process could be significantly improved if the KNN algorithm were implemented in an ANN. As known to those skilled in the art, most applications give better results in the KNN mode, because it is more precise. Note that it would also be worthwhile to implement the KNN algorithm in combination with the ROI algorithm, because the AIF could be advantageously used to reduce the uncertainty mentioned above. Unfortunately, until now, there has been no known method and circuit offering a hardware ANN fully automated learning when operating in a KNN mode. Consequently, no efficient classification can be performed during the recognition phase of this mode, because the user needs an analysis step during the classification process to determine whether or not learning has been performed.
It is therefore a primary object of the present invention to provide a method and a circuit that allow implementing automatic learning according to the K nearest neighbor (KNN) mode in an artificial neural network (ANN).
It is another object of the present invention to provide a method and a circuit that allow implementing automatic learning according to the K nearest neighbor (KNN) mode (or algorithm) in an artificial neural network (ANN) in combination with the ROI algorithm to minimize the uncertainty that remains after the recognition phase.
It is still another object of the present invention to provide a method and a circuit that allow implementing automatic learning according to the K nearest neighbor (KNN) mode in an artificial neural network (ANN), the circuit being derived from an identification circuit.
In a first aspect of the invention, there is provided a method for implementing the automatic learning of an input vector in an artificial neural network (ANN) based on a mapping of the input space according to the K nearest neighbor mode (KNN) comprising the steps of: providing a plurality of neurons forming the ANN, wherein at least one neuron is free (not engaged); presenting the input vector to be learned to the ANN and proceeding to a Write component operation to store the input vector components in the first available free neuron of the ANN; assigning the category defined by the user to that input vector by performing a Write category operation; testing whether this category is the same as the categories of the closest prototypes (i.e. located at the minimum distance) stored in neurons previously engaged: if it is the same, the first free neuron is not engaged; if it is not, the first free neuron is engaged so that the input vector becomes a new prototype with the defined category associated thereto.
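The steps of the method recited above may be sketched in software as follows, purely as an illustration and not as the claimed circuit. The class and method names are hypothetical, the Manhattan distance is assumed as the distance measure, and two behaviors not specified above are assumptions of the sketch: the first input vector presented to an empty network is always engaged, and the neuron is engaged whenever the closest prototypes do not all share the user-defined category.

```python
def manhattan(u, v):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

class KnnNetwork:
    """Behavioral model of automatic KNN learning (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity  # total number of neurons in the ANN
        self.prototypes = []      # engaged neurons: (components, category)

    def learn(self, vector, category):
        """Return True if the first free neuron was engaged, else False."""
        if len(self.prototypes) >= self.capacity:
            raise RuntimeError("no free neuron available")
        # Write component and Write category into the first free neuron.
        candidate = (tuple(vector), category)
        if self.prototypes:
            distances = [manhattan(vector, p) for p, _ in self.prototypes]
            dmin = min(distances)
            nearest = {c for (_, c), d in zip(self.prototypes, distances)
                       if d == dmin}
            if nearest == {category}:
                return False  # same category as closest prototypes: not engaged
        self.prototypes.append(candidate)  # engage: vector becomes a prototype
        return True
```

In this sketch, presenting (0, 0) with category ‘a’ to an empty network engages a neuron; presenting (1, 0) with category ‘a’ does not, because the closest prototype already carries that category; presenting (5, 5) with category ‘b’ engages a second neuron, because the closest prototype carries a different category.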
According to another aspect of the present invention, an automatic KNN learning mode can be advantageously implemented in hardware using, e.g., ZISC chips, wherein only a limited number of changes in the identification circuit are necessary for integrating the automatic KNN learning mode, in addition to the existing ROI learning mode.