In today's data processing, many recognition, prediction, and computation tasks are performed using reference databases that characterize input data. Depending upon the application, these reference databases contain patterns that are sub-images, sub-signals, subsets of data, and combinations thereof. The input patterns that are stored in these reference databases are referred to hereinbelow as prototypes. As known to those skilled in the art, they are generally represented by a vector, i.e. an array in a p-dimensional space. Well-known methods for characterizing new (unknown) input patterns using reference databases are based upon input space mapping algorithms such as the K-Nearest-Neighbor (KNN) or the Region Of Influence (ROI). The base principle of these algorithms is to compute the distance (Dist) between the input pattern and each of the stored prototypes in order to find the closest one(s), with or without reference to predetermined thresholds. U.S. Pat. No. 5,621,863 assigned to IBM Corp describes artificial neural networks (ANNs) based on such input space mapping algorithms that include innovative elementary processors of a new type, referred to as the ZISC® neurons (ZISC® is a registered trademark of IBM Corp). An essential characteristic of the ZISC® neurons lies in their ability to work in parallel, i.e. when an input pattern is presented to the ANN, all ZISC® neurons compute the distance between the input pattern and the prototypes stored therein at the same time. One important aspect of these algorithms is the relation used to compute the distance in the distance evaluation process, referred to as the “norm”. The choice of this norm is determined by the problem to be solved on the one hand, and on the other hand, by the knowledge used to solve this problem.
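By way of illustration only, the base principle of the input space mapping algorithms mentioned above can be sketched as follows. The function names, the use of the L1 norm, and the single global threshold are assumptions made for this sketch; they are not taken from the cited patent.

```python
def l1(a, b):
    """Distance between two patterns of equal length (L1 norm assumed here)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_nearest(pattern, prototypes, k=1):
    """Return the k stored prototypes closest to the input pattern."""
    return sorted(prototypes, key=lambda p: l1(pattern, p))[:k]


def within_threshold(pattern, prototypes, threshold):
    """Return the prototypes whose distance to the pattern does not exceed
    a predetermined threshold (hypothetical global threshold)."""
    return [p for p in prototypes if l1(pattern, p) <= threshold]


prototypes = [(0, 0), (5, 5), (1, 2)]
closest = k_nearest((1, 1), prototypes, k=1)
in_range = within_threshold((1, 1), prototypes, threshold=2)
```

In software the distances are computed one prototype after another, whereas the ZISC® neurons described above compute them at the same time.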
In a ZISC® neuron, the distance between an input pattern A and the prototype B stored therein (each having p components), also referred to as the “final” distance (Dist), is calculated using either the MANHATTAN distance (L1 norm), i.e. Dist=Σ (abs (Ai−Bi)), or the MAXIMUM distance (Lsup norm), i.e. Dist=max (abs (Ai−Bi)), wherein Ai and Bi are the components of rank i (variable i varies from 1 to p) for the input pattern A and the stored prototype B respectively. Note that “abs” is a usual abbreviation for “absolute value”. Other norms exist; for instance the L2 norm, i.e. Dist=√(Σ (Ai−Bi)²). The L2 norm is said to be “Euclidean” while the L1 and Lsup norms are examples of “non-Euclidean” norms. However, they all imply the computation of a difference (Ai−Bi) for each component in the “elementary” distance (dist) evaluation. As a matter of fact, the “absolute value of a difference” operator, i.e. “abs (Ai−Bi)”, is extensively used in the ANN field, although other operators, such as the “match/no match” operator (also written “match (Ai,Bi)”), are better suited to some specific situations. In the ZISC® neuron, the choice between the L1 and the Lsup norm is determined by the value of a single bit referred to as the “norm” bit No stored in the neuron. Other Euclidean or non-Euclidean norms are known to those skilled in the art in the ANN field.
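The three norms mentioned above can be written out as a minimal sketch. The function names are illustrative, and the mapping of the norm bit values (0 for L1, 1 for Lsup) is an assumption made for this example only.

```python
import math


def manhattan(a, b):
    """L1 norm: Dist = sum of abs(Ai - Bi)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def maximum(a, b):
    """Lsup norm: Dist = max of abs(Ai - Bi)."""
    return max(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 norm: Dist = square root of the sum of (Ai - Bi) squared."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neuron_distance(a, b, norm_bit):
    """Select the norm from a single bit.
    Assumption: 0 selects L1 and 1 selects Lsup."""
    return maximum(a, b) if norm_bit else manhattan(a, b)


A = [1, 5, 3]
B = [4, 1, 3]
d_l1 = neuron_distance(A, B, norm_bit=0)    # 3 + 4 + 0 = 7
d_lsup = neuron_distance(A, B, norm_bit=1)  # max(3, 4, 0) = 4
d_l2 = euclidean(A, B)                      # sqrt(9 + 16 + 0) = 5.0
```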
On the other hand, so far, only one operator has been used for the totality of the components of a stored prototype. For instance, in the ZISC® neuron, the “abs (Ai−Bi)” operator is applied to each component to determine an “elementary” distance. Then, the successive elementary distances are summed in the case of the L1 norm, or the maximum value thereof is selected in the case of the Lsup norm, to determine the distance (Dist), also referred to as the “final” distance. However, due to the nature of the components, in some instances it could be worthwhile to associate a different operator with each component of the input pattern/stored prototype, depending upon the application. For example, if the two components of a stored prototype characterizing a sub-image describe a color index and the number of pixels of that color index in the sub-image respectively, it would be useful to apply the “match/no match” operator to the color index related component and an “absolute value of a difference” operator to the number of pixels related component. The latter approach is described in EP co-pending patent application No 00480064.5 filed on Jul. 13, 2000 assigned to IBM Corp (attorney docket FR 9 1999 118).
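The color index example can be sketched as follows, with one predetermined operator per component. This is a simplified illustration and not the implementation of the co-pending application; the function names and the sample values are hypothetical.

```python
def abs_diff(a, b):
    """“Absolute value of a difference” operator."""
    return abs(a - b)


def match(a, b):
    """“Match/no match” operator: 0 on match, 1 on no match."""
    return 0 if a == b else 1


def distance(pattern, prototype, operators, norm="L1"):
    """Apply one operator per component, then combine the elementary
    distances by summation (L1) or by taking the maximum (Lsup)."""
    elementary = [op(a, b) for op, a, b in zip(operators, pattern, prototype)]
    return sum(elementary) if norm == "L1" else max(elementary)


# Component 1: color index, component 2: number of pixels of that color.
operators = [match, abs_diff]
pattern = (3, 120)
same_color = distance(pattern, (3, 100), operators)   # 0 + 20 = 20
other_color = distance(pattern, (5, 118), operators)  # 1 + 2 = 3
```

The operator list is fixed before any input pattern is presented; nothing in the result for one component influences how the next one is processed.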
In summary, as of yet, the operator is thus either the same for all the components of the input pattern/stored prototypes (see the aforementioned US patent) or can be different from one component to another (see the co-pending patent application), but, in both cases, it is predetermined. In other words, it is fixed once and for all before the input pattern is presented to the ANN, and thus, it is not open to variations in the distance evaluation process. However, in some instances it could be worthwhile that, depending upon the result of the operation performed for a determined component in the distance evaluation process, the processing of other components be modified or inhibited.
For example, let us consider some characteristics of a transportation system. The first component is used to code the transport type, e.g. plane or boat, and the corresponding operator is “match/no match”. The second component is used to code the number of wings or rudders and the corresponding norm is based on an “absolute value of a difference” operator. The third component is used to code the power of the transport system, still with an “absolute value of a difference” operator. Let us assume now that a plurality of prototypes describing different planes and boats have been stored in the ANN. If an input pattern representing a plane is presented to the ANN, it would be desirable to perform the elementary distance evaluation on the second component of only the prototypes storing a plane and to exclude or inhibit the evaluation for prototypes storing a boat, because, obviously in the latter case, the computation of the elementary distance between the number of wings and the number of rudders would not be significant. The “match (Ai,Bi)” operator implies a condition, e.g. if Ai=Bi (i.e. “match”), then the result is equal to zero and if Ai≠Bi (i.e. “no match”), then the result is equal to one, but, whatever the result, 1 or 0, the distance evaluation process is continued with the second component. Obviously, it would be highly desirable that a condition be set on the result itself whenever necessary. For instance, if the result is 1 (i.e. “no match”), no elementary distance calculation will be made for the second component, or if made it will not be exploited. In summary, there is a need for a condition, e.g. a threshold attached to the result of an operation for a determined component, that would impact the processing of the following components. For instance, the result of the condition could be the number of components that must be overlooked.
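The desired behavior can be sketched in software as follows. This is only an illustration of the idea of attaching a threshold and a number of components to overlook to the result of an operation; the `ComponentRule` structure, its field names, and the numeric values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

PLANE, BOAT = 0, 1  # hypothetical codes for the transport type


def abs_diff(a, b):
    return abs(a - b)


def match(a, b):
    return 0 if a == b else 1


@dataclass
class ComponentRule:
    operator: Callable[[int, int], int]
    threshold: Optional[int] = None  # condition on the result; None means no condition
    skip: int = 0                    # components to overlook when result >= threshold


def conditional_l1(pattern, prototype, rules):
    """L1-style accumulation in which the result for one component can
    inhibit the evaluation of the following components."""
    dist = 0
    i = 0
    while i < len(rules):
        rule = rules[i]
        d = rule.operator(pattern[i], prototype[i])
        dist += d
        i += 1
        if rule.threshold is not None and d >= rule.threshold:
            i += rule.skip  # overlook the next `skip` components
    return dist


rules = [
    ComponentRule(match, threshold=1, skip=1),  # transport type
    ComponentRule(abs_diff),                    # number of wings or rudders
    ComponentRule(abs_diff),                    # power
]
input_plane = (PLANE, 2, 500)
d_plane = conditional_l1(input_plane, (PLANE, 4, 450), rules)  # 0 + 2 + 50 = 52
d_boat = conditional_l1(input_plane, (BOAT, 1, 480), rules)    # 1 + 20 = 21
```

With these sample values the boat prototype comes out closer than the plane prototype, because one elementary distance is dropped when the types do not match.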
In addition, if the first component does not match, an offset, playing the role of a penalty, should be used for compensation purposes, to amplify the “no match” effect by significantly increasing the elementary distance for the first component. Otherwise, because the elementary distances of the overlooked components are not accumulated in the “no match” case, the distance obtained when the components match could be greater than the distance obtained when they do not match, which would not be acceptable.
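The effect of such an offset can be illustrated with the transportation example. The sketch below is a simplified, hypothetical illustration: the type codes, the sample values, and the offset of 1000 are chosen arbitrarily.

```python
PLANE, BOAT = 0, 1  # hypothetical codes for the transport type


def distance_with_penalty(pattern, prototype, offset):
    """Three components: transport type, number of wings or rudders, power.
    On a type mismatch the second component is overlooked and `offset`
    replaces the elementary distance of the first component."""
    type_in, count_in, power_in = pattern
    type_pr, count_pr, power_pr = prototype
    if type_in == type_pr:
        return abs(count_in - count_pr) + abs(power_in - power_pr)
    return offset + abs(power_in - power_pr)


input_plane = (PLANE, 2, 500)
plane_proto = (PLANE, 4, 450)
boat_proto = (BOAT, 1, 480)

# With an offset of 1 the mismatching boat is closer than the plane.
small = (distance_with_penalty(input_plane, plane_proto, offset=1),
         distance_with_penalty(input_plane, boat_proto, offset=1))     # (52, 21)

# With a large offset the ordering is restored.
large = (distance_with_penalty(input_plane, plane_proto, offset=1000),
         distance_with_penalty(input_plane, boat_proto, offset=1000))  # (52, 1020)
```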
Finally, still another valuable requirement concerns the variability of the elementary distances evaluated for each component: it could be worthwhile to balance some of them differently, mainly when components of different nature are used. For example, let us consider again the previous transportation system in which the first, second and third components are used to code the transport type (plane or boat), the number of wings or rudders (the nature of the second component thus depends upon the nature of the first component, which reflects the transport type) and the power. It is obvious that the difference in the number of wings or rudders is quite small, if not negligible, compared to the difference between the powers. Therefore, it can be necessary to balance some calculated elementary distances with a weight.
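A weighted accumulation of this kind can be sketched as follows. The weight values and the sample patterns are arbitrary assumptions used only to show the balancing effect.

```python
def weighted_l1(pattern, prototype, weights):
    """Sum of the elementary distances, each multiplied by its own weight."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, pattern, prototype))


# Components: number of wings, power (hypothetical values).
pattern = (2, 500)
prototype = (4, 450)

unweighted = weighted_l1(pattern, prototype, weights=(1, 1))   # 2 + 50 = 52
balanced = weighted_l1(pattern, prototype, weights=(25, 1))    # 50 + 50 = 100
```

In the unweighted case the wing count contributes 2 out of 52; with a weight of 25 it contributes as much as the power component.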
To date, these features are not achievable. Therefore, it would be highly desirable to define a “complex” operator (to be distinguished from the “simple” operators that have been used so far) that would be capable of undertaking certain tasks in the distance evaluation process depending upon the results obtained in previous elementary distance calculations. To date, there is no known technique that would allow one to make the calculation of the elementary distance for a component of the input pattern presented to an ANN conditional upon the occurrence of an event during the distance evaluation process. As a result, it is not possible to have a component designating objects of different nature (e.g. in the above example either wings or rudders), which would be of great interest in terms of component count reduction.
These features are not available in conventional ANNs to date for the following reasons. If implemented in hardware, an excessive amount of memory and logic circuits would be required in the silicon chip, making this approach complex and expensive. On the other hand, their implementation in software is not realistic, because of the time that would be required to execute the distance evaluation process. The great advantage of ANNs, i.e. their parallel structure that allows one to perform the elementary distance evaluations at the highest possible speed, would be lost.
This lack of implementation, either in hardware or in software, seriously limits the wider use of conventional neural networks based on input space mapping algorithms, and in particular of ANNs constructed with ZISC® neurons.