The present invention relates to neural networks in general and more particularly to a method of detecting and classifying anomalies in images/signals/sets of data using artificial neural networks and novel artificial neural networks resulting therefrom.
Detecting anomalies such as abnormalities or singularities in images, signals, and sets of data is an everyday occurrence in some professions. For example, as far as images are concerned, consider a laboratory researcher analyzing microscope images containing foreign organisms, or a photographer poring over photos to find a flaw. Using the human eye, this process can prove painstaking and time-consuming. The process can be further complicated when a particular anomaly represents a defect in one part of an image, but not in others. Artificial neural networks appear to be perfectly adapted to handle this problem because they offer the essential advantages of real-time processing (particularly when implemented in hardware) and adaptation. Before going further, it is important to understand the principles that are at the base of artificial neural networks.
To date, some artificial neural networks are hardware implementations of the Region Of Influence (ROI) and/or the K Nearest Neighbor (KNN) algorithms. The physical implementation of such an artificial neural network provides a high degree of parallelism and adaptation, i.e. the ability to learn any type of input data, typically an input pattern or vector. This type of artificial neural network is schematically represented in FIG. 1 and will be referred to herein below as an ANN. The ANN referenced 10 in FIG. 1 comprises a logical decision unit and three layers: an input layer, an internal (or hidden) layer and an output layer. Each node of the internal layer is a processing unit which computes the distance between the input pattern presented to the input layer and the example stored within that node. The learned data (or example) stored in each node of the internal layer is referred to as a prototype. The logical decision unit determines either the neuron which fired or the neuron which is the nearest neighbor (closest prototype), depending upon the algorithm being used. The output layer returns the appropriate categories (i.e. the categories associated with the neurons selected by the logical decision unit).
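By way of illustration only, the two decision rules described above can be sketched in software as follows. This is a minimal sketch and not the hardware implementation referred to in the text: the Manhattan distance, the per-neuron `influence` radius, and the names `Neuron`, `classify_roi` and `classify_knn` are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Neuron:
    prototype: Sequence[int]  # learned example stored in the node
    category: str             # category associated with the prototype
    influence: int            # radius of the region of influence (assumed)


def manhattan(a: Sequence[int], b: Sequence[int]) -> int:
    """Distance between an input pattern and a prototype (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify_roi(neurons: List[Neuron], pattern: Sequence[int]) -> List[str]:
    """ROI rule: return the category of every neuron that fires, i.e. whose
    prototype lies strictly within its influence radius of the pattern."""
    return [n.category for n in neurons
            if manhattan(n.prototype, pattern) < n.influence]


def classify_knn(neurons: List[Neuron], pattern: Sequence[int],
                 k: int = 1) -> List[str]:
    """KNN rule: return the categories of the k closest prototypes."""
    ranked = sorted(neurons, key=lambda n: manhattan(n.prototype, pattern))
    return [n.category for n in ranked[:k]]


# Two hypothetical neurons holding four-component prototypes.
neurons = [
    Neuron(prototype=(0, 0, 1, 1), category="A", influence=2),
    Neuron(prototype=(1, 1, 0, 0), category="B", influence=2),
]

roi_result = classify_roi(neurons, (0, 0, 1, 0))  # only neuron A fires
knn_result = classify_knn(neurons, (1, 1, 1, 0))  # neuron B is the closest
```

Under the ROI rule a pattern that falls outside every region of influence yields no category at all, whereas the KNN rule always returns the category of the closest prototype.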
A conventional method of finding an anomaly (typically a defect) in an image is to input the entire image into the ANN and to compare it with reference images containing characteristic defects. Each of these reference images is associated with a category characterizing the defect it contains. For instance, during the production of semiconductor ICs, wafer maps are used to give a visual representation of the test and yield results. FIG. 2 shows wafer maps referenced 11-1, 11-2, 11-3 and 11-4 for four different wafers, each with a pattern of defective chips. Each of the first three patterns constitutes a “defect” for the process engineer, but the last one does not, because in this case the pattern is not characteristic of a specific IC fabrication process defect. Such randomly distributed defects cannot be exploited by the process engineer to improve fabrication yields. As apparent in FIG. 2, a binary representation of each wafer map can be given, for instance, a 0 (white) for a good chip and a 1 (black) for a bad chip. However, it should be understood that this could be generalized, for instance by assigning a value between 0 and 255 using an 8-bit coding. The analysis of these patterns (shape, location, number of defective chips, etc.) is of paramount importance to the process engineer. The same defect can be correlated across a number of wafer maps and can lead to the accurate identification of its origin (process, equipment, etc.), allowing adequate corrective action.
However, upon further reflection, a difficulty arises concerning the volume of data to be memorized in the ANN. To create a representative database of the images on hand using this method, it would be necessary to input many variations of the same defect. Under this conventional method, even though some images might differ only slightly, the user is obliged to input each entire image into the database as a reference for subsequent comparisons. As a matter of fact, assume the ANN has learned wafer map 11-1 with a category of defect A and wafer map 11-3 with a category of defect B. If wafer map 11-2 is now presented for classification, it will be identified as close to the category of defect A of wafer map 11-1 when it should have been classified in category B. This mistake is due to the location of the pattern of defective chips on wafer map 11-2. The location of the defect on wafer map 11-2 is closer to that of wafer map 11-1 than to that of wafer map 11-3, even though the pattern is identical to the latter and quite different from the former. Similarly, when learning the totality of an image, it is necessary to classify the entire image as containing or not containing a defect (and possibly the kind of defect), when often in reality the defect is only a small part of the image. Taking again the example of wafer maps 11-2 and 11-3 in FIG. 2, the pattern of defective chips (marked in black) in each wafer map is a well-localized defect which characterizes a process problem.
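The misclassification described above can be reproduced on toy data. The following sketch is illustrative only: the 5x5 maps `MAP_A`, `MAP_B` and `QUERY` are hypothetical and are not the wafer maps of FIG. 2, and the Hamming distance is an assumed choice of metric for binary maps.

```python
from typing import List

# Hypothetical 5x5 binary wafer maps (1 = bad chip, 0 = good chip).
MAP_A = ["11000",   # learned with category A: a 2x2 block, top left
         "11000",
         "00000",
         "00000",
         "00000"]
MAP_B = ["00000",   # learned with category B: a 3-chip row, bottom right
         "00000",
         "00000",
         "00000",
         "00111"]
QUERY = ["11100",   # same 3-chip row as MAP_B, but located top left
         "00000",
         "00000",
         "00000",
         "00000"]


def parse(rows: List[str]) -> List[int]:
    """Flatten a map into the vector presented to the input layer."""
    return [int(c) for row in rows for c in row]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of chip positions at which two binary maps differ."""
    return sum(x != y for x, y in zip(a, b))


learned = {"A": parse(MAP_A), "B": parse(MAP_B)}
distances = {cat: hamming(parse(QUERY), proto)
             for cat, proto in learned.items()}
nearest = min(distances, key=distances.get)
```

Here the query differs from the category A map at three positions and from the category B map at six, so whole-image matching selects category A although the shape of the defect is that of category B.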
The aim of the classical method based on such ANNs is to associate with each prototype a category to identify it. The main drawbacks of this approach are the number and the size of the necessary prototypes, because to obtain good results it is necessary to learn many prototypes and to memorize all possible anomalies in every possible location. A possible solution to avoid this storage problem is to perform a local analysis, but the main drawback of that method would be ambiguity: a given pattern may or may not represent an anomaly depending upon its neighborhood. This problem occurs when a prototype does not characterize an anomaly, or when the user assigns a wrong category to a prototype.
It is therefore a primary object of the present invention to provide a method for detecting and classifying anomalies in images/signals/sets of data using artificial neural networks (ANNs) and ANNs resulting therefrom.
It is another object of the present invention to provide a method for detecting and classifying anomalies in images/signals/sets of data using artificial neural networks (ANNs) which allows a significant reduction of the number of input vector/pattern components required for each neuron and of the number of required neurons.
It is another object of the present invention to provide a method for detecting and classifying anomalies in images/signals/sets of data using artificial neural networks (ANNs) which improves the neurons' response accuracy.
It is still another object of the present invention to provide a method for detecting and classifying anomalies in images/signals/sets of data using artificial neural networks (ANNs) which improves the learning process by reducing the influence of errors or ambiguities introduced by the user.
The accomplishment of these and other related objects is achieved by the method of the present invention, the main goals of which are to reduce the number of required neurons, to limit the size of the memory in each neuron by reducing the number of required prototype components, and to improve the neurons' response accuracy in an ANN based upon a space mapping algorithm, such as the ROI or KNN algorithm. This invention finds applications in the field of anomaly detection and classification in images, signals, and sets of data. The method of the present invention is based upon “probabilities”. A prototype no longer represents a category, but rather the “probability” of belonging to one (or several) categories. This probability is determined from the neuron's response accuracy and frequency, which are stored in two counters. Thus, after the “probability” of each pattern of the image, signal or set of data has been computed, a second analysis is performed to classify said pattern using its computed “probability” and the “probabilities” of all the neighboring patterns. To perform these two steps, two ANNs are used. The first one, to which the incoming image, signal or set of data is presented, evaluates the “probabilities”, and the second one, based upon these resulting “probabilities”, classifies each part of said incoming image, signal or set of data. Basically, the method of the present invention requires two phases, as is standard with ANNs: a learning phase and a recognition phase. Because two ANNs are now used, each of these phases is divided into two parts or sub-phases, one concerning the first ANN and the other concerning the second one.
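As a purely illustrative sketch of how two counters per neuron could yield such a “probability”, consider the following. The text above does not give the exact formula; the ratio of anomalous responses to total responses used here, as well as the names `CountingNeuron`, `fired` and `hits`, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CountingNeuron:
    prototype: Tuple[int, ...]
    fired: int = 0  # frequency counter: how often the neuron responded
    hits: int = 0   # accuracy counter: responses to patterns marked anomalous

    def update(self, is_anomaly: bool) -> None:
        """Record one response of the neuron during learning."""
        self.fired += 1
        if is_anomaly:
            self.hits += 1

    def probability(self) -> float:
        """Assumed estimate: share of responses that concerned an anomaly."""
        return self.hits / self.fired if self.fired else 0.0


# Hypothetical learning history: the neuron responded four times,
# three of them on patterns the user marked as anomalous.
neuron = CountingNeuron(prototype=(1, 1, 0))
for is_anomaly in (True, True, False, True):
    neuron.update(is_anomaly)

estimate = neuron.probability()
```

With such an estimate, a single wrong category assignment by the user lowers the value slightly rather than fixing a wrong category on the prototype.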
According to one of its broadest aspects, the method incorporates an analysis phase which comprises the steps of:
generating patterns of data from a first sub-set of said set of data;
presenting said patterns of data to the ANN;
selecting the most representative patterns of data for memorization in the ANN as prototypes;
calculating for each prototype the probabilities that these selected patterns of data characterize said categories;
generating patterns of data from a second sub-set of said set of data; and,
performing the analysis of said patterns of data of the second sub-set using said representative patterns and said probabilities.
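The steps listed above may be sketched, under stated assumptions, for a one-dimensional signal. The sliding-window pattern generation, the rule that a pattern is memorized when no stored prototype lies within `radius`, the hit/response ratio used as the probability, and every function name below are hypothetical choices made for illustration, not the claimed method itself.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[int, ...]


def windows(signal: Sequence[int], size: int) -> List[Pattern]:
    """Generate patterns of data as overlapping windows of a sub-set."""
    return [tuple(signal[i:i + size]) for i in range(len(signal) - size + 1)]


def distance(a: Pattern, b: Pattern) -> int:
    """Manhattan distance between two patterns (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def learn(patterns: List[Pattern], labels: List[bool],
          radius: int) -> List[list]:
    """Memorize a pattern as a new prototype when no stored prototype lies
    within `radius`; otherwise update the counters of the matching one.
    Each entry is [prototype, responses, anomalous responses]."""
    prototypes: List[list] = []
    for pattern, is_anomaly in zip(patterns, labels):
        match = next((e for e in prototypes
                      if distance(e[0], pattern) <= radius), None)
        if match is None:
            match = [pattern, 0, 0]
            prototypes.append(match)
        match[1] += 1
        match[2] += int(is_anomaly)
    return prototypes


def analyze(patterns: List[Pattern], prototypes: List[list]) -> List[float]:
    """Give each pattern the probability of its closest prototype."""
    result = []
    for pattern in patterns:
        _, responses, hits = min(prototypes,
                                 key=lambda e: distance(e[0], pattern))
        result.append(hits / responses)
    return result


# First sub-set: a lone spike (not an anomaly) and a two-sample run (anomaly).
first = [0, 0, 9, 9, 0, 0, 9, 0]
labels = [False, True, True, True, False, False, False]  # one per window
prototypes = learn(windows(first, 2), labels, radius=0)

# Second sub-set, analyzed with the stored prototypes and probabilities.
second = [0, 9, 9, 0]
probabilities = analyze(windows(second, 2), prototypes)
```

In this toy run the rising and falling edges receive a probability of 0.5, since the same local pattern was seen both inside and outside an anomaly, while the window lying fully inside the run receives 1.0. Such intermediate values are what a subsequent analysis of neighboring patterns would have to resolve.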
The novel features believed to be characteristic of this invention are set forth in the appended claims. The invention itself, however, as well as these and other related objects and advantages thereof, will be best understood by reference to the following detailed description to be read in conjunction with the accompanying drawings.