The present invention relates to a method for automatic classification of data. In particular, the present invention relates to a method for automatically classifying defects occurring on the surface of a semiconductor electronic circuit board, a printed circuit board, a liquid crystal display board or the like on the basis of a detected image, an EDX detection spectrum, or the like.
Recently, methods for automatic classification based on a detected image of a defect portion have been developed in order to quickly grasp the state of defects occurring on the surface of a semiconductor electronic circuit board or the like and to monitor the number of occurrences of each type of defect.
For the automatic classification of images, various methods have conventionally been studied in the field of pattern recognition.
One conventional methodology is called learning type classification. According to this methodology, teach images are collected in advance and learned in order to optimize a classification apparatus (a neural network, etc.). A learning type classification apparatus can potentially classify flexibly in accordance with the user's request, but has the disadvantage that it is practically unusable at the startup of a production process, because a large volume of teach data generally has to be collected to obtain good performance. It is known that when only a small volume of teach data is used, the learning conforms excessively to the teach data, a phenomenon called overlearning, which degrades performance.
Another conventional methodology is called rule-based classification. According to this methodology, a characteristic amount is extracted from the image to be classified, and the value of the characteristic amount is judged according to “if-then” rules incorporated into the system in order to classify a defect into one of the classes. A rule-based classification apparatus cannot respond flexibly to the user's request because the classification rules are fixed, but has the advantage that it can be used from the startup of the production process because no teach data is required.
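The “if-then” judgment of characteristic amounts can be pictured with a minimal sketch such as the following. The characteristic amounts (height, on_pattern, edge_sharpness), the thresholds and the rule order are hypothetical and chosen only for illustration; they are not taken from any actual classification apparatus.

```python
from dataclasses import dataclass


@dataclass
class Features:
    # Hypothetical characteristic amounts extracted from a defect image.
    height: float          # > 0 protrudes from the surface, < 0 is sunken
    on_pattern: bool       # True if the defect overlaps the circuit pattern
    edge_sharpness: float  # 0..1, assumed high when the defect lies on the film


def classify_by_rules(f: Features) -> str:
    """Apply fixed if-then rules; all thresholds are illustrative assumptions."""
    if f.height < -0.1:
        return "recess"
    if f.on_pattern and abs(f.height) <= 0.1:
        return "pattern defect"
    if f.edge_sharpness >= 0.5:
        return "on-the-film foreign material"
    return "below-the-film foreign material"
```

Because the rules are fixed in the code, such a classifier needs no teach data, but it also cannot adapt to a class definition that differs from the one the designer assumed.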
The above-described rule-based classification apparatus and learning type classification apparatus may be used together in a single method. An example of such a method is disclosed in Japanese Patent Laid-Open Publication No. 2001-135692. Specifically, a defect is first classified into one of a fixed number of previously incorporated classes (called the “core classification”) by a rule-based classification apparatus called the “core classifier”, and is then further classified into an arbitrary number of “low-order classifications” by a learning type classification apparatus called the “particular applicable classifier”, which is associated with the core classification.
Because the example disclosed in the above-described patent publication uses the core classifier, it can conduct the core classification from the startup of the process without the need to collect a large amount of teach data. If more detailed classification is required, it can be performed by the learning type “particular applicable classifier”.
The above-described prior art requires that a classification model combining the rule-based classification apparatus and the learning type classification apparatus be decided in advance. However, it is generally very hard to determine an optimum classification model in advance, and an inadequate classification model may degrade performance. This is described by way of examples below.
FIG. 2 to FIG. 4 show three types of classification models for classifying defects into four classes: an on-the-film foreign material, a below-the-film foreign material, a recess and a pattern defect. It is described below that the optimum classification model varies depending on the distribution of the defects.
FIG. 2 shows an example of a single layer classification model. A rule-based classification apparatus 21 corresponds to Section 1 and classifies defects into four classes: an on-the-film foreign material 22, a below-the-film foreign material 23, a recess 24 and a pattern 25.
The rule-based classification apparatus is superior to the learning type classification apparatus in that it can deliver stable performance as long as the designed rules adequately match the target to be classified.
In the field of defect classification, the causes of defects have become diverse as production process technology has evolved, and the classes used for classifying defects have varied accordingly. It is therefore hard to classify the defects of products produced through different production processes by using universal defect classification classes, and it must be said that rules assumed in advance by a designer for a prescribed production process are very unlikely to be applicable to products produced by a different production process. In this respect, defect classification differs considerably from problems such as handwritten numeral recognition, in which the classification classes are those determined at the time of designing.
FIG. 3 shows a double-layered classification model. A rule-based classification apparatus 31 of the first layer classifies defects into three classes: a foreign material 32, a recess 33 and a pattern 34. A learning type classification apparatus 35 of the second layer further classifies the foreign material 32 into two classes: an on-the-film foreign material 36 and a below-the-film foreign material 37.
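The double-layered arrangement of FIG. 3 can be sketched as follows. This is only an illustration under stated assumptions: the characteristic amounts, the thresholds, the toy teach data and the use of a one-dimensional nearest-centroid classifier as the learning type stage are all hypothetical stand-ins, not the apparatus 31 or 35 themselves.

```python
from statistics import mean


def first_layer(height: float, on_pattern: bool) -> str:
    """Rule-based stage: coarse three-class decision (illustrative thresholds)."""
    if height < -0.1:
        return "recess"
    if on_pattern:
        return "pattern"
    return "foreign material"


class NearestCentroid:
    """Stand-in learning type stage: learns one centroid per class from teach data."""

    def fit(self, samples, labels):
        grouped = {}
        for x, y in zip(samples, labels):
            grouped.setdefault(y, []).append(x)
        self.centroids = {y: mean(xs) for y, xs in grouped.items()}
        return self

    def predict(self, x: float) -> str:
        # Return the class whose centroid is closest to the sample.
        return min(self.centroids, key=lambda y: abs(self.centroids[y] - x))


def classify(height, on_pattern, edge_sharpness, second_layer) -> str:
    """Only defects judged as foreign material are passed to the second layer."""
    label = first_layer(height, on_pattern)
    if label == "foreign material":
        return second_layer.predict(edge_sharpness)
    return label


# Toy teach data (made-up values of a hypothetical edge_sharpness feature).
second_layer = NearestCentroid().fit(
    [0.8, 0.9, 0.7, 0.1, 0.2, 0.3],
    ["on-the-film foreign material"] * 3 + ["below-the-film foreign material"] * 3,
)
```

With this split, the rules only have to hold for the coarse three-class decision, while the finer distinction is learned from whatever teach data the user can collect.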
According to the single layer classification model shown in FIG. 2, a detected defect is classified by rules directly into one of the four classes desired by the user. According to the model shown in FIG. 3, by contrast, the rule-based classification only has to distinguish three coarser classes, so the designed rules are more likely to be applicable. Meanwhile, when the classification into the on-the-film foreign material or the below-the-film foreign material is conducted by the learning type classification apparatus of the second layer, it is likely to be more reliable than the rule-based classification assumed by the designer, provided that a sufficiently large amount of teach data on the on-the-film foreign material and the below-the-film foreign material is available.
FIG. 4 shows a double-layered classification model of a type different from that shown in FIG. 3. The first layer classifies into three classes of a foreign material, a recess and a pattern, and the second layer further classifies the foreign material into two classes of an on-the-film foreign material and a below-the-film foreign material. A classification apparatus corresponds to Section 1 and Section 2 of the classification model. Here, it is assumed that Section 1 is a rule-based classification apparatus and Section 2 is a learning type classification apparatus.
In the example shown in FIG. 4, Section 2 differs from that of the classification model shown in FIG. 3. The learning type classification apparatus of Section 2 classifies a defect that the classification apparatus of the first layer has classified as a foreign material into an on-the-film foreign material, a below-the-film foreign material or a pattern defect. In a situation where the rule-based classification apparatus can separate the recess and the pattern defect with high reliability but cannot separate the foreign material from the pattern defect, this model may achieve higher classification performance than the model shown in FIG. 3.
Besides, a major difference between the classification model shown in FIG. 4 and the classification models shown in FIGS. 2 and 3 is that it does not follow the hierarchical relationship of the user's classification concept (the semantic classification model). In terms of the user's classification concept, the foreign material and the pattern defect are mutually exclusive classes, and the pattern defect cannot be located below the foreign material. However, the classification model achieving the maximum classification performance and the user's conceptual classification model can be independent of each other, except that the bottom layer must consist of the classes finally required by the user. This also suggests that it is hard for the user to determine an optimum classification model.
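The relationship between the models of FIG. 3 and FIG. 4 can be made concrete by writing each one as a nested structure, as in the sketch below. The dictionary representation and the helper final_classes are hypothetical and serve only as an illustration; the class name "pattern defect" is used for the pattern class throughout, and the FIG. 4 tree follows the description of Section 2 as also outputting a pattern defect.

```python
# A node maps a class name to its sub-classes; None marks a final class.
FIG3_MODEL = {
    "foreign material": {
        "on-the-film foreign material": None,
        "below-the-film foreign material": None,
    },
    "recess": None,
    "pattern defect": None,
}

FIG4_MODEL = {
    "foreign material": {
        "on-the-film foreign material": None,
        "below-the-film foreign material": None,
        # Not a semantic child of "foreign material", but allowed in the model.
        "pattern defect": None,
    },
    "recess": None,
    "pattern defect": None,
}


def final_classes(model: dict) -> set:
    """Collect the classes in which a defect can finally end up."""
    classes = set()
    for name, children in model.items():
        if children is None:
            classes.add(name)
        else:
            classes |= final_classes(children)
    return classes
```

Both structures yield the same four final classes, so they answer the same user request even though only the FIG. 3 tree mirrors the user's conceptual hierarchy.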
It can be said from the above that the optimum classification model for an automatic defect classification problem varies depending on the problem setting. Moreover, this problem setting (how adequate the designer's rules are, whether teach data can be collected, etc.) cannot be assumed in advance, so an optimum classification model cannot be determined in advance either. As a result, the classification performance drops because the classification model is not optimum.
To achieve the maximum performance, the classification tree automatically provides an optimum classification model specific to each user's defect classification request, which varies from user to user, thereby improving the classification performance. It also eliminates the need to set the classification model manually. The optimum classification model is hard for the user to determine in advance because it does not always match the user's conceptual classification model (a mismatch generally called the semantic gap).