The present invention relates to a method and an apparatus for classifying defects. In particular, the present invention relates to a method and an apparatus for classifying defects occurring on the surface of a semiconductor electronic circuit substrate, a printed circuit substrate, a liquid crystal display substrate, or the like, based on a detected image, an EDX detection spectrum, and so forth.
In recent years, a method for automatically classifying an image of a defective portion has been developed for the purpose of quickly grasping the status of defects occurring on the surface of a semiconductor electronic circuit substrate or the like or for the purpose of monitoring the number of occurrences of defects for each defect type.
Various methods for automatically classifying images have long been studied in the field of pattern recognition.
One of the classical methods is called learning classification. This method takes the steps of collecting teaching images in advance and optimizing a classifier (a neural network or the like) by learning from the teaching images. The learning classifier can flexibly classify images in response to a user's request. On the other hand, it is impractical at the start-up of a production process, because a massive amount of teaching data ordinarily has to be collected before good performance is obtained. If only a small amount of teaching data is used, the learning adapts excessively to the teaching data, a phenomenon called excess learning (overfitting). This phenomenon lowers the performance of the learning classifier.
Another classical method is called rule-based classification. This method takes the steps of extracting a feature amount from a target image to be classified, evaluating the value of the feature amount against the “if-then” rules built into the system, and classifying the defect into the appropriate class based on that evaluation. The rule-based classifier cannot respond flexibly to the user's request because the rules for classification are fixed. On the other hand, it can be used at the start-up of a production process because no teaching data is necessary.
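The steps of the rule-based classification described above can be illustrated by the following minimal sketch. The feature names (is_projecting, area), the class names, and the rule itself are assumptions chosen only for illustration; they are not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class DefectFeatures:
    """Feature amounts extracted from a defect image (hypothetical set)."""
    is_projecting: bool  # True if the defect has a projecting geometry
    area: float          # defect area in arbitrary units


def classify_rule_based(features: DefectFeatures) -> str:
    """Apply a fixed "if-then" rule; no teaching data is needed."""
    # Illustrative rule only: projecting defects are treated as particles,
    # everything else as pattern defects.
    if features.is_projecting:
        return "particle"
    return "pattern"


print(classify_rule_based(DefectFeatures(is_projecting=True, area=3.0)))
```

Because the rule is written into the function, the classifier works from the first image onward, but changing the classification criterion requires changing the code itself.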
Further, JP-A-2001-135692 discloses a hybrid invariant adaptive automatic defect classification method that automatically classifies defects by combining the foregoing rule-based classifier with the learning classifier. That is, in the technology disclosed in this publication, the rule-based classifier, called a “core classifier”, classifies a defect into one of a fixed number of classes built in advance (called “core classification”), and then the learning classifier, called the “specific applicable classifier”, associated with that core classification classifies the defect into a “lower class”, of which there may be an arbitrary number.
In the technology described in the foregoing publication, the use of the core classifier is said to make it possible to perform the core classification at the start-up of the process without having to collect teaching data. Further, if more detailed classification is required, the learning-type “specific applicable classifier” may be used for the classification.
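The two-stage arrangement described above can be sketched as follows. This is only an illustration under stated assumptions: the core rule, the feature names, and the nearest-example learner standing in for the learning classifier are all hypothetical and do not reproduce the method of the cited publication.

```python
def core_classify(features):
    """Stage 1: fixed rule-based "core classification" (illustrative rule)."""
    return "particle" if features["is_projecting"] else "pattern"


class NearestExampleClassifier:
    """Stage 2 stand-in for a learning classifier: it labels a defect with
    the lower class of the taught example whose area is closest."""

    def __init__(self):
        self.examples = []  # list of (area, lower_class) teaching data

    def teach(self, area, lower_class):
        self.examples.append((area, lower_class))

    def classify(self, area):
        if not self.examples:
            return None  # nothing taught yet
        return min(self.examples, key=lambda e: abs(e[0] - area))[1]


def hybrid_classify(features, sub_classifiers):
    """Run the core rule, then the learner associated with that core class."""
    core = core_classify(features)
    learner = sub_classifiers.get(core)
    lower = learner.classify(features["area"]) if learner else None
    return core, lower


# Usage: only the "particle" core class has a taught lower-level classifier.
subs = {"particle": NearestExampleClassifier()}
subs["particle"].teach(2.0, "small particle")
subs["particle"].teach(50.0, "large particle")
print(hybrid_classify({"is_projecting": True, "area": 3.0}, subs))
```

In this sketch the core class is available without any teaching data, while the lower class is returned only once examples have been taught for the corresponding core class.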
However, the foregoing rule-based classifier, and any methodology that incorporates a rule-based classifier as a part of it, such as the invention disclosed in the foregoing publication, is restricted by the facts that the rules are fixed and that the classes are also fixed. These restrictions are described below.
(1) Restriction by the Fixed Rule
In the rule-based classifier, the classifying rules built in advance are made to correspond to the classifying classes. This may be an obstacle to realizing high classifying performance. In practice, even for the classes of the “core classification” such as “particles” and “pattern defects”, the rule-based classifier has difficulty in realizing high classifying performance for an arbitrary user or an arbitrary process. This results from the fact that the relationship between a certain class and the qualitative properties observed in an image of a defect belonging to that class differs slightly (sometimes largely) from user to user; that is, the relationship is not invariant.
For example, the restriction will be described with reference to the two classes of “particle defect” and “pattern defect”. For a user A, the “particle defect” is observed as having “a projecting geometry and an arbitrary area” and the “pattern defect” is observed as having “a tabular geometry and an arbitrary area”, while for a user B, the “particle defect” is observed as having “a projecting geometry or a tabular geometry and a small area” and the “pattern defect” is observed as having “a tabular geometry and a large area”. In this case, for the user A, whether or not the defect has a projecting geometry is the effective feature for identifying the defect, while for the user B, the effective feature is not whether the defect has a projecting geometry but whether its area is small. Clearly, no common classifying rule based on the projecting state or on the area can distinguish the “particle defect” from the “pattern defect” for both users.
That is, even for a highly general-purpose class that many users require, there is, in general, no invariant classifying rule that corresponds to the class. Whether a class is highly general-purpose and whether it can be classified by a rule-based classifier are entirely separate questions.
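The conflict between the two users can be made concrete with the following sketch, which encodes each user's definitions as a separate rule. The area threshold SMALL_AREA_LIMIT, the function names, and the handling of combinations that the definitions leave open are assumptions made only for illustration.

```python
SMALL_AREA_LIMIT = 10.0  # hypothetical boundary between "small" and "large"


def label_for_user_a(is_projecting: bool, area: float) -> str:
    """User A: geometry decides the class; the area is arbitrary."""
    return "particle defect" if is_projecting else "pattern defect"


def label_for_user_b(is_projecting: bool, area: float) -> str:
    """User B: a small area means particle; tabular and large means pattern."""
    if area < SMALL_AREA_LIMIT:
        return "particle defect"   # projecting or tabular, small area
    if not is_projecting:
        return "pattern defect"    # tabular, large area
    return "unclassified"          # projecting and large: not defined for B


# The same defect (tabular geometry, small area) under each user's rule.
print(label_for_user_a(is_projecting=False, area=2.0))
print(label_for_user_b(is_projecting=False, area=2.0))
```

The same tabular, small-area defect is labelled differently by the two rules, so a single fixed rule built into the classifier can be correct for at most one of the two users.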
(2) Restriction by the Fixed Class
In the rule-based classifier, the classes into which defects are classified are built in. Hence, it may not supply the user with classes that meet the user's request.
For example, the technology disclosed in the foregoing publication presupposes that the rule-based classifier divides defects into “particle” and “pattern”. However, in some users' processes substantially no “pattern” defect occurs. In that case, the “particle” class alone is sufficient. If an unnecessary class is built in as a rule, the user may suffer lowered performance caused by erroneous classification into a class that was never needed.
In the technology described in the publication, the rule-based classifier further sub-divides the “particles” into “particle and transformed pattern”, “particle on the surface”, and “buried particle”. In some cases, this kind of sub-division may be unnecessary to the user. Alternatively, a case may arise where the sub-division should rather be abolished because sufficiently high performance cannot be achieved for the target process.
Further, the criterion of sub-division may need to be changed. That is, instead of sub-dividing the “particles” into the “particle and transformed pattern”, the “particle on the surface”, and the “buried particle”, it may be desirable to sub-divide the particles according to the criteria of “large” and “small”. For the same reason, in some cases, it may be preferable to partially combine the classes or to sub-divide them further.
As described above, since the classifying classes are built in advance, the rule-based classifier may supply the user with an unnecessary class, sub-divide the defects excessively, or impose a specific classifying criterion on the user. As a result, in some cases, the classification that meets the user's request cannot be realized. Moreover, this may entail a degradation in performance.
In summary, because the classifying rule for each class is built in advance, the conventional rule-based classifier, or a methodology that includes the rule-based classifier as a part, may be unable to perform the classification or may suffer degraded classifying performance when the nature of the data belonging to each class differs from user to user. Further, since the classifying classes provided by the rule-based classifier are built in advance, it may not be possible to provide classifying classes that exactly match the user's request, neither too many nor too few. The erroneous classification caused thereby may in turn lower the performance.
The classification methodology in the technology described in the foregoing publication, that is, the concept of classifying defects into a predetermined number of “core classifications” through the use of the rule-based classifier, rests on two presuppositions: that there always exists a “core classification”, namely a “common class” that would meet the request of any user, and that classification into such a “common class” can be executed by a common invariant classifying rule.
However, as described above, in practice there exists no class that could serve as a “common class” for the classification requested by any user. Further, even if there existed a class that might serve as a “common class” for the classification requested by most users, it is, in general, difficult to perform the classification based on a common invariant classifying rule.