The present invention relates to a defect classification method applied to industrial products that use thin film technology and, more specifically, to a method for automatically setting classification criteria in automatic defect classification for semiconductor products, for which detailed defect review and classification are considered important after defects are detected through post-step inspection.
As semiconductors become finer in structure, the semiconductor post-processing manufacturing process has become more complicated than ever. With conventional process control, which is based on changes in the number of semiconductor defects detected by outer appearance inspection of semiconductor wafers, it is becoming difficult to achieve a high yield in semiconductor manufacturing. In view of this, Automatic Defect Classification (ADC) has been proposed for automatically classifying defects through analysis of images obtained after inspection by an outer appearance inspection device. Alternatively, after outer appearance inspection, the defect parts are detected again and their detailed images are subjected to automatic classification.
ADC comes in several types. Those proposed so far are: the rule type, which classifies defects into defect classes by applying predetermined rules to defect features, that is, a plurality of image features extracted from images, such as image brightness and defect shape; the teaching type, which treats the defect feature items, each a scalar value, as a multi-dimensional vector and automatically creates criteria for defect classification based on the defect class distribution in the multi-dimensional vector space; and a combination of the rule type and the teaching type. Before defects can be automatically classified by ADC, defect classification criteria must be set based on defect samples whose classification classes are known. The rule type generally requires a determination threshold value to be set for each of the various defect feature items, and the teaching type requires the defect class distribution in the multi-dimensional vector space to be derived.
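The distinction between the rule type and the teaching type can be illustrated with a minimal sketch. The feature items (brightness, area, circularity), the class names, the threshold values, and the nearest-centroid criterion used for the teaching type are all assumptions made for illustration; they are not specified in the description above.

```python
# Illustrative only: feature items, class names, and thresholds are hypothetical.

def classify_by_rule(features, brightness_threshold=0.5, circularity_threshold=0.8):
    """Rule type: compare individual feature items against preset thresholds."""
    brightness, _area, circularity = features
    if circularity >= circularity_threshold:
        return "particle"
    if brightness < brightness_threshold:
        return "scratch"
    return "pattern_defect"


def teach(samples):
    """Teaching type: derive one centroid per class from labeled samples."""
    sums, counts = {}, {}
    for label, vector in samples:
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_by_teaching(features, centroids):
    """Assign the class whose centroid is nearest (an assumed, simple criterion)."""
    return min(centroids, key=lambda label: squared_distance(features, centroids[label]))


# Defect samples whose classification classes are known.
labeled_samples = [
    ("particle", (0.9, 10.0, 0.90)),
    ("particle", (0.8, 12.0, 0.95)),
    ("scratch", (0.2, 40.0, 0.10)),
    ("scratch", (0.3, 50.0, 0.20)),
]
centroids = teach(labeled_samples)
```

In this sketch the rule type needs one threshold per feature item, whereas the teaching type needs only labeled samples from which the class distribution is derived. The feature items here are left unscaled for brevity; a practical implementation would likely normalize them so that no single item dominates the distance.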
With the conventional technology, the defect classification criteria are set based on the defect feature distribution prior to automatic classification. The problem is therefore that the defect feature distribution may look different once automatic classification is started, and if so, automatic classification cannot be performed appropriately. For example, if the defect samples prepared at the time of classification criteria setting lack some important defect classes, no classification criteria will be set for the defect classes having no samples, and automatic classification cannot be performed normally unless some measure is taken. Prior to automatic classification, it is unknown whether a target semiconductor layer has any defects different in type from the defect samples already collected. It is therefore difficult to determine whether the classification criteria set at this point in time are good enough.
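The consequence of a defect class having no samples can be shown with a small sketch. The class names, the two-dimensional feature values, and the nearest-centroid criterion are hypothetical and serve only to illustrate the failure mode described above.

```python
# Illustrative only: class names and feature values are hypothetical.

def nearest_class(features, centroids):
    """Return the taught class whose centroid is closest to the feature vector."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(features, centroids[label]))


# Criteria taught from samples of only two classes; no "short_circuit" sample existed.
taught_centroids = {
    "particle": (0.85, 0.90),
    "scratch": (0.25, 0.15),
}

# A defect of the untaught class still receives one of the taught labels.
unseen_defect = (0.55, 0.60)
assigned = nearest_class(unseen_defect, taught_centroids)
```

Because the criteria contain no entry for the missing class, the defect can only be assigned to one of the taught classes, and nothing in the classification result itself indicates that the assignment is wrong.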
To solve such problems and achieve the best classification performance, it is considered necessary to make up for the shortage of samples of any defect classes that were not used for classification criteria setting, and to update the classification criteria thereafter. To make up for the shortage, a large number of defects that have been processed by automatic classification are referred to. The problem, however, is that constantly monitoring defect classification requires a great deal of effort once automatic classification is started. This defeats the purpose of automatic classification, and such an operation cannot be practically carried out in actual manufacturing lines.
As a conventional technology for solving such a problem, an exemplary method is proposed in JP-A-2001-256480. Specifically, teaching data for defect classification is created first by calculating feature amounts from defects detected on a semiconductor wafer and plotting the results in the feature amount space, and then by assigning categories based on the defect distribution in the feature amount space. At the time of defect classification, the same process is applied to the defects detected for classification on the semiconductor wafer, and the result is compared with the teaching data. If the comparison reveals a difference, for example one caused by variation in the semiconductor manufacturing process, the teaching data is corrected based on the observed difference.
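The compare-and-correct flow described above can be sketched as follows. This is not the procedure of JP-A-2001-256480 itself, whose details are not given here; the use of per-category centroids, the Euclidean shift measure, the tolerance, and the blending weight are all assumptions chosen for illustration.

```python
# Illustrative only: the shift measure, tolerance, and blending weight are assumptions.

def centroid(vectors):
    """Mean feature vector of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def correct_teaching_data(teaching, observed, tolerance=0.1, weight=0.5):
    """Compare each category's taught centroid with the centroid of newly
    classified defects, and move the taught centroid toward the observed
    one when the shift exceeds the tolerance."""
    corrected = {}
    for category, taught in teaching.items():
        vectors = observed.get(category)
        if not vectors:
            corrected[category] = list(taught)  # nothing observed: keep as is
            continue
        current = centroid(vectors)
        if distance(taught, current) > tolerance:
            corrected[category] = [
                (1 - weight) * t + weight * c for t, c in zip(taught, current)
            ]
        else:
            corrected[category] = list(taught)
    return corrected


teaching = {"particle": [0.8, 0.9], "scratch": [0.2, 0.1]}
observed = {
    "particle": [[0.6, 0.9], [0.6, 0.9]],   # drifted by 0.2 along the first axis
    "scratch": [[0.21, 0.1], [0.19, 0.1]],  # essentially unchanged
}
updated = correct_teaching_data(teaching, observed)
```

In this example only the "particle" category exceeds the tolerance, so only its taught centroid is moved; the "scratch" category is left as taught.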
The issue here is that recent process control requires detailed defect classification, and to make this a reality, the number of dimensions in the feature amount space is increasing. It is therefore difficult to automatically detect a change in the feature amount distribution, because the defect distribution is initially derived from a small number of samples. Further, even if monitoring of defects after automatic classification is started shows that the classification criteria were set improperly, it has been considerably difficult in practice to reset the classification criteria appropriately.
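The effect of a growing number of dimensions on a distribution estimated from few samples can be illustrated numerically. The sketch below assumes, purely for illustration, that every feature item follows a unit-variance Gaussian distribution centered on the origin; under that assumption the expected squared error of a centroid estimated from n samples in d dimensions is d/n.

```python
import random

# Illustrative only: unit-variance Gaussian features are an assumption.

def mean_squared_centroid_error(dimensions, samples, trials=500, seed=0):
    """Average squared distance between the sample centroid and the true
    mean (the origin) when each feature is drawn from N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _trial in range(trials):
        sums = [0.0] * dimensions
        for _sample in range(samples):
            for i in range(dimensions):
                sums[i] += rng.gauss(0.0, 1.0)
        total += sum((s / samples) ** 2 for s in sums)
    return total / trials


# Same number of samples, different numbers of feature dimensions.
low_dim_error = mean_squared_centroid_error(dimensions=2, samples=5)
high_dim_error = mean_squared_centroid_error(dimensions=50, samples=5)
```

With the number of samples fixed, the estimation error of the initial distribution grows in proportion to the number of dimensions, so a genuine shift in the feature amount distribution becomes harder to distinguish from sampling noise.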
The reason is that it is generally impossible to guarantee that resetting the classification criteria will produce a classification result more accurate than the one obtained by processing defects in volume under the original classification criteria. This is because the true classification classes of the defects that have been automatically classified in volume are unknown. Moreover, resetting the classification criteria causes the classification result to differ before and after the resetting. As a result, the number of defects in each classification class changes before and after the resetting, which is inconvenient for process control.
What is more, general users have no idea what kind of resetting of the classification criteria leads to better classification performance as a whole. To reset the classification criteria, the defect feature amounts are generally calculated for every defect class in a defect cluster prepared in advance, and the resulting distribution is used as the basis for criteria setting. However, if every defect is subjected to feature amount extraction, the performance may be lowered, especially if some defects are added later. To obtain the best performance, it is important to select, before changing the classification criteria, only the defects needed to obtain the best classification result. However, it is difficult for general users to know which defects to select to obtain the best result. In the end, the classification criteria are set by using every defect to derive the distribution of the defect feature amounts. As such, resetting the classification criteria hardly leads to better performance.