In manufacturing semiconductor devices, a defect review system is used to classify defects within a semiconductor process and can help in narrowing down the root cause of a defect or an excursion of the process. The defect review system does this by acquiring high-resolution images around defect areas at a sub-micron level. Based on the acquired images, the system or an operator can classify the defects into categories in accordance with the type of the defects and how the defects may affect the production yield. When the system performs the classification, the process is automated. The current state of the art in automatic defect classification still requires operator intervention, since typical automated techniques leave a significant portion of defects unclassified.
Feature vectors that represent the defect review images are important to the accuracy of defect classification. Yet discriminating features are hard to discover and are often maintained as secrets in many commercial defect review and classification systems. Features may be organized in a hierarchical manner. For example, a common lower-level feature is an edge detector, while a set of edge patterns in a neighboring area forms middle-level cues such as parallel lines, corners, line junctions, etc. It is well known that most image processing techniques focus on extracting low-level features, and that designing features for high-level object representation is very difficult. In addition, features that can be used to classify one set of defect images may not work at all for other data sets. Thus, a new approach for discovering features that can represent mid-to-high level objects is needed.
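The feature hierarchy described above can be illustrated with a minimal sketch. It is not taken from any commercial system or cited reference: the function names `edge_maps` and `corner_cues`, the central-difference edge detector, the 3x3 neighborhood, and the threshold value are all assumptions chosen for illustration. The low-level step marks edge pixels; the mid-level step flags a corner cue wherever vertical and horizontal edge responses occur in the same neighborhood.

```python
def edge_maps(img, thresh=1):
    """Low-level features: mark vertical and horizontal edge pixels.

    img is a 2D list of intensities. Central differences and the
    threshold are illustrative assumptions, not a prescribed detector.
    """
    h, w = len(img), len(img[0])
    vert = [[False] * w for _ in range(h)]   # intensity changes along x
    horiz = [[False] * w for _ in range(h)]  # intensity changes along y
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            vert[y][x] = abs(gx) >= thresh
            horiz[y][x] = abs(gy) >= thresh
    return vert, horiz


def corner_cues(vert, horiz):
    """Mid-level cue: a 3x3 neighborhood holding both edge orientations."""
    h, w = len(vert), len(vert[0])
    corners = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            has_vert = any(vert[r][c] for r, c in window)
            has_horiz = any(horiz[r][c] for r, c in window)
            if has_vert and has_horiz:
                corners.append((y, x))
    return corners


# A bright square occupying the lower-right quadrant of a 6x6 image.
square = [[1 if (y >= 3 and x >= 3) else 0 for x in range(6)] for y in range(6)]
# A plain vertical step edge: one edge orientation only, so no corner cue.
step = [[1 if x >= 3 else 0 for x in range(6)] for y in range(6)]

square_corners = corner_cues(*edge_maps(square))
step_corners = corner_cues(*edge_maps(step))
```

Even in this toy form, the mid-level cue depends on hand-picked choices (neighborhood size, threshold), which hints at why designing higher-level features by hand is difficult.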
In current defect classification practice, an operator samples a few defect images from each category and spends significant time searching for features to separate unclassified defect images into corresponding categories. The process may be repeated for every layer of each new device in the semiconductor manufacturing process, which increases the time to ramp up a fab. Further, the classification results vary from one operator to another because each operator can choose different discriminating features based on the operator's experience and understanding of the device. Such inconsistent classification causes unnecessary confusion or even contradiction in the process control of wafer manufacturing. It would be advantageous for operators if a system or method could automatically search for useful features.
Many approaches have been implemented to automatically classify defect images. Most of the existing approaches involve two steps. First, features that characterize defect images are extracted. Second, classifiers are built based on the numerical values of the features to assign a class code to each defect. The extracted features should have distinguishing power to separate one type of defect from another. For example, U.S. Pat. App. Pub. No. 2013/0279795 disclosed a method that uses a kernel function to transform the region of a defect area into a real-valued feature that can characterize the shape of the region. The classification approach based on the extracted features is usually a simple binary branched decision tree (such as the decision tree described in U.S. Pat. No. 8,660,340).
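The two-step structure described above might be sketched as follows. This is a hypothetical illustration only and does not reproduce the methods of the cited patents: the shape features (`area`, `aspect_ratio`, `fill`), the thresholds, and the class codes are all assumed for the example.

```python
def extract_features(mask):
    """Step 1: reduce a binary defect mask (2D list of 0/1) to numeric features.

    The chosen features are illustrative assumptions.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return {"area": 0, "aspect_ratio": 0.0, "fill": 0.0}
    height = rows[-1] - rows[0] + 1   # bounding-box height
    width = cols[-1] - cols[0] + 1    # bounding-box width
    area = sum(sum(row) for row in mask)
    return {
        "area": area,
        "aspect_ratio": max(height, width) / min(height, width),
        "fill": area / (height * width),
    }


def classify(features):
    """Step 2: a binary branched decision tree over the feature values.

    Thresholds and class codes are hypothetical.
    """
    if features["area"] == 0:
        return "no_defect"
    if features["aspect_ratio"] >= 3.0:
        return "scratch"
    if features["fill"] >= 0.75:
        return "particle"
    return "unclassified"


line_mask = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
blob_mask = [[0, 0, 0, 0, 0],
             [0, 1, 1, 1, 0],
             [0, 1, 1, 1, 0],
             [0, 1, 1, 1, 0],
             [0, 0, 0, 0, 0]]
l_shape_mask = [[1, 0, 0],
                [1, 0, 0],
                [1, 1, 1]]

labels = [classify(extract_features(m)) for m in (line_mask, blob_mask, l_shape_mask)]
```

In this sketch, any defect that falls through every branch ends up as `unclassified`, which is the portion that would be left for an operator.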
One well-known issue with the above-mentioned approaches is the contribution of the classifier. Typical current classifiers can classify 60%-70% of output defects from a device. However, the throughput of defect review systems in production environments makes it impossible for operators to manually classify the remaining images. For example, a known defect review system can output as many as ~18,000-20,000 defects per hour. With a 60%-70% automated classification rate, it still leaves ~6,000-8,000 defects per hour that need to be manually classified by an operator.
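The arithmetic behind these figures can be checked with a short calculation. The helper name `manual_backlog` is hypothetical; the throughput and classification rates are the ones quoted above.

```python
def manual_backlog(defects_per_hour, auto_rate_percent):
    """Defects per hour left over for manual classification."""
    return defects_per_hour * (100 - auto_rate_percent) // 100


# Combine each quoted throughput with each quoted automation rate.
backlog = {
    (throughput, rate): manual_backlog(throughput, rate)
    for throughput in (18_000, 20_000)
    for rate in (60, 70)
}

for (throughput, rate), remaining in sorted(backlog.items()):
    print(f"{throughput} defects/h at {rate}% automated: {remaining} left for operators")
```

At the upper throughput of 20,000 defects per hour, the remainder is 6,000 to 8,000 defects per hour, consistent with the approximate range stated above; at 18,000 defects per hour it is 5,400 to 7,200.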
Systems have been developed that can improve on the contribution of the classifier by using complex machine learning approaches such as a Support Vector Machine (as described in U.S. Pat. No. 8,315,453). However, these systems require a training phase in production and an expert-defined feature set, which can impact the production ramp and requires a highly trained operator to identify the feature set.