1. Field of the Invention
This invention generally relates to decision tree construction for automatic classification of defects on semiconductor wafers.
2. Description of the Related Art
The following description and examples are not admitted to be prior art by virtue of their inclusion in this section.
Fabricating semiconductor devices such as logic and memory devices typically includes processing a substrate such as a semiconductor wafer using a large number of semiconductor fabrication processes to form various features and multiple levels of the semiconductor devices. For example, lithography is a semiconductor fabrication process that involves transferring a pattern from a reticle to a resist arranged on a semiconductor wafer. Additional examples of semiconductor fabrication processes include, but are not limited to, chemical-mechanical polishing (CMP), etch, deposition, and ion implantation. Multiple semiconductor devices may be fabricated in an arrangement on a single semiconductor wafer and then separated into individual semiconductor devices.
Inspection processes are used at various steps during a semiconductor manufacturing process to detect defects on wafers to promote higher yield in the manufacturing process and thus higher profits. Inspection has always been an important part of fabricating semiconductor devices such as integrated circuits (ICs). However, as the dimensions of semiconductor devices decrease, inspection becomes even more important to the successful manufacture of acceptable semiconductor devices because smaller defects can cause the devices to fail.
Automatic defect classification (ADC) of semiconductor defects is an important application of scanning electron microscope (SEM) review tools. One of the methods most commonly used in the industry for performing this task is the decision tree. For example, U.S. Pat. No. 8,502,146 to Chen et al., which is incorporated by reference as if fully set forth herein, describes a very effective ADC system using surface height attributes. An example is illustrated in FIG. 1, in which a simple decision tree classifier using two effective attributes can distinguish four types of defects. In particular, in rule based tree 100, first node 102 separates the defects based on topographical (“topo”) height. For example, as shown in histogram 104 for the topographical heights of defects detected on a wafer, cut line 106 for the first node separates void and scratch defect types from particle and cone defect types. The void and scratch defect types can be sent to node 108. As shown in histogram 110 for the sizes of the defects, at node 108 cut line 112 separates void defects from other defects, cut line 114 separates scratch defects from other defects, and defects falling between cut lines 112 and 114 can be separated into another bin for undecided defect types. In this manner, the void defect types can be put into bin 116, the scratch defect types can be put into bin 118, and the undecided defect types can be put into bin 120. Node 122 can be used to separate the particle and cone defect types in a similar manner using some characteristic of the particle and cone defect types. As such, the decision tree shown in FIG. 1 splits classification into a series of easy and logical steps that can be performed for decision tree-based ADC.
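The walk through the tree of FIG. 1 may be easier to follow as a short sketch. The Python below is illustrative only and is not taken from the referenced patent. The function name, the numeric cut-line values, and the direction of each comparison (for example, that voids and scratches lie below cut line 106 and that voids are smaller than scratches) are assumptions made for illustration. Node 122 is left unsplit because the attribute it uses is not specified above.

```python
# Hypothetical cut-line values; the text gives no numbers or units.
TOPO_HEIGHT_CUT = 0.0      # stands in for cut line 106 (assumed value)
SIZE_CUT_VOID = 50.0       # stands in for cut line 112 (assumed value)
SIZE_CUT_SCRATCH = 200.0   # stands in for cut line 114 (assumed value)


def classify_defect(topo_height, size):
    """Classify one defect with a two-attribute rule-based tree."""
    # Node 102: split on topographical height.
    # Assumption: void/scratch fall below the cut line, particle/cone above.
    if topo_height < TOPO_HEIGHT_CUT:
        # Node 108: two cut lines on defect size.
        # Assumption: voids are the small defects, scratches the large ones.
        if size < SIZE_CUT_VOID:
            return "void"        # bin 116
        if size > SIZE_CUT_SCRATCH:
            return "scratch"     # bin 118
        return "undecided"       # bin 120: between the two cut lines
    # Node 122: the separating attribute is not specified, so no split here.
    return "particle_or_cone"
```

Under these assumed values, a defect with a negative height and a size between the two size cut lines lands in the undecided bin rather than being forced into a type.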
Although the concept of decision trees is very simple to understand, manual construction of decision tree classifiers for practical applications is not simple at all. There are three major drawbacks in the traditional decision tree model. First, the complexity of the decision tree grows rapidly with the number of defect types to be classified. For example, an effective decision tree for more than ten defect types typically requires more than ten levels and hundreds of nodes, thereby becoming extremely difficult to build and manage manually. Therefore, intuitively simple decision trees become extremely complicated as the number of bins increases. Second, it is impossible to tune the performance (e.g., either accuracy or purity) of a decision tree for one defect type without affecting the performance of the decision tree for other defect types. Similarly, decision trees are difficult to maintain since tweaking one defect type can affect other defect types. Third, since the population is split by each node, the lower nodes have progressively smaller populations for deciding appropriate cut lines between types. There are, therefore, a number of drawbacks to ADC setup today.
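The third drawback can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that every node splits its population evenly into two children; real trees are rarely balanced, and the function name and the starting population of 1000 defects are hypothetical.

```python
def population_by_level(n_defects, n_levels):
    """Approximate defects per node at each level of a balanced binary tree.

    Assumption: each node splits its population evenly in two, so a node
    at level d sees about n_defects / 2**d defects (rounded down).
    """
    return [n_defects // (2 ** level) for level in range(n_levels + 1)]


# 1000 defects at the root, followed down ten levels.
print(population_by_level(1000, 10))
# [1000, 500, 250, 125, 62, 31, 15, 7, 3, 1, 0]
```

Under this assumption, nodes a few levels below the root already have only a handful of defects from which to place a cut line.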
One obvious solution to the difficulty of manual classifier construction is to construct the classifiers automatically using an algorithm. Such automatic construction is actually a major research area in artificial intelligence (AI) and data mining, and there is a long history of published results in this area. One of the most prominent examples is classification and regression trees (CART), which is commercially available in software products from Salford Systems, San Diego, Calif. In fact, the IMPACT software that is commercially available from KLA-Tencor, Milpitas, Calif. already has a feature called “starter-tree” that can automatically generate decision tree classifiers. However, in some instances, classifiers generated by automatic methods may over-fit the data and therefore in general may not be stable. Furthermore, the resulting classifier is still a single decision tree that classifies all types, thereby still suffering from the second and third problems mentioned above. In addition, because decision tree based ADC causes every defect to be classified, the user has to go to extra lengths to leave room for unknown defect types (which is rarely done).
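As a rough illustration of what automatic construction involves, the sketch below picks a single cut line on one attribute by minimizing the weighted Gini impurity of the two resulting groups, which is the textbook split criterion associated with CART. It is not the implementation used in the Salford Systems products or in the IMPACT “starter-tree” feature; the function names and the sample data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_cut_line(values, labels):
    """Return (cut, score) minimizing weighted Gini impurity for one attribute.

    Candidate cuts are midpoints between consecutive distinct sorted values.
    Returns (None, inf) if all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no cut line fits between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_cut, best_score = (low + high) / 2.0, score
    return best_cut, best_score


# Hypothetical defect sizes and types.
sizes = [1.0, 2.0, 8.0, 9.0]
types = ["void", "void", "scratch", "scratch"]
print(best_cut_line(sizes, types))  # (5.0, 0.0)
```

A full tree builder would apply such a search recursively to each child node. Repeating it until every leaf is pure lets the tree fit noise in the training defects, which is one way the over-fitting mentioned above can arise.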
Accordingly, it would be advantageous to develop methods and/or systems for defect classification-related applications that do not have one or more of the disadvantages described above.