Current demands for high density and performance associated with ultra large scale integration require submicron features, increased transistor and circuit speeds and improved reliability. Such demands require formation of device features with high precision and uniformity, which in turn necessitates careful process monitoring, including frequent and detailed inspections of the devices while they are still in the form of semiconductor wafers.
Conventional in-process monitoring techniques employ an “inspection and review” procedure wherein the surface of the wafer is initially scanned by a high-speed, relatively low-resolution inspection tool; for example, an opto-electric converter such as a CCD (charge-coupled device) or a laser. Statistical methods are then employed to produce a defect map showing suspected locations on the wafer having a high probability of a defect. If the number and/or density of the potential defects reaches a predetermined level, an alarm is sounded, indicating that a more detailed look at the potential defect sites is warranted. This technique is known as “total density monitoring” of defects and produces a statistic called the “total defect density”.
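By way of illustration only, the alarm condition of total density monitoring can be sketched as follows. The names `DefectSite`, `total_defect_density` and `review_warranted`, the units, and the threshold parameters are assumptions made for this sketch and are not taken from any particular inspection tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DefectSite:
    """One suspected defect location on the defect map (hypothetical record)."""
    x_mm: float
    y_mm: float


def total_defect_density(sites: Sequence[DefectSite], inspected_area_cm2: float) -> float:
    """Suspected defects per unit of inspected area (units are an assumption)."""
    if inspected_area_cm2 <= 0:
        raise ValueError("inspected area must be positive")
    return len(sites) / inspected_area_cm2


def review_warranted(sites: Sequence[DefectSite], inspected_area_cm2: float,
                     max_count: int, max_density: float) -> bool:
    """Raise the alarm when the number and/or density reaches its predetermined level."""
    density = total_defect_density(sites, inspected_area_cm2)
    return len(sites) >= max_count or density >= max_density
```

In this sketch the alarm is raised when either limit is reached, reflecting the "and/or" of the description above; a fabricator could equally require both.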
When the defect density reaches a predetermined level, a review of the affected wafers is warranted. After a redetection procedure is carried out, using the defect map, to positively determine the presence of defects, a more detailed review procedure is carried out on the individual defect sites, such as scanning with a CCD to produce a relatively high-resolution image. The defect image is then analyzed to determine the nature of the defect (e.g., a defective pattern, a particle, or a scratch).
It has recently been recognized that monitoring “classified defect density”, i.e., the number of defects of each of several different types, or “classes”, is preferable to monitoring total defect density. Accordingly, various methods for classification of defects have been introduced. Most of these conventional methods, called “classic classifiers” herein, employ pattern recognition techniques wherein a set of sample defects is acquired, imaged and analyzed for particular characteristics or “predicates” (e.g., brightness, roughness, size, color). These predicates are fed into a “black box” (e.g., a neural net) and used to train the classifier to recognize different types of defects by the defects' predicates.
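As a minimal sketch of the statistic itself, and not of any classifier, classified defect density can be computed from a list of already-assigned class labels. The function name `classified_defect_density` and the normalization by inspected area are assumptions made for this illustration; the class labels in the usage note are likewise examples only.

```python
from collections import Counter
from typing import Dict, Iterable


def classified_defect_density(class_labels: Iterable[str],
                              inspected_area_cm2: float) -> Dict[str, float]:
    """Return the defect density per class rather than a single total.

    Normalizing the per-class count by area is an assumption of this sketch.
    """
    if inspected_area_cm2 <= 0:
        raise ValueError("inspected area must be positive")
    counts = Counter(class_labels)
    return {label: n / inspected_area_cm2 for label, n in counts.items()}
```

For example, the labels `["particle", "particle", "scratch", "pattern"]` over 2.0 cm² give a density of 1.0 for particles and 0.5 for each of the other two classes; the per-class values sum to the total defect density.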
Disadvantageously, the efficiency of these methods is reduced because there is no agreed-upon set of defect classes. Different semiconductor fabricators consider different defects to be important and, therefore, use different sets of defect classes tailored to their specific needs. Thus, classic classifiers require many examples of defect images to be obtained for each defect class before they become operational. Consequently, typical prior art systems cannot be used during start-up and ramp-up of a production line. Furthermore, because such classifiers, also referred to as “full classifiers” herein, need to discriminate between all defect types required to be classified (e.g., 10 or more classes of defects), a large number of predicates must be considered when classifying any defect, thus increasing inspection time and reducing production throughput.
To address the above-mentioned problems associated with full classifiers, an invariant core classifier (“core classifier” herein) has recently been introduced in the defect review system marketed as the SEMVision™, available from Applied Materials of Santa Clara, Calif. Such a core classifier is described in copending U.S. patent application Ser. No. 09/111,454, filed Jul. 8, 1998, entitled “Automatic Defect Classification With Invariant Core Classes”, the entire disclosure of which is hereby incorporated by reference.
According to the methodology of the copending application, after a defect map of a semiconductor wafer has been generated, each defect site and a corresponding known non-defective reference site are imaged by a scanning electron microscope (SEM) to gather and store location and topographical data. The images are then analyzed, as by performing boundary analyses and/or topographical measurements, to classify the defect as being in one of a number (e.g., seven) of invariant core classes of defect. The defect may be further classified as being in one of an arbitrary number of core subclasses, as desired by the user, by adding pre-programmed modules onto the core classifier.
FIG. 1 is a conceptual flow chart of automatic defect classification into core classes performed by the methodology of the copending application. A defect 1 is classified broadly as a pattern defect 2A or a particle defect 2B, and further placed into one of seven exemplary invariant core classes of defects: craters and microscratches on the wafer surface 3A, a missing pattern on the surface 3B, an extra pattern on the surface 3C, a deformed pattern on the surface 3D, a particle on the surface 3E, a particle embedded in the surface 3F, or a particle and a deformed pattern on the surface 3G. Arbitrary core subclasses may include bridging (i.e., short circuiting) between neighboring wiring patterns, a small particle, a large particle, a broken line, a narrow pattern, etc.
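The seven exemplary core classes listed above can be represented, for illustration, as a fixed enumeration keyed by the reference numerals of FIG. 1, with the user-defined core subclass carried as an optional free-form label. The type names `CoreClass` and `Classification` are assumptions made for this sketch; the assignment of each core class to the broad pattern or particle branch is not stated here and is therefore not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CoreClass(Enum):
    """Exemplary invariant core classes, valued by their FIG. 1 numerals."""
    CRATER_OR_MICROSCRATCH = "3A"
    MISSING_PATTERN = "3B"
    EXTRA_PATTERN = "3C"
    DEFORMED_PATTERN = "3D"
    PARTICLE_ON_SURFACE = "3E"
    PARTICLE_EMBEDDED = "3F"
    PARTICLE_AND_DEFORMED_PATTERN = "3G"


@dataclass(frozen=True)
class Classification:
    """A core class plus an optional, arbitrary core subclass label."""
    core: CoreClass
    subclass: Optional[str] = None  # e.g. "bridging", "large particle"


# Look a class up by reference numeral and attach a subclass.
result = Classification(CoreClass("3E"), subclass="large particle")
```

Keeping the core classes in a closed enumeration while leaving the subclass as an open label mirrors the distinction drawn above between invariant core classes and arbitrary core subclasses.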
The invariant core classification technique of the copending application enables defects to be separately and reliably classified as particle or pattern defects, and as on-surface or below-surface (embedded) defects. It also provides early quantification and notification of these meaningfully classified defects, thereby facilitating investigation of the causes of the defects, and enabling early corrective action to be implemented.
The core classifier of the copending application is a “rule-based” classifier in that it classifies defects by collecting defect information (i.e., imaging the wafer surface and performing boundary analysis and/or topographical measurement of its features) then following a set of rules programmed a priori (i.e., beforehand). Thus, it does not need to be trained, as do classic classifiers, and so does not require examples of defect images for each class prior to being operational. Therefore, unlike prior art defect classification systems, the core classifier of the copending application can be used during the start-up and ramp-up of a production line.
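The general shape of a rule-based classifier, as opposed to a trained one, can be sketched as an ordered list of rules fixed in advance and applied to measurements taken from the defect image. The sketch below is illustrative only: the measurement names `protrusion_nm` and `pattern_mismatch`, the thresholds, and the two rules themselves are hypothetical and are not the rules of the copending application.

```python
from typing import Callable, Mapping, Optional, Sequence, Tuple

Measurements = Mapping[str, float]
Rule = Tuple[str, Callable[[Measurements], bool]]

# Hypothetical rules and thresholds, programmed a priori; no training images needed.
RULES: Sequence[Rule] = (
    ("particle on surface",
     lambda m: m["protrusion_nm"] > 50.0 and m["pattern_mismatch"] < 0.05),
    ("deformed pattern",
     lambda m: m["protrusion_nm"] <= 50.0 and m["pattern_mismatch"] >= 0.05),
)


def classify(measurements: Measurements,
             rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the label of the first rule that matches, or None if no rule applies."""
    for label, predicate in rules:
        if predicate(measurements):
            return label
    return None
```

Because the rules exist before any wafer is inspected, such a classifier is usable from the first wafer of a new line; the cost is that a defect matching none of the rules is returned unclassified.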
While core classifiers as described in the copending application address many of the shortcomings of conventional classic classifiers, core classifiers may not be suitable for separating defects into every class deemed important by a user since, as rule-based classifiers, they cannot be easily adapted to recognize new classes of defects. Specifically, because different process lines may be sensitive to different defects, the user may require refinements within the invariant core classes other than the core subclasses discussed above that are available as pre-programmed modules to be added to the core classifier. Furthermore, the user may require refinements that cannot be discerned by the core classifier. For example, if the core classifier classifies a defect as a particle on the surface (core class 3E in FIG. 1), and the user wishes to know the shape of the particle in combination with its size, another technique must be used to obtain this shape and size information, which is helpful in pinpointing the source of the particle, since different processes tend to produce different particle shapes and sizes. Additionally, “exotic” defects that do not fall into any of the core classes cannot be classified by a core classifier. For example, if a process is introduced that results in a new type of defect, the existing core classes will be irrelevant in relation to the new defect, and the core classifier will not be able to classify the new defect unless the new defect is added as a core class.
There exists a need to quickly and meaningfully review semiconductor wafers and automatically classify defects using a core classifier, then further classify the defects into subclasses within a core class desired by the user in order to identify processes causing defects, thereby enabling early corrective action to be taken. This need is becoming more critical as the density of surface features, die sizes, and number of layers in devices increase, requiring the number of defects to be drastically reduced to attain an acceptable manufacturing yield.