The complex interrelationships between different processing steps, and the long sequence of steps that must be performed to produce a functional device, make it difficult to diagnose the causes of misprocessing during semiconductor manufacturing. Often improper processing at one step may cause a step later in the flow to perform inadequately. For instance, non-uniform application of the photoresist in one step may cause a future etching step to etch the wrong portions of the wafer. In addition, the large number of processing steps makes it difficult to isolate the effect of one step on the final product.
The diagnosis task is further complicated by limited observability, that is, the fact that one can measure only a few variables of interest during processing. Most approaches to equipment diagnosis fall into two categories. The first category involves putting additional sensors on individual pieces of the processing equipment to detect specific faults. Providing sensors dedicated to particular equipment and fault types can be quite expensive, and such sensors enable the detection of only a small number of faults. The approach of introducing known faults and obtaining a signature for these faults in specialized test structures can also be classified as belonging to this category of techniques. Here too, only the faults introduced during experimentation can be detected, and one cannot be sure that a new fault is not confounded with the previously introduced faults. The second category of approaches involves constructing models of the operation of the equipment, or the process, and inferring the state of the equipment and/or process by measuring selected output parameters. The use of response surface models and process simulators for diagnosis falls in this category. Model-based approaches do not have a high hardware cost but are limited to diagnosing faults in only those parameters that are comprehended in the models. The accuracy of the models also impacts the effectiveness of these techniques.
Another prior art method, the Wafer Sleuth system, involved the use of wafer tracking information for isolating the causes of misprocessing. The Wafer Sleuth system employed optical character recognition to read an identifying number, and the location in a carrier, for each wafer as it went through the different processing steps. Later, this information was used to suggest causes of misprocessing. One limitation of the Wafer Sleuth system is that the task of making queries to the wafer tracking database is manual, as is the task of determining whether a query contains information helpful for fault isolation. Since the number of queries that need to be examined for fault isolation can be quite large, this manual process can be quite tedious and time consuming. For instance, suppose that a flow of 100 steps is required to make a device. Furthermore, suppose that the device is considered functional if five parameters are within specified limits. Even if one restricts attention to checking whether all the wafers coming from one machine exhibit values of any parameter different from the wafers coming from other machines, one has to construct and evaluate 500 queries, one for each combination of step and parameter. This can be very tedious, time consuming, and error-prone for a person. Of course, a person may not look at all possible queries, but may only choose to look at those that are most likely to contain fault-isolation information. However, today's VLSI circuits require 200-300 steps, and roughly 100 parameters are checked, so that the same restricted search comprises on the order of 20,000 to 30,000 queries. Consequently, even a selective search of the space of possible queries can be tedious and time consuming for a person. Moreover, misprocessing is often due to effects that are not obvious, and that therefore were not considered likely.
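The size of the query space in the example above can be illustrated with a short sketch. It assumes one query per combination of processing step and measured parameter, as in the example; the function names and the dictionary layout used for the tracking and measurement data are hypothetical and are not taken from the Wafer Sleuth system.

```python
from itertools import product
from statistics import mean


def enumerate_queries(steps, parameters):
    """Yield one (step, parameter) query per processing step and parameter."""
    return product(steps, parameters)


def count_queries(num_steps, num_parameters):
    """Count the queries needed when every step is checked against every parameter."""
    return sum(1 for _ in enumerate_queries(range(num_steps), range(num_parameters)))


def machine_means(tracking, measurements, step, parameter):
    """Evaluate one query: average `parameter` over the wafers of each machine at `step`.

    tracking:     wafer id -> {step name: machine used}   (hypothetical layout)
    measurements: wafer id -> {parameter name: value}     (hypothetical layout)
    """
    groups = {}
    for wafer, route in tracking.items():
        groups.setdefault(route[step], []).append(measurements[wafer][parameter])
    return {machine: mean(values) for machine, values in groups.items()}


print(count_queries(100, 5))    # 500
print(count_queries(200, 100))  # 20000
print(count_queries(300, 100))  # 30000
```

Each query still has to be inspected to decide whether the per-machine values differ meaningfully, which is the part of the task that the prior art system left to a person.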