The present invention relates to a method of fault detection in manufacturing equipment, especially, but not limited to, semiconductor manufacturing equipment using plasma chambers.
The manufacture of semiconductor integrated circuits is a detailed process requiring many complex steps. A typical semiconductor manufacturing plant (or fab) can require several hundred highly complex tools to fabricate intricate devices such as microprocessors or memory chips on a silicon substrate or wafer. A single wafer often requires over 200 individual steps to complete the manufacturing process. These steps include lithographic patterning of the silicon wafer to define each device, etching lines to create structures and filling gaps with metal or dielectric to create the electrical device of interest. From start to finish the process can take weeks to complete.
Faults can and do occur on these manufacturing tools. A fault on a single wafer can compromise all devices on that wafer, so that all subsequent steps on that wafer may be worthless and the wafer may have to be scrapped. Thus, timely and effective fault detection is a necessity. An example semiconductor manufacturing tool is depicted in FIG. 1 and shows a plasma processing chamber 1, a substrate to be processed 2, process inputs or set-points 3, tool-state and process-state sensor outputs 4 and a data collection interface 5.
The manufacturing tools are complex and many different faults can occur, some specific to the tool process being run at the time, that impact tool productivity and yield (in the case of a plasma chamber, the process being run at any given time is known in the art as the “recipe”). As an example of the type of faults that can occur, consider a thermal chemical vapour deposition (CVD) tool, used to deposit layers of semiconductor or dielectric materials in the device manufacture. The quality of the process is determined by the output, measured by some metrics such as film uniformity, stress and so on. The quality of the output in turn depends on the process inputs, for example gas flow rates, reactor pressure and temperature in the case of the thermal CVD tool. If there is a deviation in any of the process parameters, then the quality of the output may be negatively impacted.
Another type of fault concerns excursions in the process itself. There are many examples, including a compromise in chamber vacuum, a change in reactor wall conditions or chamber hardware, an electrical arc or even a problem with the incoming wafer. Again the quality of the output will be affected with possible impact on tool yield.
A common feature in all of these faults is that sensors on the tool will generally indicate a change in system state, although this does depend on the sensitivity of the tool sensors. Plasma processing chambers are typically equipped with tool-state sensors, for example gas flow meters and pressure gauges, and process-state sensors, for example optical emission detectors and impedance monitors. If a process input changes, then, generally, some of the tool sensors will register that change. If the process reactor conditions change, again the tool sensors will register a change.
The most common approach to process control and fault detection on semiconductor manufacturing tools is Statistical Process Control (SPC), whereby many if not all of the process inputs are recorded and control charts are monitored for out-of-control events. FIG. 2 shows a typical SPC chart based on sensor data from a semiconductor manufacturing tool. Control limits are based on statistically improbable deviations from the data mean. They are shown as an Upper Control Limit (UCL) and a Lower Control Limit (LCL) in FIG. 2. Typically these limits are set at 3 or 4 times the standard deviation (sigma) from the mean of the data set, using a normal distribution model. This control technique has a number of limitations.
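The control-limit calculation described above can be sketched in a few lines. This is a minimal illustration only, assuming a single sensor stream and synthetic baseline data; the function names `control_limits` and `out_of_control` and all numeric values are hypothetical and are not taken from FIG. 2.

```python
import numpy as np

def control_limits(baseline, k=3.0):
    """Return (LCL, UCL) set k standard deviations either side of the mean."""
    baseline = np.asarray(baseline, dtype=float)
    mean = baseline.mean()
    sigma = baseline.std(ddof=1)  # sample standard deviation
    return mean - k * sigma, mean + k * sigma

def out_of_control(readings, lcl, ucl):
    """Return the indices of readings that fall outside the control limits."""
    readings = np.asarray(readings, dtype=float)
    return np.flatnonzero((readings < lcl) | (readings > ucl))

# Synthetic, normally distributed baseline (an assumption for illustration).
rng = np.random.default_rng(0)
baseline = rng.normal(loc=50.0, scale=0.5, size=200)
lcl, ucl = control_limits(baseline, k=3.0)

# New readings from the same hypothetical sensor; the third is an excursion.
readings = np.array([50.1, 49.8, 53.0, 50.3])
flagged = out_of_control(readings, lcl, ucl)
```

The multiplier `k` corresponds to the 3 or 4 sigma setting mentioned above. The sketch relies on the same normal distribution model as the SPC chart, so it inherits the limitations discussed in the text.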
The first problem is that monitoring all SPC charts is not scalable, since there can be tens of sensors per tool and several hundred tools in the fab. The second problem is that individual sensor outputs can stray outside control limits with no apparent effect on the process output; conversely, process inputs can remain within control limits while the process output drifts out of control due to changes in the process conditions. This is because the processing tools are typically complex and their output depends on their combined inputs as well as the conditions of the tool itself. It is for this reason that the semiconductor fab usually uses regular process quality sampling on test wafers since this is at least predictive of yield. For example, test wafers are frequently run to check process quality such as film stress in the case of a CVD process or critical dimension (CD) in the case of an etch process. This is known to be a very expensive approach to process control, since running test wafers and halting real production to test process quality negatively impacts factory yield and productivity. The third problem relates to the difficulty of setting SPC limits on the tool sensors. The SPC approach is statistical and assumes normally distributed data. This is generally not the case. Tool and sensor drift as well as normal tool interventions such as preventive maintenance (PM) activity result in a data set which is not normally distributed.
FIG. 3 shows two data streams for output parameters 1 and 2 from a sensor in an oxide etch plasma processing tool over a period of about 1100 wafers, during which time a pressure fault was detected at wafer number 1018. The fault was caused by a defective pressure controller. Two preventive maintenance (PM) wet-cleans of the plasma chamber were carried out in the interval preceding the fault. These PM events and chamber cycling effects are clearly visible in the raw data. It will also be seen that the data is highly non-normal, with auto-correlation and discontinuities. The SPC approach therefore cannot handle this data effectively and significant events can be lost in the data. Indeed, in the example of FIG. 3, the fault which occurred at wafer 1018 is impossible to pick out of the data using the SPC approach.
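The masking effect can be illustrated with synthetic data. The sketch below does not reproduce the data of FIG. 3; the drift rate, noise level, PM positions (`pm_wafers`) and fault size are all hypothetical values chosen only to show how drift and PM discontinuities widen the control limits until a genuine step change falls inside them.

```python
import numpy as np

rng = np.random.default_rng(2)
n_wafers = 1100
wafer = np.arange(n_wafers)

# Hypothetical PM wet-clean positions; the signal resets at each one.
pm_wafers = [400, 800]
starts = np.array([0] + pm_wafers)
last_start = starts[np.searchsorted(starts, wafer, side="right") - 1]
since_pm = wafer - last_start  # wafers processed since the last clean

# Sawtooth drift between cleans plus wafer-to-wafer noise (assumed values).
drift_per_wafer = 0.01
noise_sigma = 0.2
signal = drift_per_wafer * since_pm + rng.normal(0.0, noise_sigma, n_wafers)

# Inject a fault at wafer 1018: a step of five noise standard deviations.
fault_wafer = 1018
fault_shift = 5 * noise_sigma
signal[fault_wafer] += fault_shift

# Conventional 3-sigma limits computed over the whole record.
mean, sigma = signal.mean(), signal.std(ddof=1)
lcl, ucl = mean - 3 * sigma, mean + 3 * sigma

# The fault is large relative to the noise but sits inside the limits,
# because the drift and the PM resets dominate the overall variance.
fault_inside_limits = bool(lcl < signal[fault_wafer] < ucl)
```

In this synthetic record the overall standard deviation is set mainly by the sawtooth drift rather than by the wafer-to-wafer noise, which is why the injected step goes unflagged.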
Multivariate statistical techniques have been used in an attempt to offset the first two problems mentioned above (e.g. U.S. Pat. No. 5,479,340). Multivariate techniques take into account not only the individual variance of the control parameters, but also their covariance. This addresses some of the shortfalls of SPC techniques in that the multivariate statistic can be used to compress the data and thus reduce the number of control charts resulting in a more scalable solution. For example, it is possible to replace a multitude of sensor data streams with a single statistic, such as a Hotelling T², which captures the individual sensor variance and sensor-to-sensor covariance. Using these techniques the number of control charts is greatly reduced and the single statistic is more representative of overall system health.
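By way of illustration, the Hotelling statistic for a sample x is commonly written as (x − m)ᵀ S⁻¹ (x − m), where m and S are the mean vector and covariance matrix of a baseline data set. The following is a minimal sketch under that textbook definition, not the method of the cited patent; the function names `fit_t2_model` and `hotelling_t2` and the two-sensor synthetic data are assumptions made for the example.

```python
import numpy as np

def fit_t2_model(baseline):
    """Estimate the mean vector and inverse covariance from baseline data.

    baseline has shape (n_samples, n_sensors).
    """
    baseline = np.asarray(baseline, dtype=float)
    mean = baseline.mean(axis=0)
    cov = np.cov(baseline, rowvar=False)
    # The pseudo-inverse tolerates a near-singular covariance matrix.
    return mean, np.linalg.pinv(cov)

def hotelling_t2(samples, mean, cov_inv):
    """Return one statistic per row: (x - mean)^T S^-1 (x - mean)."""
    diff = np.atleast_2d(np.asarray(samples, dtype=float)) - mean
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

# Two strongly correlated synthetic sensors (hypothetical values).
rng = np.random.default_rng(0)
cov_true = [[1.0, 0.9], [0.9, 1.0]]
baseline = rng.multivariate_normal([0.0, 0.0], cov_true, size=500)
mean, cov_inv = fit_t2_model(baseline)

# Both samples are equally far from the mean on each sensor taken alone,
# but the second breaks the usual sensor-to-sensor relationship.
samples = np.array([[1.5, 1.5], [1.5, -1.5]])
t2 = hotelling_t2(samples, mean, cov_inv)
```

The second sample yields a much larger statistic than the first even though neither sensor reading is individually unusual, which is the covariance effect that univariate SPC charts cannot capture.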
However, since the multivariate approach is statistically based, the third problem is not addressed. This is illustrated in FIG. 4, which shows a Hotelling T² statistic based on the sensor data including the streams shown in FIG. 3 (as well as streams for many more sensor output parameters). As mentioned, there is only one fault event in this data set, that occurring on wafer 1018. All other data, including drift and PM discontinuities, are normal. However, this single multivariate statistic reports a couple of statistical excursions with greater than 99% confidence because they deviate from statistically normal behaviour, but misses the real fault condition. The multivariate statistical approach has an additional shortcoming. The magnitude of the excursion is difficult to interpret, again because it is statistically based. A large deviation in the statistic may not necessarily correspond to a very significant process quality issue, whereas a small deviation may occasionally indicate a major process excursion.
A further issue arises when using the statistical approach in a multi-tool semiconductor manufacturing site. In practice, plasma processing chambers are not perfectly matched. Sensor responses on one chamber are not identical to, and may differ substantially from, sensor responses on another chamber of the same type (i.e. built to the same nominal specification), even when running the same recipe. Therefore, a statistical fault detection model cannot be transferred from one chamber to another, as small differences in sensor response would trigger a false alarm. The statistical model must instead be derived separately for each chamber. This is a further limitation in the approach.
As mentioned above, as well as statistical monitoring of manufacturing equipment, process control in the semiconductor industry uses regular process quality sampling. Indeed, since yield is directly determined by process quality, ultimately this is the most robust technique. However, measuring the process quality of every wafer at every process step, in particular taking measurements from the wafer, is prohibitive in terms of reduced factory throughput and cost of measuring equipment. U.S. Pat. No. 5,926,690 describes a method for process control on an etch tool based on measuring CD (critical dimension) and controlling the process by varying etch time based on the measurement. A single process quality output, CD, is controlled by selectively altering a single process input, photoresist etch time. If the film measurement tool is integrated with the etch tool then the CD can be measured before and after every wafer is run and adjustments made on the fly. This method of process control relies on precise measurement of the CD and determining if a change is significant or not on all wafers or a reasonable statistical sample. However, the reliance on accurate determination of, in this case, CD, or in the general case, a process quality metric, makes the technique very expensive to operate. An alternative approach in which it is not necessary to have a precise measurement of a process quality metric would be advantageous.
Another concept for process control is described in U.S. Pat. No. 6,174,450. In this case, a single process parameter, namely direct current bias, is controlled by varying RF power. The concept is that by fixing a particular process input, a particular process output will be better controlled. One problem with this approach is that the process output depends on several inputs and unless all are controlled, the process output cannot be inferred.
A separate but related problem is that of tool matching. Typically, the manufacturing plant is set up in process lines, each line devoted to a particular process step. For example, the fab will contain a lithography line, an etch line, a deposition line and so on. Wafers are processed through each line as the process of building the devices proceeds. Each individual line will consist of a similar set of tools, each with at least one plasma processing chamber. A typical fab may contain tens of similar chamber types, devoted to a set of process steps. These process steps are each assigned individual recipes and as a particular device is being processed many chambers will be employed to run a given recipe on all wafers processed in the manufacturing plant. Ideally, a recipe run on any given process chamber will produce the same output in terms of device quality as on all other similar chambers. For example, running a particular etch recipe, ideally all of these chambers etch the wafer at the same rate, with the same across-wafer uniformity, and so on. However, as discussed, differences between outwardly similar chambers can and do occur, resulting in a mis-matched output set. This mis-match ultimately impacts factory productivity and yield.
The chamber-to-chamber mis-match is presently dealt with in a couple of ways. Firstly, every attempt is made to design processes with wide operation windows so that small chamber-to-chamber differences have a negligible effect on the process output. Secondly, large differences in chamber output are tolerated by device sorting according to final specification; for example, speed binning in the case of micro-processor manufacturing. Thirdly, every attempt is made to make all chambers the same. This can involve trial-and-error parts swapping as well as extensive calibration checks and it is generally a laborious approach.
As semiconductor fabs begin to process devices with transistor gate lengths and line-widths less than 100 nm, process windows have become increasingly tight, exacerbating the impact of chamber-to-chamber output differences. Device specification sorting is expensive, as below-par devices have much lower market value. Finally, the effort to make all chambers the same by trial-and-error parts swapping and calibration checks yields diminishing returns, since in many cases great time and effort can be spent on the problem for little improvement in matching.
Measuring chamber output is a sure way of determining output differences. Indeed regular process quality checks are generally employed in fabs to do just that. These quality checks are generally ex-situ and a time delay is inevitable between processing a set of wafers and knowing if the output differences will impact yield. Ex-situ monitoring is an increasingly expensive approach and it would be much more advantageous to determine chamber-to-chamber differences prior to the ex-situ determination of output quality.
As mentioned, the sensor responses on one chamber may differ substantially from sensor responses on another chamber of the same type running the same recipe. These differences will reflect some or all of the following:
(a) “real” chamber-to-chamber differences which will be manifested in the output from these chambers,
(b) benign chamber-to-chamber differences based on chamber condition, build tolerance and chamber life-cycle, and
(c) small differences in the outputs of the sensor set on each tool due to different calibration margins.
The problem with using the raw sensor data to determine (a) above is that it is confounded by (b) and (c).
Isolating chamber-to-chamber differences in real time provides the fab operator with definitive information on process quality output from a given fab line. Having isolated a poorly matched chamber, the next step is to return that chamber to a state which matches the line set. As stated above, the approach is often trial-and-error, involving parts swap-out and calibration until the chamber outputs are matched. Real-time classification of the root cause of chamber differences would be far more advantageous.
U.S. Pat. No. 6,586,265 recognises the chamber mis-match problem and discloses a method for optimising process flow based on choosing an optimum processing path through a set of process lines. This approach makes no effort to solve chamber mis-matches and badly matched chambers would be used as little as possible.
In the March 2003 proceedings of the European Advanced Process Control Symposium, a method for isolating chamber differences during tool manufacture and test was disclosed. This method collects all sensor data associated with individual process chambers on a given tool and constructs a principal component analysis (PCA) model of the sensor data set. PCA effectively captures all process variance from a correlated multi-variable data set (the sensors) in a set of uncorrelated principal components, each a linear combination of the original set. The first principal component accounts for as much as possible of the variation in the original data, the second component accounts for as much as possible of the remaining variation and is not correlated with the first component, and so on. It is generally found, particularly when the sensor data set is correlated as on the process tools, that the majority of the variance is captured in the first few principal components. Therefore, plotting tool sensor data in PCA space allows the user to view most of the sensor variance easily and capture chamber-to-chamber differences. However, the variance as viewed in PCA space remains a confounding of real (output-impacting) chamber differences, benign chamber differences and sensor set differences. Furthermore, there is no provision for classifying the underlying root cause of the difference.
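The projection into PCA space can be sketched as follows. This is a generic illustration of principal component analysis by singular value decomposition, not a reconstruction of the symposium method; the function name `pca`, the six synthetic sensor streams and the two underlying factors are assumptions for the example, and sensor scaling is omitted for brevity.

```python
import numpy as np

def pca(data, n_components=2):
    """Project mean-centred data onto its leading principal components.

    data has shape (n_samples, n_sensors). Returns the scores, the
    component vectors and the fraction of variance each one explains.
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    components = vt[:n_components]
    scores = centered @ components.T
    return scores, components, explained[:n_components]

# Six correlated synthetic sensors driven by two hidden factors plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 6))
data = latent @ mixing + 0.05 * rng.normal(size=(300, 6))

scores, components, explained = pca(data, n_components=2)
```

Because the synthetic sensors are strongly correlated, the first two components carry nearly all of the variance, and the two score columns are uncorrelated with each other. A scatter plot of `scores` would be the PCA-space view referred to above.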
It is therefore an object of the invention to provide an improved method of fault detection in manufacturing equipment, especially but not limited to semiconductor manufacturing equipment using plasma chambers, which can be used to avoid or mitigate the problems of process control and chamber matching as discussed above.