1. Field of the Invention
The present invention relates to a method for controlling a processing apparatus, and more particularly to a method for controlling a processing apparatus in conformity with a new Adaptive Least Mean Square Neural Network (ALMS-NN) algorithm for evaluating a correction value of a stepper during a photolithographic process in an overall process of manufacturing semiconductor devices.
2. Description of the Related Art
As one way of strengthening competitiveness in the semiconductor industry, many studies have sought to construct an effective manufacturing system capable of ensuring a high production yield. The photolithographic process, one of the important semiconductor manufacturing processes, is subject to frequently varying processing conditions, which necessitates a systematic production system able to deal with that variation. Considerable effort is therefore being directed at establishing a system that reduces the frequency of sampling processes and thereby enhances the production yield.
The problem to be considered first in establishing such a system for the photolithographic process is misalignment. Misalignment arises chiefly from the difficulty of analyzing the physical and chemical characteristics of the process, from noise interference while the process is performed, and from measurement errors after the process is performed. These factors increase the frequency of sampling, which directly affects the production yield.
A Process Control System (PCS), widely developed and employed as a production system capable of ensuring a high production yield for the semiconductor manufacturing system, mostly involves no mathematical model with respect to the process steps, but controls the process in view of numerical values statistically obtained by processing data of the previously performed processing.
The scheme of reflecting the experience values of prior processing in a current process, a kind of 'experience inheritance,' is an algorithm that feeds back a weighted mean of the recent historical data of identical processes. However, such an algorithm is static and does not consider the temporally changing characteristics of a system. Its drawback is that the sampling process has to be performed repeatedly when the identical-history data within a fixed period are insufficient or when spec-out results occur successively.
For this reason, processing controller design techniques using a neural network model have been currently suggested as a scheme for properly dealing with a process of a non-linear system without having a specific mathematical model ("Monitoring and Control of Semiconductor Manufacturing Processes," IEEE Control System, 1998 by S. Limanond, J. Si, and K. Tsakalis; and "Artificial Neural Network Model-Based Run-to-Run Process Controller," IEEE Trans. on Component, Packaging, and Manufacturing Technology-Part C, vol. 10, no. 1, January 1996 by X. A. Wang and R. L. Mahajan).
First, a method has been suggested in which the neural network learns from the data of previously conducted processes and the manufacturing process is predicted via a pattern search over the previous processes. The basic premise of this method is that the variation pattern of the non-linear system is not completely random. Unless the variation pattern of the system with respect to a certain stepper is completely random, past history data with a pattern similar to the recent variation pattern can be expected to exist, so the past variation pattern of the system is used to estimate a current output value.
However, this method has two disadvantages. It requires a large amount of past history data to be applied effectively, which makes the data difficult to use, and the continuously repeated pattern search and neural network learning increase the amount of computation required.
Second, a widely utilized Exponential Weighted Moving Average (EWMA) system provides modeling and approximating methods of a system having a data variation capable of being described in a time series system, which are widely available in the processing control field of semiconductor manufacturing. This has been disclosed in literature such as "Run by Run Process Control: Combining SPC and Feedback Control," IEEE Trans. on Semiconductor Manufacturing, vol. 8, no. 1, February 1995 by E. Sachs, A. Hu, and A. Ingolfsson; "Adaptive Optimization of Run-to-Run Controllers: The EWMA Example," IEEE Trans. on Semiconductor Manufacturing, vol. 13, no. 1, February 2000 by N. S. Patel and S. T. Jenkins; and "A Self-Tuning EWMA Controller Utilizing Artificial Neural Network Function Approximation Techniques," IEEE Trans. on Components, Packaging, and Manufacturing Technology-Part C, vol. 20 no. 2, April, 1997 by T. H. Smith, D. S. Boning.
The EWMA system is frequently applied in actual semiconductor production because its model is simple and can be expressed as a simple recursive formula, written as:
$$\bar{x}(i) = \lambda\,x(i-6) + (1-\lambda)\,\bar{x}(i-1)$$
However, if the λ value used is small, non-negligible weight is placed on old data, so the EWMA system has the drawback of requiring a large amount of past data for accurate estimation.
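The dependence on past data can be illustrated with a short sketch. It assumes the textbook EWMA recursion, in which the newest sample receives weight λ and a sample k steps back receives weight λ(1−λ)^k; the function names and the 95% coverage threshold are illustrative choices, not taken from the cited literature.

```python
def ewma(samples, lam, initial):
    """Textbook EWMA recursion: estimate <- lam * x + (1 - lam) * estimate."""
    estimate = initial
    for x in samples:
        estimate = lam * x + (1.0 - lam) * estimate
    return estimate


def ewma_weights(lam, count):
    """Weight the recursion places on the sample k steps back, k = 0..count-1."""
    return [lam * (1.0 - lam) ** k for k in range(count)]


def samples_needed(lam, coverage=0.95):
    """Smallest number of most-recent samples whose weights sum to `coverage`."""
    total, k = 0.0, 0
    while total < coverage:
        total += lam * (1.0 - lam) ** k
        k += 1
    return k


if __name__ == "__main__":
    for lam in (0.5, 0.1):
        print(lam, samples_needed(lam))
```

With λ = 0.5 the five most recent samples already carry 95% of the total weight, whereas a smaller λ spreads the same weight over many more samples, which is the drawback noted above.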
Third, system prediction by means of Kalman filtering is a classical prediction technique for a system whose motion characteristics are fundamentally modeled by a differential equation or difference equation in state-space form and are disturbed by white noise.
Based on the observation that the correction value of the system would not change in the absence of noise and is in general changed only by the noise, the variation of the correction value is assumed to follow a linear model:
x(k+1)=x(k)+w(k) 
Here, w(k) denotes a white noise term that causes the correction value to vary.
The performance of this technique is determined by how well the assumed system model reproduces the motion characteristics and noise characteristics of the original system. However, it is not easy to provide an assumed model close to the original system when the system has highly non-linear characteristics, as the semiconductor process does.
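For illustration, a scalar Kalman filter for the random-walk model x(k+1)=x(k)+w(k) can be sketched as follows. The measurement model z(k)=x(k)+v(k), the noise variances q and r, and the initial values are assumptions introduced here, since the text specifies only the state equation.

```python
def kalman_random_walk(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x(k+1) = x(k) + w(k).

    Assumed (not from the text): measurement z(k) = x(k) + v(k),
    q = variance of w, r = variance of v, x0/p0 = initial state and variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                # predict: state unchanged, uncertainty grows
        gain = p / (p + r)       # Kalman gain
        x = x + gain * (z - x)   # correct the estimate with the new measurement
        p = (1.0 - gain) * p     # updated estimate variance
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    print(kalman_random_walk([5.0] * 10, q=0.01, r=1.0)[-1])
```

The quality of the estimate depends on how well q, r, and the linear state equation match the real process, which is the limitation described above.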
In order to solve the above-enumerated problems of the conventional techniques, an object of the present invention is to provide a method for controlling a processing apparatus using a new ALMS-NN algorithm that can be applied effectively, without depending on a large amount of past history data, to a process that depends heavily on sampling because the work pieces being processed are replaced frequently.
Another object of the present invention is to provide a method for controlling a photolithography apparatus using a new ALMS-NN algorithm that, taking a stepper apparatus in a photolithographic process as its target, decides an apparatus input value capable of effectively correcting an overlay alignment error.
Accordingly, there is provided a method for controlling a processing apparatus in which an error value is obtained between an input value of the processing apparatus for processing a work piece and a measurement value obtained by measuring the work piece processed in the processing apparatus, a correction value for correcting the input value of the processing apparatus is computed so as to decrease the error value, and the values are managed as processing data to be utilized in computing a next correction value. Previous processing data having a history identical to that of the work piece loaded into the processing apparatus are searched, and a current bias correction value is predicted from the latest plurality of previous correction values among the searched previous processing data having the identical history. Also, a current random correction value (RAND) is predicted by means of a neural network on the basis of the latest plurality of previous RAND correction values among the previous processing data, and the predicted bias correction value is summed with the predicted RAND correction value to give a current correction value of the processing apparatus. The error value is then used to make the neural network learn to track the variation of the RAND correction value.
In more detail, to predict the correction value x(n) effectively, the present invention divides x(n) into a bias component x_bias(n), which is correlated with the history, and a random component x_rand(n), whose cause of variation cannot be definitely identified. Here, x_bias(n) is predicted on the basis of the history of the corresponding lot. The x_rand(n) component is what impedes a definite prediction of x(n). Because its random property makes it nearly impossible to predict accurately, its variation is instead tracked by the feedback propagation learning of the neural network so as to minimize the prediction error of x(n).
Because the x_rand(n) component has had its correlation with the history of the corresponding lot eliminated, all data can be utilized for it regardless of the history of the corresponding lot.
Therefore, for x(n) computed by the algorithm suggested in the present invention, the restriction of data expiration that has been a problem in the traditional correcting system can be removed, provided that the temporal variation of x_bias(n) is not large. Also, even when an external factor other than the history of the corresponding lot changes and affects x(n), the change can be handled effectively by the learning capability inherent to the neural network.
Accordingly, the dependence upon the previous history data is decreased even in a manufacturing line involving a number of device changes, with the consequence of remarkably reducing the number of sampling processes.
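Written out, the decomposition described above reads as follows. The first line restates the text; the second line is an assumption about how past RAND values would be recovered from logged correction values, which the text does not spell out.

$$x(n) = x_{\mathrm{bias}}(n) + x_{\mathrm{rand}}(n)$$

$$x_{\mathrm{rand}}(i) = x(i) - x_{\mathrm{bias}}(i), \qquad i < n \quad \text{(assumed)}$$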
The step of predicting the current bias correction value is performed by a section linear weighted mean algorithm defined by the equation as follows:

$$x_{\mathrm{bias}} = \frac{1}{W} \sum_{i=n-W+1}^{n} \left[ \frac{W+n-1}{\sum_{j=1}^{W} i}\; x_{sh}(i) \right]$$

where x_bias denotes the bias correction value, W denotes the section, and x_sh denotes a previous bias correction value having the identical history.
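As a rough illustration of a linearly weighted mean over a section of the most recent W identical-history values, the sketch below gives the oldest value in the section weight 1 and the newest weight W, normalized so the weights sum to one. This weighting is an assumption chosen for illustration and is not necessarily the exact weighting of the equation above; the function and parameter names are hypothetical.

```python
def linear_weighted_mean(same_history_values, window):
    """Linearly weighted mean of the last `window` values (oldest first).

    Assumed weighting: 1 for the oldest value in the section, increasing by 1
    per step up to the newest value, normalized by the sum of the weights.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = same_history_values[-window:]
    if not recent:
        raise ValueError("no identical-history data available")
    weights = range(1, len(recent) + 1)
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


if __name__ == "__main__":
    # Only the last three values fall inside the section.
    print(linear_weighted_mean([10.0, 1.0, 2.0, 3.0], window=3))
```

Weighting recent values more heavily lets the bias estimate follow slow drift while still averaging over several lots.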
The current RAND correction value is obtained by tracking in the direction that decreases the error of the RAND correction value, using an error feedback propagation (back-propagation) learning method with a multilayer perceptron.
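A minimal sketch of such tracking is given below, assuming a perceptron with one tanh hidden layer and a linear output that predicts the next RAND value from the previous few and takes one gradient step on the squared prediction error after each lot. The layer sizes, learning rate, weight initialization, and names are all assumptions, not values from the text.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer perceptron: tanh hidden units, linear output."""

    def __init__(self, n_inputs, n_hidden, learning_rate=0.05, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0
        self.lr = learning_rate

    def _hidden(self, inputs):
        return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, inputs):
        hidden = self._hidden(inputs)
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

    def learn(self, inputs, target):
        """One gradient step on 0.5 * (prediction - target)**2; returns the error."""
        hidden = self._hidden(inputs)
        error = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2 - target
        for j, h in enumerate(hidden):
            # Error propagated back through output weight j and the tanh unit.
            grad_hidden = error * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= self.lr * error * h
            self.b1[j] -= self.lr * grad_hidden
            for i, x in enumerate(inputs):
                self.w1[j][i] -= self.lr * grad_hidden * x
        self.b2 -= self.lr * error
        return error


def track_rand(rand_history, n_inputs=3, n_hidden=4):
    """Learn each RAND value from the n_inputs values that precede it."""
    net = TinyMLP(n_inputs, n_hidden)
    abs_errors = []
    for n in range(n_inputs, len(rand_history)):
        window = rand_history[n - n_inputs:n]
        abs_errors.append(abs(net.learn(window, rand_history[n])))
    return net, abs_errors
```

Because the network is updated after every lot rather than retrained from scratch, it follows changes in the RAND component without a repeated search over the full history.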
To achieve another object of the present invention, there is provided an apparatus for controlling a photolithography apparatus in which an error value is obtained between an input value of the photolithography apparatus for processing a photoresist over a wafer and a measurement value obtained by measuring, with an overlay measurement instrument, a photoresist pattern that has been exposed and developed in the processing apparatus, a correction value for correcting the input value in the direction that decreases the error value is computed, and photolithographic processing data are then managed in production time units so that the values can be utilized in computing a next correction value. Previous processing data having a history identical to that of a new lot loaded into the photolithography apparatus are searched, and a bias component of a current correction value is predicted from the latest plurality of previous correction values among the searched previous processing data having the identical history. A RAND component of the current correction value is predicted by means of a neural network on the basis of the latest plurality of previous RAND correction values among the previous processing data, and the predicted bias component is summed with the predicted RAND component to give a current correction value of the photolithography apparatus. The error value is used to make the neural network learn to track the variation of the RAND component.
Here, in the searching step, data having the identical reticle, PPID, base I, and base II, which are the elements constituting the history, are detected as identical-history processing data.
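The identical-history search can be sketched as an exact match on the four history elements. The record layout, the field names, and the assumption that records are stored oldest first are illustrative only.

```python
from collections import namedtuple

# Hypothetical record layout; the field names are illustrative.
History = namedtuple("History", ["reticle", "ppid", "base1", "base2"])
Record = namedtuple("Record", ["history", "correction"])


def find_identical_history(records, lot_history, latest=5):
    """Return the `latest` most recent records whose history matches exactly.

    `records` is assumed to be ordered oldest first.
    """
    if latest <= 0:
        raise ValueError("latest must be positive")
    matches = [r for r in records if r.history == lot_history]
    return matches[-latest:]


if __name__ == "__main__":
    log = [
        Record(History("R1", "P1", "A", "B"), 0.1),
        Record(History("R1", "P2", "A", "B"), 0.9),
        Record(History("R1", "P1", "A", "B"), 0.2),
    ]
    print(find_identical_history(log, History("R1", "P1", "A", "B")))
```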
In addition, if no processing data of identical history exist in the searching step, the bias portion of the correction value is estimated from the processing data having the identical reticle element, in accordance with the priority of the remaining elements.
In this estimating method, processing data in which a single one of the history constituting elements differs are extracted, and the bias component of the correction value is estimated by using a relative value of that one differing constituting element among the extracted processing data. If the bias component cannot be computed by means of the relative value, it is estimated as the mean value of the extracted processing data with the single differing constituting element.
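The fallback path can be sketched as below. Only the final mean-value step is shown, because the text does not define how the relative value is computed; the priority order of the remaining elements, the record layout, and all names are assumptions made for illustration.

```python
from statistics import mean

FIELDS = ("reticle", "ppid", "base1", "base2")
# Assumed priority order of the non-reticle elements (not given in the text).
PRIORITY = ("ppid", "base1", "base2")


def differing_fields(history_a, history_b):
    """Names of the history elements on which two histories disagree."""
    return [f for f in FIELDS if history_a[f] != history_b[f]]


def fallback_bias(records, lot_history):
    """Mean correction of records that differ from the lot in a single element.

    Elements are tried in PRIORITY order; the reticle is never allowed to
    differ, so every candidate shares the lot's reticle. Returns None when
    no such record exists.
    """
    for field in PRIORITY:
        candidates = [r["correction"] for r in records
                      if differing_fields(r["history"], lot_history) == [field]]
        if candidates:
            return mean(candidates)
    return None
```

A relative-value estimate, where one can be computed, would be tried before this mean in the scheme described above.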