For ease of discussion, a few terms have been defined below.
Data set—a record of measurements as a function of time for a parameter on the processing tool.
Change point—a point in a time series at which some change occurs.
Endpoint—a point in time at which a process (e.g., etching of a silicon layer) has reached or is near completion.
Endpoint domain—an interval in a data set during which an endpoint is thought to occur. An endpoint domain is usually relatively broad and is based on user estimate.
Partial Least Squares Discriminant Analysis (PLS-DA)—a technique for finding relationships between two sets of data. PLS-DA may be used when there are multiple independent variables (in an input matrix X) and possibly multiple dependent variables (in an input matrix Y). In PLS-DA, the Y variables are not continuous but consist of a set of independent discrete values or classes. PLS-DA may try to find linear combinations of the X variables that can be used to classify the input data into one of the discrete classes.
Pre-endpoint domain—a part of the data set that precedes the endpoint domain.
Post-endpoint domain—a part of the data set that comes after the endpoint domain.
Signature—a distinctive change point (or combination of change points) in the evolution of a parameter or combination of parameters which indicates an endpoint in a process. The combination of parameters and the nature of the change usually form part of a signature.
Stepwise regression—the fitting of a straight line, using a least-squares fitting algorithm, to the data values in a finite temporal interval of the data from an individual sensor channel.
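The stepwise regression defined above can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the function names `fit_line` and `stepwise_regression`, the choice of consecutive non-overlapping intervals, and the sample data are all assumptions made for the example.

```python
def fit_line(times, values):
    """Least-squares straight line through the points; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def stepwise_regression(times, values, window):
    """Fit one line per consecutive interval of `window` samples (assumed non-overlapping)."""
    fits = []
    for start in range(0, len(times) - window + 1, window):
        t = times[start:start + window]
        v = values[start:start + window]
        slope, intercept = fit_line(t, v)
        fits.append((t[0], t[-1], slope, intercept))
    return fits


# Hypothetical single-channel data: flat for 10 s, then falling by 0.5 per second.
times = list(range(20))
values = [5.0] * 10 + [5.0 - 0.5 * (t - 9) for t in range(10, 20)]
fits = stepwise_regression(times, values, window=10)
for t_start, t_end, slope, intercept in fits:
    print(f"{t_start:>2}-{t_end:<2} s: slope={slope:+.2f}, intercept={intercept:.2f}")
```

A change in the fitted slope from one interval to the next is one way a change point might show up in such a fit.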
Advances in plasma processing have provided for growth in the semiconductor industry. To gain a competitive advantage, semiconductor device manufacturers need to maintain tight control of the processing environment in order to minimize waste and produce high quality semiconductor devices.
One method for maintaining tight control is by identifying a process endpoint. As discussed herein, the term endpoint refers to a point in time at which a process (e.g., etching a silicon layer) has reached or is near completion. The process of identifying an endpoint may be as simple as identifying a signal with the largest change. However, a signal change may not always coincide with an endpoint. Other factors, such as noise in the channel, may cause the signal pattern to change.
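The weakness of the largest-change approach can be illustrated with a short sketch. The channel names and values below are invented for illustration, and `largest_step` is a hypothetical helper rather than part of any tool described here.

```python
def largest_step(values):
    """Return (sample index, magnitude) of the largest sample-to-sample change."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    idx = max(range(len(diffs)), key=diffs.__getitem__)
    return idx + 1, diffs[idx]


# Hypothetical channels: one with a genuine endpoint drop, one with a noise spike.
channels = {
    "ch_endpoint": [10.0] * 6 + [8.0] * 6,        # real 2.0 drop at sample 6
    "ch_noisy": [10.0, 10.0, 13.0] + [10.0] * 9,  # 3.0 spike at sample 2
}

# Naive rule: pick the channel whose largest change is the biggest.
picked = max(channels, key=lambda name: largest_step(channels[name])[1])
print(picked, largest_step(channels[picked]))
```

In this made-up data the naive rule selects the noisy channel, because its spike is larger than the genuine drop.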
To facilitate discussion, FIG. 1 shows a simple method for establishing an endpoint algorithm. The method as described in FIG. 1 is usually performed manually by an expert user, for example.
Consider the situation wherein, for example, a test substrate is being processed. Since there are different types of substrates, the test substrate tends to be of the same type as the substrate that may be utilized in a production environment. For example, if a specific patterned substrate is utilized during production, a similar patterned substrate may be employed as a test substrate.
At a first step 102, data is acquired for a substrate. In an example, sensors (such as a pressure manometer, an optical emission spectrometer (OES), a temperature sensor, and the like) acquire data while the substrate is processed. Data for hundreds, if not thousands, of sensor channels may be collected.
After the substrate has been processed, the data that has been collected may be analyzed. Since a plethora of data may be available, finding an endpoint within thousands of signal streams may be a challenging task that usually requires in-depth knowledge of the tool and recipe. For this reason, an expert user is usually charged with the task of performing the analysis.
At a next step 104, an expert user may examine one or more signals for changes in the signal patterns. The expert user may employ one or more software programs to assist with the analysis. In an example, the software program may be a simple analysis tool that may perform simple calculations and analysis. In another example, the software program may be a simple data visualization program that may be employed to graphically illustrate a signal history, for example.
However, even with the expert user's expertise and experience, the volume of data acquired by the sensors and available for analysis may be overwhelming. Accordingly, identifying an endpoint signature can be a daunting task. In an example, there may be over 2,000 wavelength measurements within an OES sensor channel. Since endpoint data may also be found in other sensor channels (such as sensor channels providing data about temperature, pressure, voltage, and the like), an expert user may be facing an insurmountable task if every signal and combination of signals has to be analyzed.
As can be expected, depending upon applications, some signals may provide better endpoint data than other signals. For example, both signals A and B have endpoint data. However, signal B may provide a better endpoint signature since signal B may have less noise than signal A. Given that there may be dozens or hundreds of signals, the task of analyzing the data set for an endpoint signature, much less an optimal endpoint signature, may become a very tedious and time-consuming process.
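One way to make the comparison between signals A and B concrete is to score each signal by the size of its change relative to its noise. The sketch below is an assumption-laden illustration: `signature_score` is a hypothetical metric, the location of the change (`split`) is taken as already known, and the sample values are invented.

```python
from statistics import mean, stdev


def signature_score(values, split):
    """Step size between the two segments divided by their average noise level."""
    pre, post = values[:split], values[split:]
    noise = (stdev(pre) + stdev(post)) / 2
    return abs(mean(post) - mean(pre)) / noise


# Hypothetical signals with the same 2.0 step at sample 4 but different noise.
signal_a = [10.4, 9.6, 10.4, 9.6, 8.4, 7.6, 8.4, 7.6]
signal_b = [10.1, 9.9, 10.1, 9.9, 8.1, 7.9, 8.1, 7.9]

scores = {"A": signature_score(signal_a, 4), "B": signature_score(signal_b, 4)}
best = max(scores, key=scores.get)
print(best, scores)
```

Both signals carry the same step, but signal B scores higher because its noise is smaller.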
In analyzing the data, the expert user may be looking for a signal change (e.g., change in a signal pattern) as an indication of an endpoint. For example, if a signal is sloping downward, a peak in the signal slope may represent a change. Although manually identifying a signal change has always been tedious, in recent years the task has become even more difficult as signal changes have become less obvious. This is especially true for recipes that are employed to process small open areas on a substrate. In an example, the open area that is being processed (e.g., etched) may be so small (e.g., <1% of the substrate area) that the signal change is almost unnoticeable to the human eye.
To facilitate analysis, the expert user may eliminate data values that he believes are not relevant in identifying an endpoint. One method for reducing the data set includes identifying and eliminating regions in the signal stream within which the expert user does not expect the endpoint to occur. In other words, the expert user may limit his search for an endpoint to a target area in the signal stream, usually between a pre-endpoint domain and a post-endpoint domain. Because of the high cost (in expert time) of finding and refining endpoint signatures, the aim is to make the pre-endpoint and post-endpoint domains as large as possible, thereby limiting the region left in which to look for the endpoint.
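The effect of limiting the search to an endpoint domain can be sketched as follows. The helper names, the 30 s to 50 s domain, and the signal itself (with an invented start-up transient) are assumptions chosen only to illustrate the idea.

```python
def within_domain(samples, start, end):
    """Keep (time, value) samples whose time lies in the endpoint domain [start, end]."""
    return [(t, v) for t, v in samples if start <= t <= end]


def largest_change_time(samples):
    """Time of the sample that follows the largest sample-to-sample change."""
    pairs = zip(samples, samples[1:])
    before, after = max(pairs, key=lambda p: abs(p[1][1] - p[0][1]))
    return after[0]


# Hypothetical signal: large start-up transient at 3 s, small endpoint drop at 40 s.
samples = [(t, 20.0 if t < 3 else 10.0 if t < 40 else 9.0) for t in range(60)]

whole = largest_change_time(samples)                          # whole data set
limited = largest_change_time(within_domain(samples, 30, 50))  # endpoint domain only
print(whole, limited)
```

Searching the whole data set picks up the start-up transient, whereas searching only the user-estimated domain finds the smaller change at 40 s.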
Since the expert user is usually familiar with the process, the expert user may further reduce the data set by only analyzing select signals. The select signals may include signals or combinations of signals that, based on the expert user's experience, may contain endpoint data. When a combination of signals is analyzed as a group, the signals usually come from a single sensor source. Generally, data from different sensor sources are not combined since variations between the sensors may make the correlation analysis difficult, if not impossible, to perform manually.
As can be expected, working only with a filtered data set may increase the risk of the optimal endpoint signature being inadvertently eliminated. In other words, by filtering out the data, the expert user may be making an assumption that an endpoint signature, much less the optimal endpoint signature, is located in one of the signals that remains after filtering. For this reason, the endpoint signature that may be identified in the remaining signals may not necessarily be the optimal endpoint signature.
After a signal change has been identified, the expert user may perform a verification analysis to determine the robustness of the signal change as an endpoint candidate. For example, the expert user may analyze the history of the signal to determine the uniqueness of the signal change. If the signal change is not unique (i.e., occurring more than once in the history of the signal), the signal may be eliminated from the data set. The expert user may then resume his tedious task of identifying the “elusive” endpoint in another signal.
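The uniqueness check described above might be approximated by counting how often a change of comparable size occurs in the signal history. In this sketch the threshold value, the function names, and the two signal histories are hypothetical.

```python
def count_changes(values, threshold):
    """Count sample-to-sample changes whose magnitude exceeds `threshold`."""
    return sum(1 for a, b in zip(values, values[1:]) if abs(b - a) > threshold)


def is_unique_change(values, threshold):
    """True when exactly one change above `threshold` occurs in the history."""
    return count_changes(values, threshold) == 1


# Hypothetical histories: one clean drop versus a signal that toggles repeatedly.
single_drop = [5.0] * 10 + [3.0] * 10
toggling = [5.0, 5.0, 3.0, 3.0, 5.0, 5.0, 3.0, 3.0]

print(is_unique_change(single_drop, threshold=1.0))
print(is_unique_change(toggling, threshold=1.0))
```

A signal such as `toggling`, in which the same change recurs, would be dropped as an endpoint candidate under this rule.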
At a next step 106, a set of filters (such as a set of digital filters) may be applied to the data set to remove noise and smooth out the data. Examples of filters that may be applied include, but are not limited to, time series filters and frequency-based filters. Although applying filters to a data set may decrease noise in the data set, filters are usually applied sparingly since filters may also increase the real-time delay within a signal.
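As one example of a time series filter, the sketch below applies simple exponential smoothing to a noisy signal. The choice of filter, the smoothing factor `alpha`, and the alternating test signal are assumptions for illustration and are not taken from the method of FIG. 1.

```python
from statistics import pstdev


def exponential_smooth(values, alpha):
    """Blend each new sample with the previous output; smaller alpha smooths more."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical noisy signal: alternates +/-1.0 around a level of 10.0.
noisy = [10.0 + (1.0 if i % 2 else -1.0) for i in range(40)]
smooth = exponential_smooth(noisy, alpha=0.2)

print(f"spread before: {pstdev(noisy):.2f}, after: {pstdev(smooth):.2f}")
```

A smaller `alpha` suppresses more noise, but it also makes the output respond more slowly to a genuine change, which is the real-time delay noted above.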
In some situations, a multi-variate analysis (such as Principal Component Analysis or Partial Least Squares) may be performed in analyzing the data. The multi-variate analysis may be performed to further reduce the data set. In order to utilize the multi-variate analysis, the expert user may be required to define the shape (e.g., curve) of an endpoint feature. In other words, the expert user is expected to anticipate the shape of the endpoint even though an endpoint candidate may have yet to be identified. By predefining the shape of the endpoint, the multi-variate analysis essentially eliminates signals that do not exhibit the desired shape. In an example, if the shape of the endpoint is defined to be a peak, signals that do not exhibit this shape may be eliminated. Accordingly, if the optimal endpoint signature does not have the “expected” shape, the optimal endpoint signature may be missed.
As can be appreciated from the foregoing, identifying a single endpoint signature from a plethora of data can be daunting and may take hours, if not weeks, to perform. Further, once an endpoint signature is identified, little or no quantitative analysis of the suitability of the signal or combination of signals as an endpoint signature may be performed. In an example, to validate a signal change as an endpoint signature, the expert user may analyze other signals to look for a similar signal change at around the same time frame. However, given that the expert user may have already spent a considerable amount of time identifying the first endpoint signature, the expert user may not always have the time, resources, and/or inclination to validate the result.
At a next step 108, the expert user may choose an endpoint algorithm type based on the nature of the transition. Usually, the endpoint algorithm type may be based on the shape of the spectral line(s), for example, that may represent the endpoint. In an example, the endpoint may be represented by a slope change. Accordingly, the expert user may propose a slope dependent algorithm.
In addition, the endpoint algorithm may be based on the derivative that may provide the best endpoint signature. However, the first derivative (i.e., the slope) of the endpoint signature may not provide the best endpoint algorithm. Instead, the second derivative (i.e., the rate of change of the slope, which crosses zero at an inflection point), for example, may provide a better endpoint algorithm. The ability to identify not only an endpoint signature but also the best endpoint algorithm associated with the endpoint signature may require expertise that few users (even expert users) may possess.
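The difference between slope-based and second-derivative-based detection can be illustrated with finite differences. The sketch assumes a unit time step and an invented signal that is flat until 20 s and then falls linearly; `differences` is a hypothetical helper.

```python
def differences(values):
    """Finite differences between consecutive samples (unit time step assumed)."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical signal: flat until 20 s, then falling by 0.5 per second.
signal = [10.0 if t <= 20 else 10.0 - 0.5 * (t - 20) for t in range(40)]

first = differences(signal)   # slope: first[i] spans samples i and i+1
second = differences(first)   # change in slope: second[i] is centered on sample i+1

# The second difference is nonzero only where the slope changes.
knee_index = max(range(len(second)), key=lambda i: abs(second[i]))
knee_time = knee_index + 1
print(knee_time)
```

In this data the first difference settles at a new constant level after the knee, while the second difference is nonzero only at the knee itself, which is why the two quantities lead to different algorithm types.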
At a next step 110, the algorithm settings may be optimized and/or tested. Once the endpoint algorithm has been identified, the endpoint algorithm may be converted into a production endpoint algorithm. Since differences may exist between the test environment and the production environment, the settings of the endpoint algorithm may have to be adjusted before the endpoint algorithm may be moved into production. Settings that may be adjusted include, but are not limited to, smoothing filters, delay time, specific settings for the algorithm types, and the like.
In an example, filters that may be employed to smooth the data in a test environment may cause unacceptable real-time delay within a production environment. As discussed herein, real-time delay refers to the time difference between a non-filtered signal change and a filtered signal change. For example, a peak in a signal may have occurred at 40 seconds into the process. However, after a filter is applied, the peak may not appear in the filtered signal until 5 seconds later (i.e., at 45 seconds). If an endpoint algorithm is applied with these filter settings, the substrate may be over-etched before the endpoint algorithm identifies the endpoint. To minimize the real-time delay, the filters may have to be adjusted.
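The real-time delay in this example can be reproduced with a causal moving average, which can only use the current and earlier samples. The 1-second sample spacing, the 11-sample window, and the triangular peak are assumptions chosen so that the numbers match the 40-second example.

```python
def causal_moving_average(values, window):
    """Average of the current sample and up to window-1 earlier samples."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def peak_time(values):
    """Index (in seconds, at 1 sample per second) of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical signal: triangular peak centered at 40 s, sampled once per second.
raw = [max(0.0, 10.0 - abs(t - 40)) for t in range(80)]
filtered = causal_moving_average(raw, window=11)

delay = peak_time(filtered) - peak_time(raw)
print(peak_time(raw), peak_time(filtered), delay)
```

With these settings the filtered peak appears at 45 s, a delay of 5 s; a shorter window would reduce the delay at the cost of less smoothing.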
Before moving the endpoint algorithm into production, a test may be performed to determine if the settings have been optimized. In an example, the endpoint algorithm may be applied to the data set that has been utilized to create the endpoint algorithm. If the endpoint algorithm correctly identifies the endpoint using the adjusted settings, the settings may be considered optimized. However, if the endpoint algorithm fails to correctly identify the endpoint, the settings may have to be adjusted. This test may have to be performed multiple times (through a trial and error method) before the settings are optimized.
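The trial and error test described above can be sketched as a loop over candidate settings. Everything in this example is hypothetical: the slope-style `detect_endpoint` algorithm, its `window` and `threshold` settings, the tolerance, and the data set with an early glitch and a known endpoint at 40 s.

```python
def detect_endpoint(values, window, threshold):
    """Hypothetical algorithm: first time the drop over `window` samples exceeds `threshold`."""
    for t in range(window, len(values)):
        if values[t - window] - values[t] > threshold:
            return t
    return None


def tune(values, known_endpoint, tolerance, candidates):
    """Try each (window, threshold) pair; return the first that finds the known endpoint."""
    for window, threshold in candidates:
        found = detect_endpoint(values, window, threshold)
        if found is not None and abs(found - known_endpoint) <= tolerance:
            return window, threshold, found
    return None


# Hypothetical data set: small glitch at 10 s, true endpoint ramp starting at 40 s.
values = [10.0] * 40 + [10.0 - 0.3 * (t - 39) for t in range(40, 60)]
values[10] = 9.6

result = tune(values, known_endpoint=40, tolerance=3,
              candidates=[(1, 0.2), (3, 0.5), (5, 1.0)])
print(result)
```

The first candidate triggers on the glitch at 10 s and is rejected; the second detects the ramp at 41 s, within the assumed tolerance, and is accepted.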
At a next step 112, a determination is made as to whether to perform a robustness test on the endpoint algorithm. If a robustness test is performed (step 114), the endpoint algorithm may be applied to data sets associated with other substrates. In an example, a second test substrate may be processed and data may be collected. The endpoint algorithm may then be applied to the second data set. If the endpoint algorithm is able to identify the endpoint, the endpoint algorithm may be considered robust and the endpoint algorithm may be migrated into production (step 116). However, if the endpoint algorithm fails to identify the endpoint, the endpoint algorithm may be considered not sufficiently robust and the expert user may return to step 104 to resume the task of identifying another endpoint candidate and constructing another endpoint algorithm.
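A robustness test of this kind can be sketched as applying one endpoint algorithm to data sets from several substrates and recording any failures. The threshold-crossing detector, the substrate names, the expected endpoints, and the tolerance below are all invented for illustration.

```python
def threshold_detector(values, level=8.0):
    """Hypothetical endpoint algorithm: first sample where the signal falls below `level`."""
    for t, v in enumerate(values):
        if v < level:
            return t
    return None


def robustness_test(detect, runs, tolerance):
    """Apply `detect` to each (name, values, expected) run and list the failures."""
    failures = []
    for name, values, expected in runs:
        found = detect(values)
        if found is None or abs(found - expected) > tolerance:
            failures.append((name, found))
    return failures


# Hypothetical data sets from three test substrates.
runs = [
    ("substrate_1", [10.0] * 40 + [7.0] * 20, 40),
    ("substrate_2", [10.0] * 42 + [7.0] * 18, 42),
    ("substrate_3", [9.0] * 40 + [8.5] * 20, 40),  # weaker signal, never crosses 8.0
]

failures = robustness_test(threshold_detector, runs, tolerance=2)
print("robust" if not failures else f"not robust: {failures}")
```

Here the algorithm works on the first two substrates but misses the endpoint on the third, so it would not be considered sufficiently robust.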
Given that the robustness test may require time for execution and analysis, many endpoint algorithms may be migrated into the production environment without undergoing the robustness test. In other words, step 112 is usually considered an optional step in the creation of an endpoint algorithm.
As can be appreciated from FIG. 1, the method for creating an endpoint algorithm is mostly a manual process that is usually performed by experts who may have the expertise and experience to perform the complex analysis. Given the constraints on resources, the endpoint algorithm that may be moved into production may lack quantitative support. Further, since a single human cannot possibly analyze all signals and/or combinations of signals within a reasonable period of time, the endpoint algorithm that may be created may not always be the optimal endpoint algorithm for the process.
Accordingly, a simplified method for constructing a robust endpoint algorithm is desirable.