There is a great need in industries such as the semiconductor industry for sensitive metrology equipment that can provide high resolution and non-contact evaluation capabilities, particularly as the geometries of devices in these industries continue to shrink. Manufacturers have increasingly turned to optical metrology techniques, such as ellipsometry and reflectometry, which typically operate by illuminating a sample 106 with a mono- or polychromatic probe beam 116 of electromagnetic radiation, then detecting and analyzing the reflected and/or transmitted energy such as is shown in FIG. 1. Such methods are considered essential for the efficient operation of modern fabrication facilities, particularly for semiconductor fabrication involving the formation of a stack of thin film layers on a semiconductor substrate. These non-destructive optical methods are desirable because the resultant optical data can be used to derive information regarding layer parameters, such as thickness, optical constants, and scattering, for multiple layers of a thin film stack.
The light source 102 can be any appropriate light or radiation source capable of producing a probe beam 116 of polarized or unpolarized radiation, which can include one or more wavelengths of radiation in any of the appropriate radiation bands as known in the art. A focusing element 104 can focus the probe beam to a spot on the sample 106, such that the probe beam will be reflected from, or transmitted through, the surface of the sample. A number of collection optics, which can include elements such as lenses 108 and an aperture 118, can focus the reflected or transmitted portion of the beam onto a detector 110, which can measure at least one parameter of the light, such as the intensity of the reflected or transmitted beam. Ellipsometry techniques typically measure changes in the polarization state of the probe beam after interacting with the sample, while reflectometry techniques measure changes in the magnitude of the reflected probe beam. The detector 110 can generate an output signal, in response to the measurement, which is sent to a processor to determine at least one characteristic about the sample 106. Information such as a model of the sample can be stored in a database 114 and retrieved by the processor in order to make the determination. The diffraction, or optical scattering, of the probe beam due to the structural geometry of the sample can be measured, whereby details of the structure causing the diffraction can be determined. These measurements often are made using multiple tools in the fabrication process, and each measurement can involve different structures on the semiconductor device.
These metrology techniques can be used to analyze a wide range of parameters, such as the thickness, crystallinity, composition, and refractive index of a film on a silicon wafer, as well as attributes including critical dimension (CD), line spacing, line width, wall depth, and wall profile. Various optical metrology techniques have been used to obtain these measurements during processing, including broadband spectroscopy, described in U.S. Pat. Nos. 5,607,800, 5,867,276, and 5,963,329; spectral ellipsometry, described in U.S. Pat. No. 5,739,909; single-wavelength optical scattering, described in U.S. Pat. No. 5,889,593; and spectral and single-wavelength beam profile reflectance (BPR) and beam profile ellipsometry (BPE), described in U.S. Pat. No. 6,429,943. Any of these measurement technologies, such as single-wavelength laser BPR or BPE technologies, also can be used to obtain critical dimension (CD) measurements on non-periodic structures, such as isolated lines or isolated vias and mesas. The above cited patents and patent applications, as well as PCT Application Ser. No. WO 03/009063, U.S. application 2002/0158193 (now U.S. Pat. No. 6,819,426), U.S. application 2003/0147086 (now U.S. Pat. No. 6,813,034), U.S. application 2001/0051856 A1, PCT application WO 01/55669, and PCT application Ser. No. WO 01/97280, are each hereby incorporated herein by reference.
Exemplary metrology systems use three classes of parameters and relationships. Structures on a wafer have physical parameters, such as the thickness of a layer, the widths of a line structure at various heights (measured generally perpendicular to a face of the wafer), and the complex optical index of a material. Most scatterometry measurements, for example, are performed over a range of independent parameters, which can include parameters such as wavelength, angle of incidence, and azimuth angle. Unfortunately, it is not easy to associate a correct set of theoretical parameters with data measured using these devices. Given a set of “theoretical” parameters which might correspond to the actual parameters of the stack to be evaluated, one can program a processor, using equations such as Maxwell's equations or the Fresnel equations, for example, to derive a set of theoretical data based on these theoretical parameters. The derived theoretical data can be compared to the measured data, and if there is a reasonable level of correspondence, one can assume that the generated theoretical parameters fairly describe the parameters of the thin film stack under investigation. Of course, it would be highly unlikely that the first set of generated theoretical parameters and the associated derived theoretical data would provide a good match to the actual measured data. A processor can generate thousands of sets of theoretical parameters using any of a number of algorithms.
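As an illustration of deriving theoretical data from a set of theoretical parameters, the following minimal sketch applies the Fresnel equations at normal incidence to a single transparent film on an absorbing substrate. The function names, the fixed refractive indices (roughly those of an oxide film on silicon, with dispersion ignored), and the restriction to one layer are assumptions made for the example and are not taken from any system described herein.

```python
import cmath
import math

def film_reflectance(thickness_nm, wavelength_nm,
                     n_film=1.46, n_substrate=3.88 + 0.02j, n_ambient=1.0):
    """Normal-incidence reflectance of one film on a substrate.

    The index values are illustrative constants (assumed, dispersion
    ignored); the n + ik sign convention is used for absorption.
    """
    # Fresnel amplitude coefficients at the two interfaces.
    r01 = (n_ambient - n_film) / (n_ambient + n_film)
    r12 = (n_film - n_substrate) / (n_film + n_substrate)
    # Phase accumulated in one pass through the film.
    beta = 2 * math.pi * n_film * thickness_nm / wavelength_nm
    phase = cmath.exp(2j * beta)
    # Sum of the multiple reflections inside the film.
    r = (r01 + r12 * phase) / (1 + r01 * r12 * phase)
    return abs(r) ** 2

def theoretical_spectrum(thickness_nm, wavelengths_nm):
    """Theoretical data for one candidate thickness over the wavelength range."""
    return [film_reflectance(thickness_nm, w) for w in wavelengths_nm]
```

Here the film thickness plays the role of the theoretical parameter and the wavelength is the independent parameter; the resulting spectrum is what would be compared against the measured data.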
Many metrology systems use a modeling approach to analyze empirical data by relating optical measurements to a model of what is on the wafer, where the model has parameters that in some sense reflect the physical parameters on the wafer. For such an approach, a theoretical model is typically defined for each subject that will be analyzed. The theoretical model predicts the parameters that would correspond to the detected optical signal. The theoretical model is parameterized and each parameter corresponds to a physical characteristic of the sample being measured, such as line width or layer thickness. A regression is performed in which the parameters are repeatedly perturbed and the model is repeatedly evaluated to minimize the differences or residuals between the modeled results and results that are empirically obtained, referred to as “minimizing the regression.” In many systems, the differences are calculated over the range of independent parameters, and an average difference, such as a squared sum, is calculated as a single difference. Various norms or other techniques are suitable for collapsing the multiple differences into a single working difference. When the residual minimization reaches some stopping criterion, the model and associated parameters are assumed to accurately reflect the subject being analyzed. One such stopping criterion is that the difference reaches some predetermined level, such as a minimum goodness-of-fit criterion. Another criterion is reached when the reduction of the difference becomes sufficiently small. In addition to residual values, confidence intervals in the model parameters and parameter correlation tables can serve as the basis for estimating the quality of the match between data calculated from a model and the experimental data, as well as for judging the validity of the model employed. 
A 90% confidence limit, for example, can express the sensitivity to a certain parameter, whereas the fit parameter correlation table describes the independence of the fit parameters. Other approaches are possible, such as those listed in U.S. Pat. No. 6,532,076 and U.S. Publication 2004/0032582 (now U.S. Pat. No. 6,947,135), each of which is hereby incorporated herein by reference.
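The regression loop described above can be sketched as follows. This is a deliberately simple perturbation search, assumed here for illustration only; production systems typically use more sophisticated optimizers. The names sum_squared_residual and fit_by_perturbation, and the model(params, x) calling convention, are hypothetical. The sketch stops either when the residual reaches a predetermined target or when the perturbation step has shrunk so far that further reductions would be negligible, which stands in for the second stopping criterion mentioned above.

```python
def sum_squared_residual(model, params, xs, measured):
    """Collapse the per-point differences into a single working difference."""
    return sum((model(params, x) - m) ** 2 for x, m in zip(xs, measured))

def fit_by_perturbation(model, start, xs, measured,
                        step=1.0, target=1e-12, min_step=1e-9, max_iter=10000):
    """Perturb each parameter in turn and keep any change that lowers the residual."""
    params = list(start)
    best = sum_squared_residual(model, params, xs, measured)
    for _ in range(max_iter):
        # Stopping criteria: residual small enough, or step too small to matter.
        if best <= target or step < min_step:
            break
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                residual = sum_squared_residual(model, trial, xs, measured)
                if residual < best:
                    params, best, improved = trial, residual, True
        if not improved:
            step /= 2  # No perturbation helped, so search on a finer scale.
    return params, best
```

For example, calling fit_by_perturbation with a two-parameter straight-line model and noise-free data generated from that model should recover the generating parameters to within a small tolerance.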
Evaluation of these theoretical metrology models is a complex task, even for a relatively simple sample. As these samples become more complex, and have more parameters, the calculations become extremely time-consuming. Even with high-speed processors, real-time evaluation of these calculations can be difficult. This problem is exacerbated by the use of multiple tools in the fabrication process. Slight variations and noise differences between tools require a regression for each tool that must account for every parameter, even if that parameter would not have changed between tools. These deficiencies are problematic in semiconductor manufacturing where it is often imperative to quickly detect processes that are not operating correctly. As the semiconductor industry moves towards integrated metrology solutions (i.e., where metrology hardware is integrated directly with process hardware) the need for rapid evaluation becomes even more acute.
For example, optical metrology systems utilizing broadband light often combine the outputs of two or more light sources or bulbs in order to obtain a probe beam with suitable broadband characteristics. Three lamps, for instance, can be used to generate a probe beam that spans a wavelength range from about 185 nm to about 900 nm. A tungsten lamp is often used due to the associated output range from the visible to near infrared spectrum, a deuterium bulb is often used for the associated deep ultraviolet (DUV) output, and a xenon bulb is often used for the associated deep ultraviolet to near infrared output spectrum. One problem with using multiple light sources is that it can be difficult to account for slight variations in the beams produced from each source, such as the azimuth angle at which each beam is incident upon the sample being measured. Slight variations in azimuth angle can affect the measured values, such that each azimuth angle must be separately determined and accounted for in the measurement. Further, when the sample being processed is moved to another tool, the azimuth angle for each tool will be slightly different even if the same types and arrangements of light sources are used. These differences must again be accounted for in the measurement of sample parameters.
A number of approaches have been developed to overcome the calculation bottleneck associated with the analysis of metrology results. Many of these approaches involve techniques for improving calculation throughput, such as distributed processing techniques. For example, a master processor can be used that distributes scatterometry calculations among a group of slave processors, such as is described in U.S. Pat. No. 6,704,661, which is hereby incorporated herein by reference. This can be done as a function of wavelength, for example, so that each slave processor evaluates the theoretical model for selected wavelengths. The other slave processors will carry out the same calculations at different wavelengths. Once complete, the master processor combines the separate calculations and performs the best fit comparison to the empirical results. Based on this fit, the master processor will modify the parameters of the model (e.g., changing the widths or layer thickness) and distribute the calculations for the modified model to the slave processors. This sequence is repeated until a good fit is achieved. Such a distributed processing approach can be used with other types of information, such as with multiple angle of incidence information. Techniques of this type can reduce the time required for scatterometry calculations, but as the complexity of the geometry increases, the computational complexity requires more than the use of distributed techniques alone.
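The wavelength-wise division of work can be sketched as follows. This is an illustrative sketch only: it uses a thread pool from the Python standard library in place of separate slave processors, and model_at_wavelength is a hypothetical placeholder for a real scatterometry calculation. None of the names below come from the cited patent.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def model_at_wavelength(thickness_nm, wavelength_nm):
    """Hypothetical stand-in for an expensive per-wavelength model evaluation."""
    return 0.5 + 0.5 * math.cos(4 * math.pi * 1.46 * thickness_nm / wavelength_nm)

def evaluate_chunk(thickness_nm, wavelengths_nm):
    """Work done by one worker: the same model, at its own subset of wavelengths."""
    return [model_at_wavelength(thickness_nm, w) for w in wavelengths_nm]

def distributed_spectrum(thickness_nm, wavelengths_nm, workers=4):
    """Coordinator: split the wavelengths, farm them out, and reassemble the result."""
    # Worker i receives wavelengths i, i + workers, i + 2 * workers, and so on.
    chunks = [wavelengths_nm[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda c: evaluate_chunk(thickness_nm, c), chunks))
    # Interleave the partial results back into the original wavelength order.
    spectrum = [0.0] * len(wavelengths_nm)
    for i, part in enumerate(parts):
        spectrum[i::workers] = part
    return spectrum
```

In a regression, the coordinator would call distributed_spectrum once per trial set of model parameters, compare the assembled spectrum to the empirical results, and repeat with modified parameters until a good fit is achieved.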
Another approach used for rapidly evaluating metrology measurements is to use pre-computed libraries of predicted measurements. This type of approach is discussed, for example, in PCT application WO 99/45340, published Sep. 10, 1999, which is hereby incorporated herein by reference. In this approach, a library of expected results is constructed by repeatedly evaluating the theoretical model for a range of different parameters. When empirical measurements are obtained, the library is searched to find the best fit. The use of libraries speeds the analysis process by allowing theoretical results to be computed once and reused many times. Of course, libraries are necessarily limited in their resolution and can contain only a finite number of theoretical results. Further, libraries cannot account for changes over time. This means that there are many cases where empirical measurements do not have exact library matches. In these cases, the use of a library represents an undesirable choice between speed and computational accuracy.
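A library search of this kind can be sketched as follows. The sketch assumes a single model parameter (a thickness) and uses a hypothetical placeholder, toy_model, in place of a real theoretical model; the function names are likewise illustrative and are not drawn from the cited application.

```python
import math

def toy_model(thickness_nm, wavelength_nm):
    """Hypothetical placeholder for the theoretical model."""
    return 0.5 + 0.5 * math.cos(4 * math.pi * 1.46 * thickness_nm / wavelength_nm)

def build_library(thicknesses_nm, wavelengths_nm):
    """Pre-compute one theoretical spectrum per candidate thickness."""
    return [(t, [toy_model(t, w) for w in wavelengths_nm]) for t in thicknesses_nm]

def best_match(library, measured):
    """Return the library thickness whose spectrum is closest to the measurement."""
    def mismatch(entry):
        _, spectrum = entry
        return sum((s - m) ** 2 for s, m in zip(spectrum, measured))
    return min(library, key=mismatch)[0]

wavelengths = [400.0, 500.0, 600.0, 700.0, 800.0]
library = build_library([90.0, 95.0, 100.0, 105.0, 110.0], wavelengths)
# A 102 nm sample has no exact entry, so the search snaps to the nearest grid value.
measured = [toy_model(102.0, w) for w in wavelengths]
print(best_match(library, measured))  # prints 100.0
```

The usage lines illustrate the resolution limit noted above: the reported value can be no finer than the spacing of the library entries.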
In order to overcome this limitation, U.S. Pat. No. 6,768,967, incorporated herein by reference, describes an approach using a database method of analysis for empirical metrology measurements. The database method is similar to the library approach in that the method relies on a stored set of pre-computed “reflectance characteristics.” In this case, however, an interpolation method is used in combination with the database-stored characteristics, making it possible to obtain measurement resolution and accuracy much better than the database sampling resolution. Both the database size and computation time are consequently greatly reduced relative to library-based methods. A critical element of the database interpolation method is the interpolation algorithm itself. Two preferred algorithms are described, namely multi-linear and multi-cubic interpolation. Multi-linear interpolation is very fast, but has poor accuracy. Multi-cubic interpolation is much more accurate, but can be slow, especially when many parameters are being simultaneously measured. In practice, selection of a multi-linear or multi-cubic method is based upon the degree of accuracy and speed required. While this choice may be acceptable when the number of parameters is relatively small, increased speed and accuracy are needed for more complex systems and/or samples.
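For illustration, multi-linear interpolation over a regularly gridded database can be sketched as follows. This is a generic textbook formulation and is not taken from U.S. Pat. No. 6,768,967; the function name, the dictionary-based storage, and the use of a single scalar per grid point (rather than a full set of reflectance characteristics) are assumptions made to keep the example short.

```python
from bisect import bisect_right
from itertools import product

def multilinear_interpolate(axes, values, point):
    """Interpolate a gridded quantity at an off-grid parameter point.

    axes   -- one sorted list of grid coordinates per parameter
    values -- dict mapping index tuples (i, j, ...) to stored values
    point  -- query coordinates, one per parameter
    """
    lower, frac = [], []
    for axis, x in zip(axes, point):
        # Locate the grid cell containing x, clamped to the first or last cell.
        i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
        lower.append(i)
        frac.append((x - axis[i]) / (axis[i + 1] - axis[i]))
    total = 0.0
    # Blend the 2**N corners of the enclosing cell.
    for corner in product((0, 1), repeat=len(axes)):
        weight = 1.0
        for c, f in zip(corner, frac):
            weight *= f if c else 1.0 - f
        index = tuple(i + c for i, c in zip(lower, corner))
        total += weight * values[index]
    return total
```

Each query touches 2^N stored points for N parameters. A tensor-product cubic scheme would instead draw on four points per dimension, or 4^N in total, which is one way to see why the multi-cubic method slows down as the number of simultaneously measured parameters grows.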