The present invention relates to optical measurement of parameters of interest on samples having diffractive structures thereon, and in particular relates to improvements in real-time analysis of the measured optical signal characteristics from a sample to determine parameter values for that sample.
(This specification occasionally makes reference to prior published documents. A numbered list of these references can be found at the end of this section, under the sub-heading “References”.)
In integrated circuit manufacture, the accurate measurement of the microstructures being patterned onto semiconductor wafers is highly desirable. Optical measurement methods are typically used for high-speed, non-destructive measurement of such structures. With such methods, a small spot on a measurement sample is illuminated with optical radiation comprising one or more wavelengths, and the sample properties over the measurement spot are determined by measuring characteristics of radiation reflected or diffracted by the sample (e.g., reflection intensity, polarization state, or angular distribution).
This disclosure relates to the measurement of a sample comprising a diffractive structure formed on or in a substrate, wherein lateral material inhomogeneities in the structure give rise to optical diffraction effects. If the lateral inhomogeneities are periodic with a period significantly smaller than the illuminating wavelengths, then diffracted orders other than the zeroth order may all be evanescent and not directly observable, or may be scattered outside the detection instrument's field of view. But the lateral structure geometry can nevertheless significantly affect the zeroth-order reflectivity, making it possible to measure structure features much smaller than the illuminating wavelengths.
A variety of measurement methods applicable to diffractive structures are known in the prior art. Reference 7 reviews a number of these methods. The most straightforward approach is to use a rigorous, theoretical model based on Maxwell's equations to calculate a predicted optical signal characteristic of the sample (e.g., reflectivity) as a function of sample measurement parameters (e.g., film thickness, linewidth, etc.), and adjust the measurement parameters in the model to minimize the discrepancy between the theoretical and measured optical signal (Refs. 10, 14). (Note: In this context the singular term “characteristic” may denote a composite entity such as a vector or matrix. The components of the characteristic might, for example, represent reflectivities at different wavelengths or collection angles.) The measurement process comprises the following steps: First, a set of trial values of the measurement parameters is selected. Then, based on these values a computer-representable model of the measurement sample structure (including its optical materials and geometry) is constructed. The electromagnetic interaction between the sample structure and illuminating radiation is numerically simulated to calculate a predicted optical signal characteristic, which is compared to the measured signal characteristic. An automated fitting optimization algorithm iteratively adjusts the trial parameter values and repeats the above process to minimize the discrepancy between the measured and predicted signal characteristic. (The optimization algorithm might typically minimize the mean-square error of the signal characteristic components.)
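The fitting loop described above can be sketched as follows. This is only an illustration under stated assumptions: `simulate_reflectivity` is a hypothetical closed-form stand-in for a rigorous electromagnetic solver, the single fitted parameter is a linewidth, and golden-section search is one possible optimizer rather than one prescribed by the text.

```python
import math

WAVELENGTHS_NM = [400.0, 500.0, 600.0, 700.0]  # assumed signal sampling


def simulate_reflectivity(linewidth_nm):
    """Hypothetical stand-in for a rigorous Maxwell-equation solver."""
    return [0.5 + 0.4 * math.cos(2.0 * math.pi * linewidth_nm / w)
            for w in WAVELENGTHS_NM]


def mean_square_error(predicted, measured):
    """Mean-square discrepancy over the signal components."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)


def fit_linewidth(measured, lo, hi, tol=1e-4):
    """Golden-section search over [lo, hi]; assumes one minimum in the range."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    # Each error evaluation calls the (expensive, in practice) simulator.
    fc = mean_square_error(simulate_reflectivity(c), measured)
    fd = mean_square_error(simulate_reflectivity(d), measured)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = mean_square_error(simulate_reflectivity(c), measured)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = mean_square_error(simulate_reflectivity(d), measured)
    return 0.5 * (a + b)
```

In a real instrument every trial evaluation would invoke the full electromagnetic simulation, which is the cost the following paragraphs are concerned with.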
The above process can provide very accurate measurement capability, but the computational burden of computing the structure geometry and applying electromagnetic simulation within the measurement optimization loop makes this method impractical for many real-time measurement applications. A variety of alternative approaches have been developed to avoid the computational bottleneck, but usually at the expense of compromised measurement performance.
One alternative approach is to replace the exact theoretical model with an approximate model that represents the optical signal characteristic as a linear function of measurement parameters over some limited parameter range. There are several variants of this approach, including Inverse Least Squares (ILS), Principal Component Regression (PCR), and Partial Least Squares (PLS) (Refs. 1-5, 7, 11, 15). The linear coefficients of the approximate model are determined by a multivariate statistical analysis technique that minimizes the mean-square error between exact and approximate data points in a “calibration” data set. (The calibration data may be generated either from empirical measurements or from exact theoretical modeling simulations. This is done prior to measurement, so the calibration process does not impact measurement time.) The various linear models (ILS, PCR, PLS) differ in the type of statistical analysis method employed.
There are two fundamental limitations of the linear models: First, the linear approximation can only be applied over a limited range of measurement parameter values; and second, within this range the approximate model does not generally provide an exact fit to the calibration data points. (If the calibration data is empirically determined, one may not want the model to exactly fit the data, because the data could be corrupted by experimental noise. But if the data is determined from a theoretical model it would be preferable to use an approximation model that at least fits the calibration data points.) These deficiencies can be partially remedied by using a non-linear (e.g., quadratic) functional approximation (Ref. 7). This approach mitigates, but does not eliminate, the limitations of linear models.
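The second limitation can be illustrated with a small sketch. It uses ordinary least squares on a single parameter, which is a simplification and not ILS, PCR, or PLS as such; `exact_signal` is a hypothetical nonlinear response chosen only to generate calibration data.

```python
def exact_signal(x):
    """Hypothetical nonlinear signal used to generate calibration data."""
    return x * x


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def max_misfit(xs, ys, a, b):
    """Largest absolute residual of the linear model at the calibration points."""
    return max(abs(y - (a * x + b)) for x, y in zip(xs, ys))


CALIBRATION_X = [0.0, 1.0, 2.0, 3.0, 4.0]
CALIBRATION_Y = [exact_signal(x) for x in CALIBRATION_X]
```

Even though the calibration data here are noise-free, the fitted line does not pass through them: the residual at the calibration points is nonzero, which is the misfit the paragraph above refers to.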
The parameter range limit of functional (linear or non-linear) approximation models can be extended by the method of “range splitting”, wherein the full parameter range is split into a number of subranges, and a different approximate model is used for each subrange (Ref. 7). The method is illustrated conceptually in FIG. 1 (cf. FIG. 2 in Ref. 7), which represents the relationship between a measurement parameter x, such as a linewidth parameter, and an optical signal characteristic y, such as the zeroth-order sample reflectivity at a particular collection angle and wavelength. (In practice one is interested in modeling the relationship between multiple measurement parameters, such as linewidths, film thicknesses, etc., and multiple signal components, such as reflectivities at different wavelengths or collection angles. However, the concepts illustrated in FIG. 1 are equally applicable to the more general case.) A set of calibration data points (e.g., point 101) is generated, either empirically or by theoretical modeling. The x parameter range is split into two (or more) subranges 102 and 103, and the set of calibration points is separated into corresponding subsets 104 and 105, depending on which subrange each point is in. A statistical analysis technique is applied to each subset to generate a separate approximation model (e.g., a linear model) for each subrange, such as linear model 106 for subrange 102 and model 107 for subrange 103.
Aside from the limitations inherent in the functional approximation models, the range-splitting method has additional deficiencies. Although the functional approximation is continuous and smooth within each subrange, it may exhibit discontinuities between subranges (such as discontinuity 108 in FIG. 1). These discontinuities can create numerical instabilities in optimization algorithms that estimate measurement parameters from optical signal data. The discontinuities can also be problematic for process monitoring and control because small changes in process conditions could result in large, discontinuous jumps in measurements.
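The discontinuity between subranges can be reproduced with a minimal sketch. The signal `exact_signal`, the boundary location, and the calibration points are all hypothetical choices made for illustration; each subrange gets its own least-squares line, as in the two-subrange picture of FIG. 1.

```python
def exact_signal(x):
    """Hypothetical nonlinear signal used to generate calibration data."""
    return x * x


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


BOUNDARY = 3.0                      # assumed split between the two subranges
LOW_POINTS = [0.0, 1.0, 2.0]        # calibration subset for the lower subrange
HIGH_POINTS = [3.0, 4.0, 5.0]       # calibration subset for the upper subrange
MODEL_LOW = fit_line(LOW_POINTS, [exact_signal(x) for x in LOW_POINTS])
MODEL_HIGH = fit_line(HIGH_POINTS, [exact_signal(x) for x in HIGH_POINTS])


def range_split_model(x):
    """Evaluate whichever linear model owns the subrange containing x."""
    a, b = MODEL_LOW if x < BOUNDARY else MODEL_HIGH
    return a * x + b


def boundary_jump(eps=1e-9):
    """Difference in modeled signal across an eps-wide step over the boundary."""
    return range_split_model(BOUNDARY) - range_split_model(BOUNDARY - eps)
```

An optimizer stepping across the boundary sees the modeled signal change by a finite amount for an arbitrarily small change in x, which is the source of the instability noted above.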
Another drawback of the range-splitting model is the large number of required calibration points and the large amount of data that must be stored in the model. In the FIG. 1 illustration, each subrange uses a simple linear approximation model of the form
y ≅ a x + b        (Eq. 1)
wherein a and b are calibration coefficients. At least two calibration points per subrange are required to determine a and b (generally, more than two are used to provide good statistical sampling over each subrange), and two coefficients (a and b) must be stored for each subrange. If there are M subranges the total number of calibration points must be at least 2M, and the number of calibration coefficients is 2M. Considering a more general situation in which there are N measurement parameters x_1, x_2, . . . , x_N, the linear approximation would take the form
y ≅ a_1 x_1 + a_2 x_2 + . . . + a_N x_N + b        (Eq. 2)
If the range of each parameter is split into M subranges, the number of separate linear approximation models required to cover all combinations of parameter subranges would be M^N, and the number of calibration coefficients per combination (a_1, a_2, . . . , a_N, b) would be N+1. Thus the total number of calibration coefficients (and the minimum required number of calibration data points) would be (N+1) M^N. For example, FIG. 2 illustrates a parameter space spanned by two parameters, x_1 and x_2. The x_1 range is split into three subranges 201, 202, and 203, and the x_2 range is split into three subranges 204, 205, and 206. For this case, N=2, M=3, the number of x_1 and x_2 subrange combinations 207 . . . 215 is 3^2=9, and the number of linear calibration coefficients would be (2+1) 3^2=27. Generalizing further, if the optical signal characteristic (y) comprises multiple signal components (e.g., for different wavelengths), the number of calibration coefficients will increase in proportion to the number of components. Furthermore, if a nonlinear (e.g., quadratic) subrange model is used, the number of calibration points and coefficients would be vastly larger.
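The counting argument above can be checked with a few lines of code. The function and argument names are illustrative only; the formula is the one given in the text, with the proportional scaling by signal components included as an optional factor.

```python
import itertools


def subrange_combinations(num_params, num_subranges):
    """Enumerate every combination of per-parameter subrange indices."""
    return list(itertools.product(range(num_subranges), repeat=num_params))


def calibration_coefficient_count(num_params, num_subranges, num_components=1):
    """(N + 1) * M**N linear coefficients, per signal component."""
    return (num_params + 1) * num_subranges ** num_params * num_components
```

For the FIG. 2 case, `subrange_combinations(2, 3)` has nine entries and `calibration_coefficient_count(2, 3)` gives 27. The exponential dependence on the number of parameters is what makes the storage requirement grow so quickly.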
Another measurement approach, Minimum Mean Square Error analysis (MMSE, Refs. 2-9, 11, 13, 15), provides a simple alternative to the range splitting method described above. With this approach, a database of pre-computed theoretical optical signal characteristics representing a large variety of measurement structures is searched and compared to a sample's measured optical signal, and the best-fitting comparison (in terms of a mean-square-error fitting criterion) determines the measurement result. (The above-noted references relate primarily to scatterometry and spectroscopy, but MMSE-type techniques have also been applied in the context of ellipsometry; see Refs. 12 and 16.) The MMSE method is capable of modeling strong nonlinearities in the optical signal. But this method, like range-splitting, can exhibit problematic discontinuities in the measurement results due to the database's discrete parameter sampling.
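A database search of this kind might look like the sketch below. The signal model `simulate_signal`, the wavelength list, and the 5 nm grid spacing are hypothetical; only the search-and-compare step reflects the MMSE approach described above.

```python
import math

WAVELENGTHS_NM = [400.0, 500.0, 600.0, 700.0]  # assumed signal sampling


def simulate_signal(linewidth_nm):
    """Hypothetical stand-in for the pre-computed theoretical signal."""
    return [0.5 + 0.4 * math.cos(2.0 * math.pi * linewidth_nm / w)
            for w in WAVELENGTHS_NM]


def mean_square_error(predicted, measured):
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)


# Database sampled on a 5 nm linewidth grid from 100 nm to 150 nm (assumed).
GRID = [100.0 + 5.0 * i for i in range(11)]
DATABASE = [(x, simulate_signal(x)) for x in GRID]


def mmse_search(measured):
    """Return the database parameter whose stored signal best fits the measurement."""
    best_param, _ = min(
        DATABASE, key=lambda entry: mean_square_error(entry[1], measured))
    return best_param
```

Samples with true linewidths of 121 nm and 122 nm both return the 120 nm entry, so the reported value can only change in steps of the grid spacing.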
All of these prior-art methods entail a compromise between measurement resolution and accuracy. The MMSE approach is not limited by any assumed functional form of the optical signal, and can therefore have good accuracy. But measurement resolution is fundamentally limited by the parameter sampling density. The functional approximation models, by contrast, are capable of “interpolating” between calibration data points, in the sense that the modeled signal is a continuous and smooth function of measurement parameters across the calibration range; hence such models can have essentially unlimited measurement resolution. However, the term “interpolation” is a misnomer in this context because the functional models do not accurately fit the calibration data points, and their accuracy is limited by the misfit. (For example, Ref. 11 reports a fit accuracy of 5-10 nm for linewidth and thickness parameters.)
References
1. R. H. Krukar et al., “Using Scattered Light Modeling for Semiconductor Critical Dimension Metrology and Calibration,” Proc. SPIE 1926, pp. 60-71 (1993).
2. C. J. Raymond et al., “A scatterometric sensor for lithography,” Proc. SPIE 2336, pp. 37-49 (1994).
3. C. J. Raymond et al., “Metrology of subwavelength photoresist gratings using optical scatterometry,” J. Vac. Sci. Technol. B, Vol. 13(4), pp. 1484-1495 (1995).
4. M. R. Murnane et al., “Scatterometry for 0.24 μm-0.70 μm developed photoresist metrology,” Proc. SPIE 2439, pp. 427-436 (1995).
5. M. R. Murnane et al., “Subwavelength photoresist grating metrology using scatterometry,” Proc. SPIE 2532, pp. 251-261 (1995).
6. C. J. Raymond et al., “Multi-parameter process metrology using scatterometry,” Proc. SPIE 2638, pp. 84-93 (1995).
7. J. Bischoff et al., “Photoresist metrology based on light scattering,” Proc. SPIE 2725, pp. 678-689 (1996).
8. C. J. Raymond et al., “Multi-parameter CD measurements using scatterometry,” Proc. SPIE 2725, pp. 698-709 (1996).
9. C. J. Raymond et al., “Scatterometry for CD measurements of etched structures,” Proc. SPIE 2725, pp. 720-728 (1996).
10. B. K. Minhas et al., “Towards sub-0.1 μm CD measurements using scatterometry,” Proc. SPIE 2725, pp. 729-739 (1996).
11. J. Bischoff et al., “Light scattering based micrometrology,” Proc. SPIE 2775, pp. 251-259 (1996).
12. Xinhui Niu, “Specular Spectroscopic Scatterometry in DUV Lithography,” Proc. SPIE 3677, pp. 159-168 (1999).
13. J. Allgair et al., “Manufacturing Considerations for Implementation of Scatterometry for Process Monitoring,” Proc. SPIE 3998, pp. 125-134 (2000).
14. Conrad, U.S. Pat. No. 5,963,329.
15. McNeil, U.S. Pat. No. 5,867,276.
16. Xu, WO 99/45340.
17. Handbook of Optics, Second Edition, Volume 2, Optical Society of America (1995).
18. “Formulation and comparison of two recursive matrix algorithms for modeling layered diffraction gratings,” Journal of the Optical Society of America A, Vol. 13, No. 5, May 1996.
The invention is a method for measuring parameters of interest of a sample comprising a diffractive structure, wherein the method employs a database-search technique in combination with interpolation to avoid the tradeoff between measurement resolution and accuracy. Following is a summary outline of the steps of the method, which will later be individually described in more detail. (The steps need not be performed in the exact order indicated here, except to the extent that dependencies between steps constrain their order.)
First, a theoretical model is provided, from which a theoretical optical response characteristic of the diffractive structure is calculable as a function of a set of one or more “interpolation parameters” corresponding to measurement parameters. The theoretical model comprises two primary components: a method for translating any trial set of interpolation parameter values into a computer-representable model of the diffractive structure (including its optical materials and geometry), and a method for numerically simulating electromagnetic interactions within the diffractive structure to calculate the theoretical response characteristic.
Next, a database of “interpolation points” and corresponding optical response characteristics is generated. Each interpolation point is defined by a specific interpolation parameter set consisting of specific values of the interpolation parameters. The theoretical model is applied to each interpolation point to calculate its corresponding theoretical optical response characteristic, which is stored in the database.
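Database generation of this kind might be sketched as follows. Everything specific here is an assumption made for illustration: the two interpolation parameters (a linewidth and a thickness), the rectangular grid, and `theoretical_response`, which is a cheap closed-form placeholder for the rigorous theoretical model.

```python
import itertools
import math

WAVELENGTHS_NM = [400.0, 500.0, 600.0, 700.0]  # assumed response components


def theoretical_response(linewidth_nm, thickness_nm):
    """Hypothetical stand-in for the rigorous theoretical model."""
    return [0.5
            + 0.3 * math.cos(2.0 * math.pi * linewidth_nm / w)
            + 0.1 * math.sin(2.0 * math.pi * thickness_nm / w)
            for w in WAVELENGTHS_NM]


def build_database(axes):
    """Apply the model at every interpolation point and store the result."""
    database = {}
    for point in itertools.product(*axes):
        database[point] = theoretical_response(*point)
    return database


# Assumed sampling of each interpolation parameter.
LINEWIDTH_AXIS = [100.0 + 5.0 * i for i in range(11)]   # 100 nm .. 150 nm
THICKNESS_AXIS = [200.0 + 10.0 * j for j in range(6)]   # 200 nm .. 250 nm
DATABASE = build_database([LINEWIDTH_AXIS, THICKNESS_AXIS])
```

All of the expensive model evaluations happen here, before any sample is measured; the database then holds one response vector per interpolation point.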
The database is used by an “interpolation model”, which calculates an interpolated optical response characteristic as a function of the interpolation parameter set. The interpolation model provides an approximation to the theoretical model, but without the computational overhead. Given any trial interpolation parameter set within a defined parameter domain, the interpolation model computes an approximate corresponding optical response characteristic by interpolating (or perhaps extrapolating) on the database. (The parameter domain is typically limited by the database, although extrapolation can sometimes be used to extend the domain outside of the database limits. The term “interpolation” can be broadly construed herein to include extrapolation.) The diffractive structure's internal geometry need not be modeled, and electromagnetic interactions within the structure need not be simulated, in the interpolation model. Thus the computational overhead of direct theoretical modeling of the diffractive structure is avoided. The interpolation model represents a substantially continuous function mapping the interpolation parameter set to the optical response characteristic; it does not exhibit the discontinuities or discretization of prior-art methods such as range-splitting and MMSE. Furthermore, although the interpolation is an approximation, the interpolated optical response characteristic accurately matches the theoretical optical response characteristic at the interpolation points represented in the database. Thus it does not suffer the accuracy limitation of prior-art functional approximation methods. (The term “interpolation” broadly connotes a fitting function that fits the interpolation points. A portion of the fitting function might actually be extrapolated, so in this context the distinction between “interpolation” and “extrapolation” is not significant.)
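The two properties claimed above, continuity and an exact match at the interpolation points, can be illustrated with a one-parameter sketch. Piecewise-linear interpolation is used here purely as an assumption for simplicity; this passage does not commit to a particular interpolation scheme. `theoretical_response` and the grid are hypothetical.

```python
import bisect
import math

WAVELENGTHS_NM = [400.0, 500.0, 600.0, 700.0]  # assumed response components


def theoretical_response(linewidth_nm):
    """Hypothetical stand-in for the rigorous theoretical model."""
    return [0.5 + 0.4 * math.cos(2.0 * math.pi * linewidth_nm / w)
            for w in WAVELENGTHS_NM]


# Interpolation points and their stored responses (assumed 5 nm spacing).
GRID = [100.0 + 5.0 * i for i in range(11)]
DATABASE = [theoretical_response(x) for x in GRID]


def interpolate_response(x):
    """Piecewise-linear interpolation on the database.

    The segment index is clamped to the first or last segment, so values of
    x outside the grid are linearly extrapolated from the end segment.
    """
    i = bisect.bisect_right(GRID, x) - 1
    i = max(0, min(i, len(GRID) - 2))
    t = (x - GRID[i]) / (GRID[i + 1] - GRID[i])
    return [(1.0 - t) * lo + t * hi
            for lo, hi in zip(DATABASE[i], DATABASE[i + 1])]
```

Evaluating `interpolate_response` never calls the theoretical model. It reproduces the stored response at every grid point and varies continuously in between, in contrast to a nearest-entry database search. A higher-order scheme would reduce the error between points at the cost of more arithmetic per evaluation.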
The interpolation model is used by a fitting optimization algorithm that determines measurement parameters of a sample based on a measured optical signal characteristic of the sample. The theoretical optical response characteristic, which is approximated by the interpolation model, does not necessarily correspond directly to the optical signal characteristic or to a measurable quantity. However, a predicted optical signal characteristic is calculable from the optical response characteristic by means of a computationally efficient algorithm that, like interpolation, does not require that the diffractive structure's internal geometry be modeled or that electromagnetic interactions within the structure be simulated. The optimization algorithm automatically selects a succession of trial interpolation parameter sets, applies the interpolation model to calculate corresponding interpolated optical response characteristics, and from these calculates corresponding predicted optical signal characteristics, which are compared to the measured optical signal characteristic. The algorithm selects the trial parameter sets, based on a comparison error minimization method, to iteratively reduce a defined comparison error metric until a defined termination criterion is satisfied.
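Putting the pieces together, the optimization loop might be sketched as below. All specifics are assumptions for illustration: the toy `theoretical_response`, piecewise-linear interpolation, a response-to-signal step that squares an amplitude to get an intensity, a mean-square comparison error, and golden-section search with an interval-width termination criterion.

```python
import bisect
import math

WAVELENGTHS_NM = [400.0, 500.0, 600.0, 700.0]  # assumed response components


def theoretical_response(linewidth_nm):
    """Hypothetical stand-in for the rigorous model (used only for the database)."""
    return [0.5 + 0.4 * math.cos(2.0 * math.pi * linewidth_nm / w)
            for w in WAVELENGTHS_NM]


GRID = [100.0 + 5.0 * i for i in range(11)]
DATABASE = [theoretical_response(x) for x in GRID]


def interpolate_response(x):
    """Piecewise-linear interpolation on the database (assumed scheme)."""
    i = max(0, min(bisect.bisect_right(GRID, x) - 1, len(GRID) - 2))
    t = (x - GRID[i]) / (GRID[i + 1] - GRID[i])
    return [(1.0 - t) * lo + t * hi
            for lo, hi in zip(DATABASE[i], DATABASE[i + 1])]


def predict_signal(response):
    """Assumed cheap response-to-signal step: amplitude to intensity."""
    return [r * r for r in response]


def comparison_error(x, measured):
    """Mean-square error between predicted and measured signal."""
    predicted = predict_signal(interpolate_response(x))
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)


def fit_parameter(measured, lo, hi, tol=1e-4):
    """Golden-section search; stops when the bracket is narrower than tol."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = comparison_error(c, measured), comparison_error(d, measured)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = comparison_error(c, measured)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = comparison_error(d, measured)
    return 0.5 * (a + b)
```

Inside `fit_parameter` only the interpolation model and the response-to-signal step are evaluated; the theoretical model is called solely when the database is built. In this sketch, a sample whose true parameter lies between grid points (for example 122 nm on the 5 nm grid) is recovered to well within the grid spacing, with the residual error set by the interpolation accuracy.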
The measured optical signal characteristic is acquired with a measurement instrument comprising an optical sensor system, which detects radiation diffracted from the sample. The instrument further comprises computational hardware that applies the fitting optimization algorithm to measured signal data and generates measurement results. Subsequent to results generation, the instrument may also generate a computational or graphical representation of the diffractive structure's geometry. However, this representation is not necessarily required to calculate a corresponding predicted optical response or signal characteristic, and it need not correspond to a particular parameter set in the database.