Manufacturing semiconductor devices involves depositing and patterning several layers overlaying each other. For example, gate interconnects and gates of a CMOS integrated circuit have layers with different patterns, which are produced by different lithography stages. The tolerance of alignment of the patterns at each of these layers can be smaller than the width of the gate. At the time of this writing, the smallest linewidth that can be mass produced is 130 nm. The state of the art mean +3σ alignment accuracy is 30 nm (Nikon KrF Step-and-Repeat Scanning System NSR-S205C, July 2000).
Overlay metrology is the art of checking the quality of alignment after lithography. Overlay error is defined as the offset between two patterned layers from their ideal relative position. Overlay error is a vector quantity with two components in the plane of the wafer. Perfect overlay and zero overlay error are used synonymously. Depending on the context, overlay error may signify one of the components or the magnitude of the vector.
Overlay metrology saves subsequent process steps that would be built on a faulty foundation in case of an alignment error. Overlay metrology provides the information that is necessary to correct the alignment of the stepper-scanner and thereby minimize overlay error on subsequent wafers. Moreover, overlay errors detected on a given wafer after exposing and developing the photoresist can be corrected by removing the photoresist and repeating the lithography step on a corrected stepper-scanner. If the measured error is minor, parameters for subsequent steps of the lithography process could be adjusted based on the overlay metrology to avoid excursions. If overlay error is measured subsequently, e.g., after the etch step that typically follows develop, it can be used to “scrap” severely mis-processed wafers, or to adjust process equipment for better performance on subsequent wafers.
Prior overlay metrology methods use built-in test patterns etched or otherwise formed into or on the various layers during the same plurality of lithography steps that form the patterns for circuit elements on the wafer. One typical pattern, called “box-in-box” consists of two concentric squares, formed on a lower and an upper layer, respectively. “Bar-in-bar” is a similar pattern with just the edges of the “boxes” demarcated, and broken into disjoint line segments, as shown in FIG. 1. The outer bars 2 are associated with one layer and the inner bars 4 with another. Typically one is the upper pattern and the other is the lower pattern, e.g., outer bars 2 on a lower layer, and inner bars 4 on the top. However, with advanced processes the topographies are complex and not truly planar so the designations “upper” and “lower” are ambiguous. Typically they correspond to earlier and later in the process. There are other patterns used for overlay metrology. The squares or bars are formed by lithographic and other processes used to make planar structures, e.g., chemical-mechanical planarization (CMP). Currently, the patterns for the boxes or bars are stored on lithography masks and projected onto the wafer. Other methods for putting the patterns on the wafer are possible, e.g., direct electron beam writing from computer memory, etc.
In one form of the prior art, a high performance microscope imaging system combined with image processing software estimates overlay error for the two layers. The image processing software uses the intensity of light at a multitude of pixels. Obtaining the overlay error accurately requires a high quality imaging system and means of focusing it. Some of this prior art is reviewed by the article “Semiconductor Pattern Overlay”, by Neal T. Sullivan, Handbook of Critical Dimension Metrology and Process Control: Proceedings of Conference held 28–29 Sep. 1993, Monterey, Calif. Kevin M. Monahan, ed., SPIE Optical Engineering Press, vol. CR52, pp. 160–188. A. Starikov, D. J. Coleman, P. J. Larson, A. D. Lapata, W. A. Muth, in “Accuracy of Overlay Measurements: Tool and Mark Asymmetry Effects,” Optical Engineering, vol. 31, 1992, p. 1298, teach measuring overlay at one orientation, rotating the wafer by 180°, measuring overlay again and attributing the difference to tool errors and overlay mark asymmetry.
One requirement for the optical system is very stable positioning of the optical system with respect to the sample. Relative vibration would blur the image and degrade the performance. This is a difficult requirement to meet for overlay metrology systems that are integrated into a process tool, like a lithography track. The tool causes potentially large accelerations (vibrations), e.g., due to high acceleration wafer handlers. The tight space requirements for integration preclude bulky isolation strategies.
The precision of imaging-based overlay measurement can be two orders of magnitude smaller than the wavelength of the light used to image the target patterns of concentric boxes or bars. At such small length scales, the image does not have well-determined edges because of diffraction. The determination of the edge, and therefore the overlay measurement, is affected by any factor that changes the diffraction pattern. Chemical-mechanical planarization (CMP) is commonly used to planarize the wafer surface at intermediate process steps before more material is deposited. CMP can render the profile of the trenches or lines that make up the overlay measurement targets asymmetric. FIG. 2 illustrates an overlay target feature 2 which is a trench filled with metal. Surface 3 is planarized by CMP. The CMP process erodes the surface of the overlay target feature 2 in an asymmetric manner. The target feature 2 is subsequently compared to target feature 4 in the overlying layer, which could be, e.g., photoresist of the next lithography step. The asymmetry in target feature 2 changes the diffraction pattern, thus potentially causing an overlay measurement error.
In U.S. Pat. No. 4,757,207, Chappelow et al. teach obtaining the quantitative value of the overlay offset from the reflectance of targets that consist of identical line gratings overlaid upon each other on a planar substrate. Each period of the target consists of four types of film stacks: lines of the lower grating overlapping with the spaces of the upper grating, spaces of the lower grating overlapping with the lines of the upper grating, lines of the lower and upper gratings overlapping, and spaces of the lower and upper gratings overlapping. Chappelow et al. approximate the reflectance of the overlapping gratings as the average of the reflectances of the four film stacks weighted by their area fractions. This approximation, which neglects diffraction, has some validity when the lines and spaces are larger than the largest wavelength of the reflectometer. The reflectance of each of the four film stacks is measured at a so-called macro-site close to the overlay target. Each macro-site has a uniform film stack over a region that is larger than the measurement spot of the reflectometer. A limitation of U.S. Pat. No. 4,757,207 is that spatial variations in the film thickness that are caused by CMP and resist loss during lithography will cause erroneous overlay measurements. Another limitation is that reflectance is measured at eight sites in one overlay metrology target, which increases the size of the target and decreases the throughput of the measurement. Another limitation is that the lines and spaces need to be large compared to the wavelength but small compared to the measurement spot, which limits the accuracy and precision of the measurement. Another limitation is that the light intensity is measured by a single photodiode: the optical properties of the sample are not measured as a function of wavelength, angle of incidence, or polarization, which limits the precision of the measurement.
The “average reflectivity” approximation for the interaction of light with gratings, as employed by U.S. Pat. No. 4,757,207, greatly simplifies the problem of light interaction with a grating but neglects much of the diffraction physics. The model used to interpret the data has “four distinct regions whose respective reflectivities are determined by the combination of layers formed by the substrate and the overlaid patterns and by the respective materials in the substrate and patterns.” Eq. 1 in the patent clearly indicates that these regions do not interact, i.e., via diffraction, as the total reflectivity of the structure is a simple average of the four reflectivities with area weighting.
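The area-weighted average described above can be illustrated with a minimal Python sketch. It is not taken from U.S. Pat. No. 4,757,207. It assumes two identical line gratings of pitch P and linewidth w, an offset d with |d| ≤ min(w, P − w), and placeholder reflectance values; the function names and dictionary keys are hypothetical.

```python
def area_fractions(pitch, linewidth, offset):
    """Area fractions of the four film stacks within one grating period.

    Assumption (illustrative only): two identical line gratings, with
    |offset| <= min(linewidth, pitch - linewidth).
    """
    d = abs(offset)
    if d > min(linewidth, pitch - linewidth):
        raise ValueError("offset outside the range covered by this simple geometry")
    return {
        "line_over_line": (linewidth - d) / pitch,
        "lower_line_upper_space": d / pitch,
        "lower_space_upper_line": d / pitch,
        "space_over_space": (pitch - linewidth - d) / pitch,
    }


def average_reflectance(fractions, stack_reflectance):
    """Area-weighted average of the four stack reflectances (no diffraction)."""
    return sum(fractions[name] * stack_reflectance[name] for name in fractions)


# Placeholder reflectances, as if measured at four macro-sites (made-up numbers).
stacks = {
    "line_over_line": 0.5,
    "lower_line_upper_space": 0.2,
    "lower_space_upper_line": 0.3,
    "space_over_space": 0.1,
}
fractions = area_fractions(pitch=2.0, linewidth=1.0, offset=0.25)
print(average_reflectance(fractions, stacks))
```

In this simplified geometry the averaged reflectance depends only on the magnitude of the offset, and the four regions contribute independently, which is exactly the absence of diffractive interaction noted above.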
IBM Technical Disclosure Bulletin 90A 60854/GE8880210, March 1990, pp. 170–174, teaches measuring offset between two patterned layers by overlapping gratings. There are four sets of overlapping gratings to measure the x-offset and another four sets of overlapping gratings to measure the y-offset. The four sets of gratings, which are measured by a spectroscopic reflectometer, have offset biases of 0, ¼, ½, and ¾ pitch. The spectra are differenced as Sa=S0−S½, Sb=S¼−S¾; a weighted average of the difference spectra is evaluated: Ia=<w,Sa>, Ib=<w,Sb>, where w is a weighting function; and the ratio min(Ia,Ib)/max(Ia,Ib) is used to look up the offset/pitch ratio from a table. GE8880210 relies on “well known film thickness algorithms” to model the optical interactions. Such algorithms treat the electromagnetic boundary conditions at the interfaces between the planar layers or films. If the direction perpendicular to the films is the z direction, the boundaries between the films are at constant z=zn, where zn is the location of the nth boundary. Such algorithms, and hence GE8880210, do not use a model that accounts for the diffraction of light by the gratings or the multiple scattering of the light by the two gratings, and they have no provision to handle non-rectangular line profiles.
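The differencing and ratio steps of the bulletin can be sketched in Python as follows. This is an illustration under assumptions, not the bulletin's implementation: spectra are plain lists sampled at the same wavelengths, the weighted average <w,S> is taken as a simple weighted sum, and the function names are hypothetical. The lookup table that maps the ratio to offset/pitch is not given in the text above, so it is left out.

```python
def weighted_sum(weights, spectrum):
    """Discrete stand-in for <w, S>: a weighted sum over wavelength samples."""
    return sum(w * s for w, s in zip(weights, spectrum))


def offset_ratio(s_0, s_quarter, s_half, s_three_quarter, weights):
    """Return min(Ia, Ib) / max(Ia, Ib) from four biased-grating spectra.

    s_0, s_quarter, s_half, s_three_quarter: reflectance spectra of the
    gratings with offset biases of 0, 1/4, 1/2 and 3/4 pitch.
    """
    s_a = [a - b for a, b in zip(s_0, s_half)]                # Sa = S0 - S1/2
    s_b = [a - b for a, b in zip(s_quarter, s_three_quarter)] # Sb = S1/4 - S3/4
    i_a = weighted_sum(weights, s_a)
    i_b = weighted_sum(weights, s_b)
    largest = max(i_a, i_b)
    if largest == 0:
        raise ZeroDivisionError("max(Ia, Ib) is zero; ratio undefined")
    return min(i_a, i_b) / largest


# Made-up two-sample spectra, for illustration only.
ratio = offset_ratio(
    s_0=[0.5, 0.6],
    s_quarter=[0.45, 0.55],
    s_half=[0.3, 0.4],
    s_three_quarter=[0.35, 0.45],
    weights=[1.0, 1.0],
)
print(ratio)
```

The resulting ratio would then be used as the key into a pre-computed table of offset/pitch values.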
In U.S. Pat. No. 6,150,231, Muller et al. teach measuring overlay by Moire patterns. The Moire pattern is formed by overlapping gratings patterns, one grating on the lower level, another on the upper level. The two grating patterns have different pitches. The Moire pattern approach requires imaging the overlapping gratings and estimating their offset from the spatial characteristics of the image.
In U.S. Pat. Nos. 6,023,338 and 6,079,256, Bareket teaches an alternative approach in which two complementary periodic grating structures are produced on the two subsequent layers that require alignment. The two periodic structures are arranged adjacent to and in fixed positions relative to one another, such that there is no overlap of the two structures. The two gratings are scanned, either optically or with a stylus, so as to detect the individual undulations of the gratings as a function of position. The overlay error is obtained from the spatial phase shift between the undulations of the two gratings.
Smith et al. in U.S. Pat. No. 4,200,395, and Ono in U.S. Pat. No. 4,332,473, teach aligning a wafer and a mask by using overlapping diffraction gratings and measuring higher-order, i.e., non-specular, diffracted light. One diffraction grating is on the wafer and another one is on the mask. The overlapping gratings are illuminated by normally incident light and the intensities of the positive and negative diffracted orders, e.g., the 1st and −1st orders, are compared. The difference between the intensities of the 1st and −1st diffracted orders provides a feedback signal which can be used to align the wafer and the mask. These inventions are similar to the present one in that they use overlapping gratings on two layers. However, U.S. Pat. Nos. 4,200,395 and 4,332,473 are applicable to mask alignment but not to overlay metrology. They do not teach how to obtain the quantitative value of the offset from the light intensity measurements. They are also not applicable to a measurement system that uses only specular, i.e., zeroth-order, diffracted light.
This invention is distinct from the prior art in that it teaches measuring overlay by scatterometry. Measurements of structural parameters of a diffracting structure from optical characterization are now well known in the art as scatterometry. With such methods, a measurement sample is illuminated with optical radiation, and the sample properties are determined by measuring characteristics of the scattered radiation (e.g., intensity, phase, polarization state, or angular distribution). A diffracting structure consists of one or more layers that may have lateral structure within the illuminated and detected area, resulting in diffraction of the reflected (or transmitted) radiation. If the lateral structure dimensions are smaller than the illuminating wavelengths, then diffracted orders other than the zeroth order may all be evanescent and not directly observable. But the structure geometry can nevertheless significantly affect the zeroth-order reflection, making it possible to make optical measurements of structural features much smaller than the illuminating wavelengths.
In one type of measurement process, a microstructure is illuminated and the intensity of reflected or diffracted radiation is detected as a function of the radiation's wavelength, the incidence direction, the collection direction, or polarization state (or a combination of such factors). Direction is typically specified as a polar angle and azimuth, where the reference for the polar angle is the normal to the wafer and the reference for the azimuth is either some pattern(s) on the wafer or other marker, e.g., a notch or a flat for silicon wafers. The measured intensity data is then passed to a data processing machine that uses some model of the scattering from possible structures on the wafer. For example, the model may employ Maxwell's equations to calculate the theoretical optical characteristics as a function of measurement parameters (e.g., film thickness, line width, etc.), and the parameters are adjusted until the measured and theoretical intensities agree within specified convergence criteria. The initial parameter estimates may be provided in terms of an initial “seed” model of the measured structure. Alternatively, the optical model may exist as pre-computed theoretical characteristics as a function of one or more discretized measurement parameters, i.e., a “library”, that associates collections of parameters with theoretical optical characteristics. The “extracted” structural model has the structural parameters associated with the optical model which best fits the measured characteristics, e.g., in a least-squares sense.
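The library variant of this procedure can be illustrated with a minimal Python sketch. It assumes that the library is a list of (parameters, theoretical spectrum) pairs and that the fit criterion is the sum of squared differences; the parameter name `linewidth_nm` and all numerical values are made up for illustration and do not come from any real model.

```python
def sum_squared_error(measured, theoretical):
    """Least-squares misfit between measured and theoretical characteristics."""
    return sum((m - t) ** 2 for m, t in zip(measured, theoretical))


def best_library_match(measured, library):
    """Return the (parameters, spectrum) library entry that best fits the data.

    library: list of (params_dict, theoretical_spectrum) pairs, pre-computed
    for discretized values of the structural parameters.
    """
    return min(library, key=lambda entry: sum_squared_error(measured, entry[1]))


# Hypothetical three-entry library; a real one would come from a scattering model.
library = [
    ({"linewidth_nm": 100}, [0.30, 0.35, 0.40]),
    ({"linewidth_nm": 110}, [0.32, 0.38, 0.45]),
    ({"linewidth_nm": 120}, [0.35, 0.42, 0.50]),
]
measured = [0.31, 0.37, 0.44]

extracted_params, _ = best_library_match(measured, library)
print(extracted_params)
```

An iterative fit replaces the fixed list with repeated evaluation of the scattering model, adjusting the parameters from the seed values until the misfit satisfies the convergence criteria.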
Conrad (U.S. Pat. No. 5,963,329) is an example of the application of scatterometry to measure the line profile or topographical cross-section. The direct application of Maxwell's equations to diffracting structures is much more complex and time-consuming than for non-diffracting structures (e.g., unpatterned films). This can result in a considerable delay between data acquisition and result reporting, in the need to use a very simple physical model of the profile that may neglect significant features, or both.
Scheiner et al. (U.S. Pat. No. 6,100,985) teach a measurement method that is similar to that of Conrad, except that Scheiner's method uses a simplified, approximate optical model of the diffracting structure that does not involve direct numerical solution of Maxwell's equations. This avoids the complexity and calculation time of the direct numerical solution. However, the approximations inherent in the simplified model make it inadequate for grating structures that have period and linewidth dimensions comparable to or smaller than the illumination wavelengths.
In an alternative method taught by McNeil et al. (U.S. Pat. No. 5,867,276) the calculation time delay is substantially reduced by storing a multivariate statistical analysis model based on calibration data from a range of model structures. The calibration data may come from the application of Maxwell's equations to parameterized models of the structure. The statistical analysis, e.g., as taught in chemometrics, is applied to the measured diffraction characteristics and returns estimates of the parameters for the actual structure.
The measurement method taught by McNeil uses diffraction characteristics consisting of spectroscopic intensity data. A similar method can also be used with ellipsometric data, using ellipsometric parameters such as tan ψ, cos Δ in lieu of intensity data. For example, Xinhui Niu in “Specular Spectroscopic Scatterometry in DUV Lithography,” Proc. SPIE, vol. 3677, pp. 159–168, 1999, uses a library approach. The library method can be used to simultaneously measure multiple model parameters (e.g. linewidth, edge slope, film thickness).
In International (PCT) application publication No. WO 99/45340 (KLA-Tencor), Xu et al. disclose a method for measuring the parameters of a diffracting structure on top of laterally homogeneous, non-diffracting films. The disclosed method first constructs a reference database based on a priori information about the refractive index and film thickness of underlying films, e.g., from spectroscopic ellipsometry or reflectometry. The “reference database” has “diffracted light fingerprints” or “signatures” (either diffraction intensities, or alternatively ellipsometric parameters) corresponding to various combinations of grating shape parameters. The grating shape parameters associated with the signature in the reference database that matches the measured signature of the structure are then reported as the grating shape parameters of the structure.
Definition of Terms
An unbounded periodic structure is one that is invariant under a nonzero translation in a direction when there exists a minimum positive invariant translation in the said direction. Here we are concerned with structures that are periodic in directions (substantially) parallel to the surface of a wafer. Here ‘wafer’ is used to mean any manufactured object that is built by building up patterned, overlying layers. Silicon wafers for microelectronics are a good example, and there are many others, e.g., flat panel displays.
A one-dimensional (1D) periodic structure has one direction in which it is invariant for any translation. The lattice dimension is perpendicular to the invariant direction. The smallest distance of translation along the lattice dimension which yields invariance is the pitch of the grating. Two-dimensional gratings are also possible, with two lattice directions and pitches, as is well known. In this application, a periodic structure is understood to be a portion of an unbounded periodic structure. The periodic structure is understood to extend by more than one period along its lattice axes. A grating is a periodic structure. A diffraction grating is a grating used in a manner to interact with waves, in particular light waves. A 1D grating is also referred to as a “line grating”.
Upon reflection by or transmission through a diffraction grating, light propagates in discrete directions called Bragg orders. For a particular Bragg order m, the component of the wavevector along the lattice axis, k_{x,m}, differs from the same component of the wavevector of the incident wave by an integer multiple of the lattice wavenumber 2π/P. For a line grating,
\[
k_{x,m} = \frac{2 m \pi}{P} + \frac{2 \pi \sin\theta_I}{\lambda}, \qquad m = 0, \pm 1, \pm 2, \ldots
\]
\[
k_{z,m}^{2} = \left( \frac{2 n \pi}{\lambda} \right)^{2} - k_{x,m}^{2}
\]
where λ and θ_I are the wavelength and angle of the incident wave in vacuum (or something effectively like vacuum, e.g., air), and n is the refractive index of the transparent medium that separates the two gratings. P is the pitch of the grating. The x-axis is the lattice axis and the z-axis is perpendicular to the plane of the wafer. The Bragg orders are referenced by the integer m.
The Bragg orders for which k_{z,m}^2 < 0 are called evanescent, non-propagating, or cut-off. The evanescent Bragg orders have pure imaginary wavenumbers in the z direction. Hence, they exponentially decay as exp(−|Im(k_{z,m})| z) as a function of the distance z, measured from the diffraction grating along the z-axis.
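The classification of Bragg orders can be illustrated numerically with a short Python sketch. It evaluates the line-grating expressions for the x and z wavevector components given above; the function names, the example wavelength and pitches (in nm), and the cap m_max on the orders scanned are assumptions made for illustration.

```python
import math


def bragg_order(m, wavelength, pitch, theta_i_deg, n=1.0):
    """Return (k_x, k_z squared) for Bragg order m of a line grating.

    wavelength and pitch share the same length unit; theta_i_deg is the
    incidence angle in vacuum; n is the index of the separating medium.
    """
    k_x = (2 * math.pi * m / pitch
           + 2 * math.pi * math.sin(math.radians(theta_i_deg)) / wavelength)
    k_z_squared = (2 * math.pi * n / wavelength) ** 2 - k_x ** 2
    return k_x, k_z_squared


def propagating_orders(wavelength, pitch, theta_i_deg, n=1.0, m_max=10):
    """List the orders with k_z squared > 0 among m = -m_max .. m_max."""
    return [m for m in range(-m_max, m_max + 1)
            if bragg_order(m, wavelength, pitch, theta_i_deg, n)[1] > 0]


def decay_length(k_z_squared):
    """1/e amplitude decay length, 1/|Im(k_z)|, of an evanescent order."""
    if k_z_squared >= 0:
        raise ValueError("order is not evanescent")
    return 1.0 / math.sqrt(-k_z_squared)


# Normal incidence, 500 nm light, n = 1 (illustrative values).
print(propagating_orders(wavelength=500.0, pitch=400.0, theta_i_deg=0.0))
print(propagating_orders(wavelength=500.0, pitch=900.0, theta_i_deg=0.0))
print(decay_length(bragg_order(1, 500.0, 400.0, 0.0)[1]))
```

With a pitch smaller than the wavelength at normal incidence, only the zeroth order propagates, while the first orders decay over a distance of roughly a fraction of the wavelength from the grating.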
The polar angle θ and azimuth φ are defined as shown in FIG. 3, with respect to the lateral or in-plane directions x and y, and the vertical or out of plane direction z. The figure applies generally to objects that are substantially planar, or locally to curved objects. The orientation of the lateral directions x and y may correspond to physical features on the wafer, e.g. structures 5 deposited or formed on the wafer (substrate), or actually part of the substrate, e.g., a wafer notch.
The spot of an optical instrument is the region on a sample whose optical characteristics are detected by the instrument. The measurement system can translate the location of the spot on the sample, and focus it, as is well known in the art.