This invention relates to measuring the pattern overlay alignment accuracy of a pair of patterned layers on a semiconductor wafer, possibly separated by one or more layers, made by two or more lithography steps during the manufacture of semiconductor devices.
Manufacturing semiconductor devices involves depositing and patterning several layers overlaying each other. For example, gate interconnects and gates of a CMOS integrated circuit have layers with different patterns, which are produced by different lithography stages. The tolerance of alignment of the patterns at each of these layers can be smaller than the width of the gate. At the time of this writing, the smallest linewidth that can be mass produced is 130 nm. The state of the art mean + 3σ alignment accuracy is 30 nm (Nikon KrF Step-and-Repeat Scanning System NSR-S205C, July 2000).
Overlay metrology is the art of checking the quality of alignment after lithography. Overlay error is defined as the offset between two patterned layers from their ideal relative position. Overlay error is a vector quantity with two components in the plane of the wafer. Perfect overlay and zero overlay error are used synonymously. Depending on the context, overlay error may signify one of the components or the magnitude of the vector.
Overlay metrology saves subsequent process steps that would be built on a faulty foundation in case of an alignment error. Overlay metrology provides the information that is necessary to correct the alignment of the stepper-scanner and thereby minimize overlay error on subsequent wafers. Moreover, overlay errors detected on a given wafer after exposing and developing the photoresist can be corrected by removing the photoresist and repeating the lithography step on a corrected stepper-scanner. If the measured error is minor, parameters for subsequent steps of the lithography process could be adjusted based on the overlay metrology to avoid excursions. If overlay error is measured subsequently, e.g., after the etch step that typically follows develop, it can be used to “scrap” severely mis-processed wafers, or to adjust process equipment for better performance on subsequent wafers.
Prior overlay metrology methods use built-in test patterns etched or otherwise formed into or on the various layers during the same plurality of lithography steps that form the patterns for circuit elements on the wafer. One typical pattern, called “box-in-box”, consists of two concentric squares, formed on a lower and an upper layer, respectively. “Bar-in-bar” is a similar pattern with just the edges of the “boxes” demarcated, and broken into disjoint line segments, as shown in FIG. 1. The outer bars 2 are associated with one layer and the inner bars 4 with another. Typically one is the upper pattern and the other is the lower pattern, e.g., outer bars 2 on a lower layer, and inner bars 4 on the top. However, with advanced processes the topographies are complex and not truly planar, so the designations “upper” and “lower” are ambiguous. Typically they correspond to earlier and later in the process. There are other patterns used for overlay metrology. The squares or bars are formed by lithographic and other processes used to make planar structures, e.g., chemical-mechanical planarization (CMP). Currently, the patterns for the boxes or bars are stored on lithography masks and projected onto the wafer. Other methods for putting the patterns on the wafer are possible, e.g., direct electron beam writing from computer memory, etc.
In one form of the prior art, a high performance microscope imaging system combined with image processing software estimates overlay error for the two layers. The image processing software uses the intensity of light at a multitude of pixels. Obtaining the overlay error accurately requires a high quality imaging system and means of focusing it. Some of this prior art is reviewed by the article “Semiconductor Pattern Overlay”, by Neal T. Sullivan, Handbook of Critical Dimension Metrology and Process Control: Proceedings of Conference held 28-29 Sep. 1993, Monterey, Calif., Kevin M. Monahan, ed., SPIE Optical Engineering Press, vol. CR52, pp. 160-188. A. Starikov, D. J. Coleman, P. J. Larson, A. D. Lapata, W. A. Muth, in “Accuracy of Overlay Measurements: Tool and Mark Asymmetry Effects,” Optical Engineering, vol. 31, 1992, p. 1298, teach measuring overlay at one orientation, rotating the wafer by 180°, measuring overlay again and attributing the difference to tool errors and overlay mark asymmetry.
One requirement for the optical system is very stable positioning of the optical system with respect to the sample. Relative vibration would blur the image and degrade the performance. This is a difficult requirement to meet for overlay metrology systems that are integrated into a process tool, like a lithography track. The tool causes potentially large accelerations (vibrations), e.g., due to high acceleration wafer handlers. The tight space requirements for integration preclude bulky isolation strategies.
The imaging-based overlay measurement precision can be two orders of magnitude smaller than the wavelength of the light used to image the target patterns of concentric boxes or bars. At such small length scales, the image does not have well determined edges because of diffraction. The determination of the edge, and therefore the overlay measurement, is affected by any factor that changes the diffraction pattern. Chemical-mechanical planarization (CMP) is a technique commonly used to planarize the wafer surface at intermediate process steps before depositing more material. CMP can render the profile of the trenches or lines that make up the overlay measurement targets asymmetric. FIG. 2 illustrates an overlay target feature 2 which is a trench filled with metal. Surface 3 is planarized by CMP. The CMP process erodes the surface of the overlay mark 2 in an asymmetric manner. The overlay target 2 is compared subsequently to target feature 4 in the overlying layer, which could be, e.g., photoresist of the next lithography step. The asymmetry in target feature 2 changes the diffraction pattern, thus potentially causing an overlay measurement error.
In U.S. Pat. No. 4,757,207, Chappelow et al. teach obtaining the quantitative value of the overlay offset from the reflectance of targets that consist of identical line gratings overlaid upon each other on a planar substrate. Each period of the target consists of four types of film stacks: lines of the lower grating overlapping with the spaces of the upper grating, spaces of the lower grating overlapping with the lines of the upper grating, lines of the lower and upper gratings overlapping, and spaces of the lower and upper gratings overlapping. Chappelow et al. approximate the reflectance of the overlapping gratings as the average of the reflectances of the four film stacks weighted by their area fractions. This approximation, which neglects diffraction, has some validity when the lines and spaces are larger than the largest wavelength of the reflectometer. The reflectance of each of the four film stacks is measured at a so-called macro-site close to the overlay target. Each macro-site has a uniform film stack over a region that is larger than the measurement spot of the reflectometer. A limitation of U.S. Pat. No. 4,757,207 is that spatial variations in the film thickness that are caused by CMP and resist loss during lithography will cause erroneous overlay measurements. Another limitation is that reflectance is measured at eight sites in one overlay metrology target, which increases the size of the target and decreases the throughput of the measurement. Another limitation is that the lines and spaces need to be large compared to the wavelength, but small compared to the measurement spot, which limits the accuracy and precision of the measurement. Another limitation is that the light intensity is measured by a single photodiode: the optical properties of the sample are not measured as a function of wavelength, angle of incidence, or polarization, which limits the precision of the measurement.
The “average reflectivity” approximation for the interaction of light with gratings, as employed by U.S. Pat. No. 4,757,207, greatly simplifies the problem of light interaction with a grating but neglects much of the diffraction physics. The model used to interpret the data has “four distinct regions whose respective reflectivities are determined by the combination of layers formed by the substrate and the overlaid patterns and by the respective materials in the substrate and patterns.” Eq. 1 in the patent clearly indicates that these regions do not interact, i.e., via diffraction, as the total reflectivity of the structure is a simple average of the four reflectivities with area weighting.
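The area-weighted average described above can be illustrated with a minimal sketch. It is not taken from U.S. Pat. No. 4,757,207; the function names, the dictionary keys, and the assumption of ideal rectangular lines no wider than the pitch are hypothetical choices made only for illustration. The sketch computes the area fractions of the four film stacks in one period and averages caller-supplied stack reflectances with those weights.

```python
def _overlap(a0, a1, b0, b1):
    """Length of the intersection of the intervals [a0, a1) and [b0, b1)."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def area_fractions(pitch, w_lower, w_upper, offset):
    """Area fractions of the four film stacks in one grating period.

    Assumes (hypothetically) ideal rectangular lines: the lower line
    occupies [0, w_lower) and the upper line [offset, offset + w_upper),
    both repeated with the same pitch.
    """
    if not (0 < w_lower <= pitch and 0 < w_upper <= pitch):
        raise ValueError("line widths must be positive and no larger than the pitch")
    s = offset % pitch  # reduce the offset to one period
    # The upper line may wrap around the period boundary, so also test
    # the copy shifted by one pitch to the left.
    both = sum(
        _overlap(0.0, w_lower, s + k * pitch, s + k * pitch + w_upper)
        for k in (-1, 0)
    )
    lengths = {
        "both": both,                                  # lower line under upper line
        "lower_only": w_lower - both,                  # lower line under upper space
        "upper_only": w_upper - both,                  # lower space under upper line
        "neither": pitch - w_lower - w_upper + both,   # space under space
    }
    return {name: length / pitch for name, length in lengths.items()}


def averaged_reflectance(fractions, stack_reflectance):
    """Area-weighted mean of the four stack reflectances (no diffraction)."""
    return sum(fractions[name] * stack_reflectance[name] for name in fractions)
```

Under these assumptions, identical symmetric gratings give the same area fractions for offsets of +d and −d, so the averaged reflectance alone does not reveal the sign of the offset.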
IBM Technical Disclosure Bulletin 90A 60854/GE8880210, March 1990, pp. 170-174, teaches measuring offset between two patterned layers by overlapping gratings. There are four sets of overlapping gratings to measure the x-offset and another four sets of overlapping gratings to measure the y-offset. The four sets of gratings, which are measured by a spectroscopic reflectometer, have offset biases of 0, 1/4, 1/2, and 3/4 pitch. The spectra are differenced as $S_a = S_0 - S_{1/2}$, $S_b = S_{1/4} - S_{3/4}$; a weighted average of the difference spectra is evaluated: $I_a = \langle w, S_a \rangle$, $I_b = \langle w, S_b \rangle$, where $w$ is a weighting function; and the ratio $\min(I_a, I_b)/\max(I_a, I_b)$ is used to look up the offset/pitch ratio from a table. GE8880210 relies on “well known film thickness algorithms” to model the optical interactions. Such algorithms treat the electromagnetic boundary conditions at the interfaces between the planar layers or films. If the direction perpendicular to the films is the z direction, the boundaries between the films are at constant $z = z_n$, where $z_n$ is the location of the nth boundary. Such algorithms, and hence GE8880210, do not use a model that accounts for the diffraction of light by the gratings or the multiple scattering of the light by the two gratings, and they have no provision to handle non-rectangular line profiles.
In U.S. Pat. No. 6,150,231, Muller et al. teach measuring overlay by Moiré patterns. The Moiré pattern is formed by overlapping grating patterns, one grating on the lower level, another on the upper level. The two grating patterns have different pitches. The Moiré pattern approach requires imaging the overlapping gratings and estimating their offset from the spatial characteristics of the image.
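The differencing and ratio steps of the bulletin, as summarized above, can be sketched as follows. This is an illustrative reading of that summary only: the function names are hypothetical, the weighted average is taken as a plain inner product over sampled wavelengths, and the lookup table must be supplied by the caller because its contents are not given here.

```python
def inner_product(w, s):
    """<w, s>: weighted sum of a spectrum sampled at discrete wavelengths."""
    if len(w) != len(s):
        raise ValueError("weights and spectrum must have the same length")
    return sum(wi * si for wi, si in zip(w, s))


def offset_metric(s_0, s_quarter, s_half, s_three_quarter, w):
    """Ratio min(Ia, Ib) / max(Ia, Ib) built from the four biased gratings."""
    s_a = [a - b for a, b in zip(s_0, s_half)]                # Sa = S0 - S1/2
    s_b = [a - b for a, b in zip(s_quarter, s_three_quarter)]  # Sb = S1/4 - S3/4
    i_a = inner_product(w, s_a)
    i_b = inner_product(w, s_b)
    denominator = max(i_a, i_b)
    if denominator == 0:
        raise ZeroDivisionError("max(Ia, Ib) is zero; the ratio is undefined")
    return min(i_a, i_b) / denominator


def nearest_table_entry(metric, table):
    """Return the offset/pitch value whose tabulated metric is closest.

    `table` is a caller-supplied list of (metric, offset_over_pitch) pairs;
    a nearest-neighbour lookup is an assumption, not the bulletin's method.
    """
    return min(table, key=lambda row: abs(row[0] - metric))[1]
```

Because the same weighting function appears in both the numerator and the denominator, normalizing the weights would not change the ratio.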
In U.S. Pat. Nos. 6,023,338 and 6,079,256, Bareket teaches an alternative approach in which two complementary periodic grating structures are produced on the two subsequent layers that require alignment. The two periodic structures are arranged adjacent to and in fixed positions relative to one another, such that there is no overlap of the two structures. The two gratings are scanned, either optically or with a stylus, so as to detect the individual undulations of the gratings as a function of position. The overlay error is obtained from the spatial phase shift between the undulations of the two gratings.
Smith et al. in U.S. Pat. No. 4,200,395, and Ono in U.S. Pat. No. 4,332,473 teach aligning a wafer and a mask by using overlapping diffraction gratings and measuring higher order, i.e., non-specular, diffracted light. One diffraction grating is on the wafer and another one is on the mask. The overlapping gratings are illuminated by normally incident light and the intensities of the positive and negative diffracted orders, e.g., the 1st and −1st orders, are compared. The difference between the intensities of the 1st and −1st diffracted orders provides a feedback signal which can be used to align the wafer and the mask. These inventions are similar to the present one in that they use overlapping gratings on two layers. However, U.S. Pat. Nos. 4,200,395 and 4,332,473 are applicable to mask alignment but not to overlay metrology. They do not teach how to obtain the quantitative value of the offset from the light intensity measurements. U.S. Pat. Nos. 4,200,395 and 4,332,473 are not applicable to a measurement system that only uses specular, i.e., zeroth-order diffracted light.
This invention is distinct from the prior art in that it teaches measuring overlay by scatterometry. Measurements of structural parameters of a diffracting structure from optical characterization are now well known in the art as scatterometry. With such methods, a measurement sample is illuminated with optical radiation, and the sample properties are determined by measuring characteristics of the scattered radiation (e.g., intensity, phase, polarization state, or angular distribution). A diffracting structure consists of one or more layers that may have lateral structure within the illuminated and detected area, resulting in diffraction of the reflected (or transmitted) radiation. If the lateral structure dimensions are smaller than the illuminating wavelengths, then diffracted orders other than the zeroth order may all be evanescent and not directly observable. But the structure geometry can nevertheless significantly affect the zeroth-order reflection, making it possible to make optical measurements of structural features much smaller than the illuminating wavelengths.
In one type of measurement process, a microstructure is illuminated and the intensity of reflected or diffracted radiation is detected as a function of the radiation's wavelength, the incidence direction, the collection direction, or polarization state (or a combination of such factors). Direction is typically specified as a polar angle and azimuth, where the reference for the polar angle is the normal to the wafer and the reference for the azimuth is either some pattern(s) on the wafer or other marker, e.g., a notch or a flat for silicon wafers. The measured intensity data is then passed to a data processing machine that uses some model of the scattering from possible structures on the wafer. For example, the model may employ Maxwell's equations to calculate the theoretical optical characteristics as a function of measurement parameters (e.g., film thickness, line width, etc.), and the parameters are adjusted until the measured and theoretical intensities agree within specified convergence criteria. The initial parameter estimates may be provided in terms of an initial “seed” model of the measured structure. Alternatively, the optical model may exist as pre-computed theoretical characteristics as a function of one or more discretized measurement parameters, i.e., a “library”, that associates collections of parameters with theoretical optical characteristics. The “extracted” structural model has the structural parameters associated with the optical model which best fits the measured characteristics, e.g., in a least-squares sense.
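The library approach described above can be sketched as a brute-force search for the entry with the smallest sum of squared residuals. The data layout (a list of parameter/spectrum pairs) and the function name are assumptions made for illustration; a practical library would be generated by a rigorous optical model and would normally be searched or interpolated more efficiently.

```python
def best_library_match(measured, library):
    """Return (parameters, residual) of the best least-squares library entry.

    `measured` is a sequence of measured optical characteristics (e.g.,
    reflectance at each wavelength). `library` is a list of
    (parameters, theoretical_characteristics) pairs sampled on the same grid.
    """
    if not library:
        raise ValueError("library is empty")

    def sum_squared_error(theory):
        if len(theory) != len(measured):
            raise ValueError("library entry and measurement differ in length")
        return sum((m - t) ** 2 for m, t in zip(measured, theory))

    # Score every entry once, then keep the one with the smallest residual.
    scored = [(sum_squared_error(theory), params) for params, theory in library]
    residual, params = min(scored, key=lambda item: item[0])
    return params, residual
```

The returned residual can be compared against a convergence criterion to decide whether the extracted parameters are acceptable or whether a finer library is needed.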
Conrad (U.S. Pat. No. 5,963,329) is an example of the application of scatterometry to measure the line profile or topographical cross-sections. The direct application of Maxwell's equations to diffracting structures, in contrast to non-diffracting structures (e.g., unpatterned films), is much more complex and time-consuming, possibly resulting in either a considerable time delay between data acquisition and result reporting and/or the need to use a physical model of the profile which is very simple and possibly neglects significant features.
Scheiner et al. (U.S. Pat. No. 6,100,985) teaches a measurement method that is similar to that of Conrad, except that Scheiner's method uses a simplified, approximate optical model of the diffracting structure that does not involve direct numerical solution of Maxwell's equations. This avoids the complexity and calculation time of the direct numerical solution. However, the approximations inherent in the simplified model make it inadequate for grating structures that have period and linewidth dimensions comparable to or smaller than the illumination wavelengths.
In an alternative method taught by McNeil et al. (U.S. Pat. No. 5,867,276) the calculation time delay is substantially reduced by storing a multivariate statistical analysis model based on calibration data from a range of model structures. The calibration data may come from the application of Maxwell's equations to parameterized models of the structure. The statistical analysis, e.g., as taught in chemometrics, is applied to the measured diffraction characteristics and returns estimates of the parameters for the actual structure.
The measurement method taught by McNeil uses diffraction characteristics consisting of spectroscopic intensity data. A similar method can also be used with ellipsometric data, using ellipsometric parameters such as tan ψ and cos Δ in lieu of intensity data. For example, Xinhui Niu in “Specular Spectroscopic Scatterometry in DUV Lithography,” Proc. SPIE, vol. 3677, pp. 159-168, 1999, uses a library approach. The library method can be used to simultaneously measure multiple model parameters (e.g., linewidth, edge slope, film thickness).
In International (PCT) application publication no. WO 99/45340 (KLA-Tencor), Xu et al. disclose a method for measuring the parameters of a diffracting structure on top of laterally homogeneous, non-diffracting films. The disclosed method first constructs a reference database based on a priori information about the refractive index and film thickness of underlying films, e.g., from spectroscopic ellipsometry or reflectometry. The “reference database” has “diffracted light fingerprints” or “signatures” (either diffraction intensities, or alternatively ellipsometric parameters) corresponding to various combinations of grating shape parameters. The grating shape parameters associated with the signature in the reference database that matches the measured signature of the structure are then reported as the grating shape parameters of the structure.
Definition of Terms
An unbounded periodic structure is one that is invariant under a nonzero translation in a direction when there exists a minimum positive invariant translation in the said direction. Here we are concerned with structures that are periodic in directions (substantially) parallel to the surface of a wafer. Here ‘wafer’ is used to mean any manufactured object that is built by building up patterned, overlying layers. Silicon wafers for microelectronics are a good example, and there are many others, e.g., flat panel displays.
A one-dimensional (1D) periodic structure has one direction in which it is invariant for any translation. The lattice dimension is perpendicular to the invariant direction. The smallest distance of translation along the lattice dimension which yields invariance is the pitch of the grating. Two-dimensional gratings are also possible, with two lattice directions and pitches, as is well known. In this application, a periodic structure is understood to be a portion of an unbounded periodic structure. The periodic structure is understood to extend by more than one period along its lattice axes. A grating is a periodic structure. A diffraction grating is a grating used in a manner to interact with waves, in particular light waves. A 1D grating is also referred to as a “line grating”.
Upon reflection by or transmission through a diffraction grating, light propagates in discrete directions called Bragg orders. For a particular Bragg order $m$, the component of the wavevector along the lattice axis, $k_{x,m}$, differs from the same component of the wavevector of the incident wave by an integer multiple of the lattice wavenumber $2\pi/P$. For a line grating,

$$k_{x,m} = \frac{2\pi m}{P} + \frac{2\pi \sin\theta_I}{\lambda}, \qquad m = 0, \pm 1, \pm 2, \ldots$$

$$k_{z,m}^2 = \left(\frac{2\pi n}{\lambda}\right)^2 - k_{x,m}^2$$
where $\lambda$ and $\theta_I$ are the wavelength and angle of the incident wave in vacuum (or something effectively like vacuum, e.g., air), and $n$ is the refractive index of the transparent medium that separates the two gratings. $P$ is the pitch of the grating. The x-axis is the lattice axis and the z-axis is perpendicular to the plane of the wafer. The Bragg orders are referenced by the integer $m$. The Bragg orders for which $k_{z,m}^2 < 0$ are called evanescent, non-propagating, or cut-off. The evanescent Bragg orders have pure imaginary wavenumbers in the z direction. Hence, they exponentially decay as $\exp(-|\mathrm{Im}(k_z)|\,z)$ as a function of the distance $z$, measured from the diffraction grating along the z-axis.
The polar angle $\theta$ and azimuth $\phi$ are defined as shown in FIG. 3, with respect to the lateral or in-plane directions x and y, and the vertical or out of plane direction z. The figure applies generally to objects that are substantially planar, or locally to curved objects. The orientation of the lateral directions x and y may correspond to physical features on the wafer, e.g., structures 5 deposited or formed on the wafer (substrate), or actually part of the substrate, e.g., a wafer notch.
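As a worked illustration of the definitions above, the following sketch classifies the Bragg orders of a line grating as propagating or evanescent in the medium of refractive index n. The function name, the returned dictionary keys, and the example values used afterwards are hypothetical; only the two formulas come from the text.

```python
import math


def bragg_orders(wavelength, pitch, theta_i_deg, n=1.0, m_max=3):
    """Tabulate k_x, k_z^2 and the decay length for orders -m_max..m_max.

    wavelength and pitch must share the same length unit; theta_i_deg is
    the angle of incidence in vacuum, in degrees.
    """
    k0 = 2.0 * math.pi / wavelength
    kx_incident = k0 * math.sin(math.radians(theta_i_deg))
    orders = []
    for m in range(-m_max, m_max + 1):
        kx = 2.0 * math.pi * m / pitch + kx_incident
        kz2 = (n * k0) ** 2 - kx ** 2
        propagating = kz2 > 0
        # Evanescent orders decay as exp(-|Im(kz)| z); report 1/|Im(kz)|.
        decay_length = None
        if kz2 < 0:
            decay_length = 1.0 / math.sqrt(-kz2)
        orders.append({
            "m": m,
            "kx": kx,
            "kz2": kz2,
            "propagating": propagating,
            "decay_length": decay_length,
        })
    return orders


def propagating_orders(wavelength, pitch, theta_i_deg, n=1.0, m_max=3):
    """List the order indices m for which k_z^2 > 0."""
    return [o["m"] for o in bragg_orders(wavelength, pitch, theta_i_deg, n, m_max)
            if o["propagating"]]
```

For example, with an assumed wavelength of 500 nm, a pitch of 400 nm, and normal incidence, only the zeroth order propagates in vacuum (n = 1), whereas the ±1 orders also propagate inside a medium with n = 1.5.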
The spot of an optical instrument is the region on a sample whose optical characteristics are detected by the instrument. The measurement system can translate the location of the spot on the sample, and focus it, as is well known in the art.
The present invention measures the overlay error of layers on a wafer with low-resolution optics. The basic overlay metrology target used in the present invention comprises a pair of overlapping diffraction gratings, i.e., a lower grating on a lower (or earlier formed) layer and an upper (or later formed) grating. The spot of the optical instrument preferably covers many periods of the gratings and it does not necessarily resolve the lines of the grating. The overlay error is measured by scatterometry, the measurement of optical characteristics, such as reflectance or ellipsometric parameters, as functions of one or more independent variables, e.g., wavelength, polar or azimuthal angles of incidence or collection, polarization, or some combination thereof.
It is an object of the present invention to use scatterometry to accurately measure overlay error. It is also an object of the invention that this accurate overlay measurement be obtained even when the profile of the grating lines has been altered or rendered asymmetric by a process such as chemical-mechanical planarization. An instrument meeting these objectives has utility in standard planar/photo-lithographic technology used for microelectronics manufacture, as well as other technologies using multiple patterned layers. This has the advantage that the same measurement hardware used for other optical measurements, e.g., line profiles or film thicknesses, can be used for another critical measurement, that of overlay.
The method includes the steps of laying down a first grating during a first step of manufacturing (making) a planar structure, laying down a second grating during a second manufacturing step so that the second grating substantially overlaps the first grating (laterally, in x and y), then illuminating at least a portion of the region of overlap, detecting radiation that has interacted with both gratings, and inverting for the offset between the gratings as a parameter of a model. The critical dimension (CD) and line profile also may be measured, simultaneously or with additional, similar measuring and data processing steps.
It is another object of the present invention to describe an apparatus for practicing the above method. The apparatus comprises an instrument receiving a sample and including a source of illumination and a detector that detects light which has interacted with the sample. The sample comprises a first grating fabricated at one stage of making a planar structure and characterized by a first pitch, a second grating with a second, possibly substantially identical, pitch that is formed during a second stage such that the second grating substantially overlaps the first grating in the lateral dimensions. The pitches of the gratings and the parameters of the instrument are chosen such that some energy in one or more non-zero orders diffracted by one of the gratings propagates in the sample media between the two gratings and reaches the other grating. The instrument is suitable for also measuring CD and line profile, as well as the overlay measurement mentioned above.
It is understood that ‘optical’ means employing one or more wavelengths of electromagnetic radiation in the UV, visible, or infrared portions of the spectrum. It is also understood that each Bragg order has a range of propagation angle and a range of wavelength, given the nature of the instrument, e.g., numerical aperture (NA) and detector or source wavelength resolution.
It is another object of the present invention to measure overlay error with an optical instrument integrated into a process tool. This method and apparatus overcome the difficulties associated with vibrations caused by the process tool and the limited space available for vibration damping. The apparatus comprises a process tool with at least one process chamber and a sample handler, an optical system in operative communication with the process tool, and a computer equipped with an inverse model for interaction of light between two gratings, where at least one parameter of the model is a lateral offset between the two gratings.
It is another object of the present invention to measure the overlay error by comparing the optical characteristics of grating pairs with substantially different perfect-overlay offsets. This reduces the dependence of the measurements on ancillary properties of the sample. It also reduces the burden on inverse scattering calculations.
It is another aspect of the present invention to increase the range of unambiguous overlay error measurement from overlaying gratings. One approach is to offset symmetric gratings by one fourth of the grating pitch when the overlay error is zero, so that positive and negative overlay errors have the least ambiguity, regardless of the optical system. Another approach to extend the range of unambiguously detectable overlay errors is to make at least one of the gratings in the pair substantially asymmetric, that is to have the unit cell of its pattern asymmetric. Another approach is to combine a scatterometry measurement of offset with an imaging measurement of offset (similar to the prior art, e.g., using box-in-box). A fourth approach is to have grating pairs with different pitches, preferably in a substantially irrational ratio, to measure the same component of overlay error. These four approaches may be used either separately or in combination to extend the range of unambiguously detectable overlay errors.
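The benefit of the quarter-pitch bias can be illustrated with a deliberately simplified toy model. The model is an assumption, not the invention's optical model: for symmetric gratings the measured quantity is taken to be an even, pitch-periodic function of the offset, represented here by a single cosine term with arbitrary coefficients.

```python
import math


def toy_signal(offset, pitch, mean=1.0, amplitude=0.5):
    """Toy response of symmetric overlapping gratings (assumed form).

    Even in the offset and periodic with the pitch; the coefficients
    are arbitrary illustration values.
    """
    return mean + amplitude * math.cos(2.0 * math.pi * offset / pitch)


def responses(nominal_offset, error, pitch):
    """Signals for overlay errors of -error and +error about a nominal bias."""
    return (toy_signal(nominal_offset - error, pitch),
            toy_signal(nominal_offset + error, pitch))


if __name__ == "__main__":
    pitch, error = 400.0, 20.0  # hypothetical values in nm
    print("zero bias:         ", responses(0.0, error, pitch))
    print("quarter-pitch bias:", responses(pitch / 4.0, error, pitch))
```

In this toy model, a zero nominal offset gives identical signals for positive and negative errors, so the sign cannot be recovered. With a nominal offset of one fourth of the pitch the signal varies monotonically with the error for errors smaller in magnitude than a quarter pitch, so the two signs are distinguishable.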