1. Field of the Invention
The present invention relates to a method and apparatus for aligning each of a plurality of processing areas (shot areas, chip patterns) aligned on a substrate to a predetermined reference position and, more particularly, to an alignment method and apparatus suitable for an exposure apparatus used in a lithography process in the manufacture of semiconductor elements and liquid crystal display elements.
2. Related Background Art
In a step-and-repeat exposure apparatus, a step-and-scan exposure apparatus, a wafer prober, a laser repair apparatus, or the like, each of a plurality of chip pattern areas (shot areas) aligned on a substrate must be aligned, with very high accuracy, to a predetermined reference point (e.g., a process point of each apparatus) on a static coordinate system for defining the moving position of the substrate. In particular, in the exposure apparatus, when a substrate (a semiconductor wafer, a glass plate, or the like) is aligned to an exposure position of a pattern formed on a mask or a reticle (to be referred to as a reticle hereinafter), a high accuracy of alignment must be stably maintained so as to prevent a decrease in yield caused by defective chips produced in a manufacturing process.
Normally, in a lithography process, 10 or more layers of circuit patterns (reticle patterns) are superposed and exposed on a wafer. In this case, if the accuracy of alignment (superposition) between any two adjacent layers is low, circuit characteristics deteriorate. More specifically, a chip cannot satisfy required characteristics. In the worst case, the chip becomes a defective product and decreases the yield. Thus, in an exposure process, an alignment mark is provided for each of a plurality of shot areas on the wafer, and the mark position (coordinate value) is detected with reference to a reticle pattern to be superposition-exposed. Thereafter, wafer alignment for aligning one shot area on the wafer to the reticle pattern is performed on the basis of the mark position information.
The wafer alignment can be roughly classified into two methods. One method is a die-by-die (D/D) alignment method for detecting an alignment mark for each shot area on a wafer and performing alignment. The other method is a global alignment method for obtaining a shot alignment rule by detecting alignment marks of some shot areas on a wafer, and performing alignment of the shot areas. At present, a device manufacturing line mainly adopts the global alignment method in consideration of throughput. In particular, as disclosed in, e.g., U.S. Pat. No. 4,780,617, an enhanced global alignment (EGA) method for specifying a shot alignment rule on a wafer with a high accuracy by a statistical technique is now popular.
In the EGA method, the coordinate positions of only a plurality of shot areas (three or more areas are required, normally about 10 to 15 areas) selected as specific shot areas on a single wafer are measured. After the coordinate positions (shot alignment) of all the shot areas on the wafer are calculated from these measurement values using statistical calculation processing (method of least squares), stepping of a wafer stage is uniquely executed according to the calculated shot alignment. The EGA method requires only a short measurement time, and an averaging effect for random measurement errors can be expected.
The statistical processing method used in the EGA method will be briefly described below. Designed alignment coordinates of $m$ ($m$ is an integer satisfying $m \geq 3$) specific shot areas (sample shots) on a wafer are represented by $(X_n, Y_n)$ ($n = 1, 2, \ldots, m$), and a linear model given by the following equation is assumed for a shift $(\Delta X_n, \Delta Y_n)$ from the designed alignment coordinates. ##EQU1##
Furthermore, if actual alignment coordinates (measurement values) of the $m$ sample shots are represented by $(\Delta x_n, \Delta y_n)$, a square sum $E$ of residuals obtained upon application of this model is expressed by:
$$E = \sum_{n=1}^{m} \left\{ (\Delta x_n - \Delta X_n)^2 + (\Delta y_n - \Delta Y_n)^2 \right\} \qquad (2)$$
Thus, parameters a, b, c, d, e, and f for minimizing this equation need only be obtained. In the EGA method, the alignment coordinates of all shot areas on a wafer are calculated on the basis of the parameters a to f calculated as described above and the designed alignment coordinates.
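The least-squares step described above can be illustrated with a minimal sketch. The sketch assumes that the linear model takes the common six-parameter form dX = a*X + b*Y + e, dY = c*X + d*Y + f; this form, the function names `solve3`, `fit_ega`, and `predict_position`, and the representation of each measurement as a shift from the designed coordinates are assumptions made for illustration only and are not taken from the text. Under this assumed form, the x and y fits decouple into two three-parameter problems sharing one normal-equation matrix.

```python
# Sketch of an EGA-style least-squares fit (standard library only).
# ASSUMED model form (not confirmed by the source text):
#   dX = a*X + b*Y + e
#   dY = c*X + d*Y + f


def solve3(A, v):
    """Solve the 3x3 system A p = v by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, v)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            # A vanishing pivot typically means the sample shots are collinear.
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= factor * M[col][k]
    p = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = sum(M[r][k] * p[k] for k in range(r + 1, 3))
        p[r] = (M[r][3] - s) / M[r][r]
    return p


def fit_ega(design, measured_shift):
    """Return (a, b, c, d, e, f) minimizing the residual square sum E.

    design         -- list of designed coordinates (X_n, Y_n)
    measured_shift -- list of measured shifts (dx_n, dy_n), same order
    """
    if len(design) < 3 or len(design) != len(measured_shift):
        raise ValueError("need at least three sample shots with one shift each")
    rows = [(X, Y, 1.0) for X, Y in design]
    # Normal-equation matrix, shared by the x fit and the y fit.
    N = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    vx = [sum(r[i] * s[0] for r, s in zip(rows, measured_shift)) for i in range(3)]
    vy = [sum(r[i] * s[1] for r, s in zip(rows, measured_shift)) for i in range(3)]
    a, b, e = solve3(N, vx)
    c, d, f = solve3(N, vy)
    return a, b, c, d, e, f


def predict_position(params, X, Y):
    """Designed coordinates plus the modeled shift, for any shot area."""
    a, b, c, d, e, f = params
    return X + a * X + b * Y + e, Y + c * X + d * Y + f
```

In such a sketch, `fit_ega` would be called once per wafer with the sample-shot measurements, and `predict_position` would then be evaluated for every shot area, including those that were not measured.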
As described above, the EGA method processes shot alignment errors on a wafer as linear errors. In other words, the EGA calculation is a linear approximation. For this reason, the EGA method cannot cope with a variation in local alignment errors on a wafer, i.e., non-linear factors. In order to solve this problem, as disclosed in, e.g., U.S. Pat. No. 4,833,621, a so-called block-EGA (B-EGA) method has been proposed. In this method, at least three shot areas present in a local partial block on a wafer are designated as sample shots, and their coordinate positions are measured. Then, the EGA calculation (statistical calculation) is performed using this plurality of coordinate positions, thereby calculating coordinate positions (shot alignment) of all shot areas in the block. The B-EGA method is characterized in that sample shots to be used in the EGA calculation are changed in units of shot areas to be aligned. For example, three or more shot areas are designated as sample shots in order of proximity to a shot area to be aligned, and the measurement values of the designated sample shots are used. Thus, the variation (non-linearity) in local alignment errors on a wafer can be coped with.
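The proximity-based designation of sample shots described above can be sketched as follows. The function name `nearest_sample_shots`, the representation of shot areas as (X, Y) tuples, and the use of Euclidean distance between shot centers are assumptions for illustration; the text does not specify how proximity is evaluated.

```python
import math


def nearest_sample_shots(target, sample_shots, k=3):
    """Return the k sample shots closest to `target`, nearest first.

    target       -- designed coordinates (X, Y) of the shot area to be aligned
    sample_shots -- designed coordinates of all measured sample shots
    k            -- number of sample shots to designate (three or more)
    """
    if k < 3:
        raise ValueError("at least three sample shots are required")
    if len(sample_shots) < k:
        raise ValueError("not enough measured sample shots")

    def distance(shot):
        return math.hypot(shot[0] - target[0], shot[1] - target[1])

    # sorted() is stable, so equidistant shots keep their input order.
    return sorted(sample_shots, key=distance)[:k]
```

Because the selection is repeated for every shot area to be aligned, its cost grows with the product of the number of shot areas and the number of measured sample shots. Note also that the nearest shots may happen to lie on one line, in which case a six-parameter linear fit to them would be degenerate.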
However, in the above-mentioned prior art, when processing for selecting sample shots to be used in the EGA calculation in units of shot areas to be aligned is executed by a computer, it requires a huge calculation amount. Also, it is difficult to optimize selection of sample shots in units of shot areas in a block. Therefore, although the B-EGA method can cope with a variation in local alignment errors (non-linear distortion), the accuracy of alignment obtained by this method cannot sufficiently satisfy the required accuracy. Furthermore, in the B-EGA method, since sample shots are changed in units of shot areas, the number of sample shots per wafer is considerably increased, and the processing time per wafer is prolonged, resulting in a decrease in throughput.
When a wafer is placed on a wafer stage via a holder (holding member), if the wafer is greatly warped due to, e.g., a heat treatment, the peripheral portion of the wafer is chucked by the holder, but its central portion cannot be chucked by the holder and is lifted therefrom. Therefore, shot areas on the wafer suffering from the above-mentioned phenomenon, in particular, shot areas near the central portion of the wafer, are apparently laterally shifted (displaced) in a direction away from the center of the wafer relative to the corresponding shot areas on a wafer whose entire surface is chucked by the holder.
When the B-EGA method is applied to a wafer which suffers from a non-linear distortion caused by the above-mentioned phenomenon, if a lifted portion of the wafer is accurately determined, a decrease in accuracy of alignment can be prevented to some extent in correspondence with the non-linear distortion. However, in this case, the same problems (increases in calculation amount and the number of sample shots, and the like) as described above are posed, and it is difficult to specify the lifted (bulged) portion of the wafer in practice. More specifically, in the B-EGA method, since a plurality of shot areas on a wafer cannot be optimally grouped into blocks, it is difficult to obtain a desired accuracy of alignment even when the B-EGA method is applied.
If, for example, a non-linear approximation using a high-order function is applied to a wafer suffering from a non-linear distortion in place of the linear approximation as in the EGA method, a decrease in accuracy of alignment can be prevented. However, in this case, the number of sample shots must be considerably increased as compared to the EGA method, and the mark measurement time is prolonged, resulting in a decrease in throughput.
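The growth in the number of sample shots can be made concrete with a short sketch. It assumes that the high-order function is a full polynomial of a given order in the two wafer coordinates, fitted separately for the x shift and the y shift; the text does not specify the function, so this choice and the name `min_sample_shots` are illustrative only. Such a polynomial has (order + 1)(order + 2)/2 coefficients per axis, and each sample shot contributes one measurement per axis.

```python
def min_sample_shots(order):
    """Minimum number of sample shots needed to determine a full bivariate
    polynomial of the given order for each axis (ASSUMED model family).

    A polynomial of order k in X and Y has (k + 1) * (k + 2) / 2 coefficients,
    and each sample shot supplies one x measurement and one y measurement.
    """
    if order < 1:
        raise ValueError("order must be 1 or greater")
    return (order + 1) * (order + 2) // 2


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, min_sample_shots(k))
```

For orders 1, 2, and 3 this gives 3, 6, and 10 sample shots respectively; order 1 reproduces the three-shot minimum of the linear EGA model. These are bare minimums with no redundancy, so retaining an averaging effect against random measurement errors would require still more shots.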
A projection exposure apparatus can use various methods as well as the above-mentioned D/D method, EGA method, and B-EGA method. Therefore, in the future, a plurality of methods (to be referred to as alignment modes hereinafter) should be selectively used in consideration of their features (merits). Thus, prior to actual exposure, test printing (superposition exposure) is performed on pilot wafers using each of the plurality of alignment modes, and an optimal alignment mode is selected (determined) on the basis of the test printing result (accuracy of superposition). However, this method requires pilot wafers, and test printing and measurement of the accuracy of superposition require much time, resulting in a low throughput of the apparatus.
Furthermore, in the projection exposure apparatus, marks on a wafer are detected by using an alignment sensor, and mark positions are determined by performing waveform processing with respect to the detection signals under predetermined processing conditions. In this case, a desired accuracy of superposition (alignment) cannot be obtained unless the signal processing condition is optimized in units of process wafers in accordance with the material of a wafer, the type of photoresist or underlayer, the formation conditions (e.g., shape and degree of unevenness) of alignment marks, and the like. Conventionally, an operator determines signal processing conditions for each wafer, by trial and error, on the basis of his/her experience. For this reason, it takes much time for optimization, and the load on the operator is heavy.
Especially in the EGA scheme, a desired accuracy of superposition cannot be obtained unless the arrangement (the number and positions) of sample shots is optimized. In order to solve such problems, test printing (superposition exposure) may be performed on pilot wafers in various sample shot arrangements by the EGA scheme, thus obtaining the optimal sample shot arrangement on the basis of the test results. However, this method requires a large number of pilot wafers, and takes much time to achieve optimization.
In addition, assume the accuracy of measurement of the alignment sensor is poor owing to the roughness and the like of a wafer surface. In this case, even if superposition exposure is performed by using the EGA scheme after the sample shot arrangement is optimized in the above-described manner, a desired accuracy of superposition cannot be obtained. That is, a sample shot arrangement determined on the basis of coordinate positions (measurement values) with poor reliability is not always optimized with respect to a wafer (shot arrangement), and hence a desired accuracy of superposition cannot be obtained with an apparently optimized sample shot arrangement.
In order to obtain a desired accuracy of superposition with an alignment sensor exhibiting low repeatability of measurement with respect to wafers, the averaging effect of the EGA scheme must be improved, that is, the number of sample shots must be increased as compared with a normal operation so that the averaging effect is optimized. However, in the above-described optimization method, a sample shot arrangement satisfying a desired accuracy of superposition is simply selected regardless of the repeatability of measurement of an alignment sensor. In some cases, therefore, the arrangement is simply optimized with a small number of sample shots even though an alignment sensor with poor repeatability of measurement is used. This operation amounts to only an apparent optimization of a sample shot arrangement. That is, in the conventional method, it is impossible to optimize a sample shot arrangement also in consideration of the repeatability of measurement of an alignment sensor. Even if, therefore, the repeatability of measurement of the alignment sensor deteriorates depending on a wafer, the resultant deterioration in accuracy of superposition cannot be prevented.
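The dependence of the averaging effect on the number of sample shots can be sketched under a simplifying assumption: the measurement errors are independent, zero-mean, and of equal standard deviation, and the error of the fitted result behaves like that of a simple mean, so that it falls as the inverse square root of the number of shots. An actual EGA fit also depends on the geometry of the sample shot arrangement, which this sketch ignores. The function names and the numerical values in the usage note are hypothetical.

```python
import math


def averaged_error(sigma, n):
    """Standard error of the mean of n independent measurements,
    each with standard deviation sigma (simplified stand-in for the
    averaging effect; arrangement geometry is ignored)."""
    if sigma < 0 or n < 1:
        raise ValueError("sigma must be non-negative and n at least 1")
    return sigma / math.sqrt(n)


def shots_needed(sigma, target):
    """Smallest number of sample shots (never fewer than three) for which
    averaged_error(sigma, n) does not exceed target."""
    if sigma < 0 or target <= 0:
        raise ValueError("sigma must be non-negative and target positive")
    return max(3, math.ceil((sigma / target) ** 2))
```

As a hypothetical example, a sensor repeatability of 30 (in arbitrary length units) averaged over 9 sample shots gives `averaged_error(30.0, 9)` equal to 10.0, and `shots_needed(30.0, 10.0)` returns 9. Under the same assumption, a sensor whose repeatability is twice as poor needs four times as many sample shots to reach the same target.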
Although it is considered that all the wafers of the same lot have substantially the same surface state, the repeatability of measurement of an alignment sensor may change in the use of wafers of different lots. For this reason, test printing needs to be performed in units of lots to optimize a sample shot arrangement or the above-mentioned signal processing condition, resulting in a great increase in operation time and load. Moreover, optimization cannot be performed in consideration of the repeatability of measurement of an alignment sensor without using process wafers. Therefore, process wafers specially used for measurement are required, resulting in a great reduction in yield and throughput.