1. Field of the Invention
The present invention is concerned with a position detecting method and apparatus, and an exposure method and apparatus including the position detecting method and apparatus, respectively. More particularly, the invention relates to a method and apparatus for detecting, by a method of correlation using a predetermined reference signal, the position of an alignment mark provided on a substrate such as a wafer or the like, and to an exposure method and apparatus adopting the position detecting method and apparatus, respectively.
2. Description of Related Art
In the photolithography process used for producing, for example, semiconductor elements, image pick-up elements (e.g., CCDs), liquid crystal display elements, thin-film magnetic heads or the like, an image of a mask pattern is transferred onto a photoresist coated on a substrate. The photolithography is carried out using an exposure apparatus. Currently available exposure apparatuses include a projection exposure apparatus (e.g., a stepper), which projects a pattern of a reticle serving as a mask through a projection optical system onto a photoresist-coated wafer or glass plate serving as a substrate, a proximity type exposure apparatus, which transfers a mask pattern directly onto the substrate, and the like.
A semiconductor element, for instance, is produced by forming a plurality of layered circuit patterns on a wafer in a predetermined overlapping relationship by using any one of the above-mentioned exposure apparatuses. Once a first circuit pattern has been formed on the wafer by exposure, the mask or reticle has to be aligned accurately with the circuit pattern already formed within each shot area on the wafer before the wafer is exposed with the second and subsequent circuit patterns. To this end, an alignment mark (wafer mark) is provided on the wafer as a position detecting mark in a step preceding the lithography step concerned. The exposure apparatus is provided with alignment sensors that detect the position of the alignment mark, and thereby the precise position of the circuit pattern within each shot area on the wafer.
The actual wafer alignment comprises two steps: rough alignment, in which two or three relatively large marks formed on the wafer are detected to determine the approximate positions of the shot areas, and fine alignment, which follows the rough alignment and accurately detects the positions of marks provided at approximately ten places on the wafer.
Alignment sensors used for the aforementioned rough alignment include, for example, a laser beam scanning sensor, which radiates a laser beam near an alignment mark, scans the mark with the beam, and detects the mark position based on variations in the intensity of the scattered and diffracted light, and an imaging sensor, which illuminates the vicinity of an alignment mark with monochromatic or broad-band light, images the mark through a detecting optical system, and detects the mark position based on the resulting image signal.
Typically, a wafer is positioned on a wafer holder with a positional error of 100 μm or so by positioning, for example, based on the outer shape or profile thereof. For rough alignment of the wafer, the position of the alignment mark should be detected with an accuracy of a few μm or less. Also, since a wafer may possibly have formed thereon some circuit patterns difficult to distinguish from the alignment mark, detection of the alignment mark position on the wafer should be done with a correct distinction from such circuit patterns.
To meet the above needs, it has been proposed as one approach to the rough alignment that an alignment mark having segments disposed on a wafer at predetermined intervals (periodicity) along the direction in which the mark position is detected be detected by any one of the above-mentioned sensors, and that the detection signals thus acquired be processed in a predetermined manner to detect the mark position. More particularly, the waveform of each detection signal is sliced at a predetermined slice level (threshold), and the midpoint between the two intersections of the waveform with the slice level is taken as the mark position. However, this conventional slicing method is disadvantageous in that, if only low-contrast signals can be obtained from the marks, for example, because the intervals between the marks are insufficient, the S/N ratio is too low for the position to be detected with high accuracy.
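The slicing method described above can be illustrated by the following minimal sketch. It is not taken from any particular apparatus: the function name `slice_midpoints`, the linear interpolation between samples, and the default sampling pitch of 0.1 μm are assumptions made only for illustration.

```python
def slice_midpoints(signal, slice_level, step_um=0.1):
    """Return, for each peak, the midpoint (in um) between the rising and
    falling intersections of the waveform with the slice level.

    Hypothetical helper: crossings are located by linear interpolation
    between neighbouring samples, which is an assumption of this sketch.
    """
    midpoints = []
    rising = None
    for i in range(1, len(signal)):
        prev, cur = signal[i - 1], signal[i]
        if prev < slice_level <= cur:
            # Waveform crosses the slice level going upward.
            rising = (i - 1) + (slice_level - prev) / (cur - prev)
        elif prev >= slice_level > cur and rising is not None:
            # Waveform crosses the slice level going downward.
            falling = (i - 1) + (prev - slice_level) / (prev - cur)
            midpoints.append(0.5 * (rising + falling) * step_um)
            rising = None
    return midpoints


if __name__ == "__main__":
    # One symmetric peak centred on sample 3.
    print(slice_midpoints([0, 0, 1, 2, 1, 0, 0], slice_level=0.5))
```

With a low-contrast signal, noise near the slice level shifts both intersections, which is why the midpoint obtained this way loses accuracy.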
To overcome the above drawbacks, it has recently been proposed to calculate a coefficient of correlation between a detection signal of marks having a certain periodicity, detected by a sensor of any of the above-mentioned types, and a reference signal (template) given the same periodicity as the detection signal, and to take as the mark position the position where the coefficient of correlation is a maximum. This will be called a "method of correlation" herein. Since the method of correlation allows the mark position to be detected with a good S/N ratio, even marks disposed at insufficient intervals can be detected with high accuracy.
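The method of correlation can be sketched as follows. This is an illustrative sketch only: the function names `correlation_coefficient` and `find_mark` are hypothetical, and the use of a normalized (Pearson) coefficient is an assumption, since the text does not specify how the coefficient of correlation is normalized.

```python
import math


def correlation_coefficient(window, template):
    """Normalized correlation between two equal-length sequences.

    Assumption of this sketch: a Pearson-type coefficient is used.
    """
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(
        sum((w - mean_w) ** 2 for w in window)
        * sum((t - mean_t) ** 2 for t in template)
    )
    # A flat window carries no information; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def find_mark(signal, template):
    """Slide the template over the signal one address at a time and return
    (address, coefficient) for the position of maximum correlation."""
    best_shift, best_r = 0, -2.0
    for shift in range(len(signal) - len(template) + 1):
        window = signal[shift:shift + len(template)]
        r = correlation_coefficient(window, template)
        if r > best_r:
            best_shift, best_r = shift, r
    return best_shift, best_r


if __name__ == "__main__":
    # Template with three peaks at unequal intervals, embedded at address 7.
    template = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
    signal = [0] * 7 + template + [0] * 5
    print(find_mark(signal, template))
```

Every shift requires one multiplication per template sample, so the cost grows with the product of the search range and the template length.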
The conventional method of correlation will be briefly described below with reference to FIGS. 4A to 4C.
FIG. 4A shows a rough alignment mark RM formed on a wafer and comprising three grid patterns M1, M2 and M3 disposed at intervals (periodicity) D1 and D2. FIG. 4B shows the waveform of a detection signal Sig.A' of the alignment mark RM, detected by the conventional laser beam scanning sensor and acquired into a memory. FIG. 4C shows an example of a template T_mc. Like the detection signal Sig.A' of the alignment mark RM, the template T_mc has peaks at three places spaced from each other by the numbers of addresses corresponding to the intervals D1 and D2.
Assume here, by way of example, that D1 = 20 μm, D2 = 30 μm and the horizontal resolution for acquisition of the signal Sig.A' is 0.1 μm. One address in the memory holding the signal Sig.A' is then equivalent to 0.1 μm in terms of a length on the wafer. The signal Sig.A' is acquired within a range Rsc as shown in FIG. 4B. This acquisition range Rsc is the sum of the intervals D1 and D2 plus the accuracy (100 μm or so) of the wafer positioning based on the profile of the wafer. Therefore, the signal Sig.A' includes approximately the following number of data:

(100 + 20 + 30) / 0.1 = 1,500
Also, the template T_mc covers a range which, in terms of a length on the wafer, is approximately the sum of the intervals D1 and D2 plus the spread of one peak (e.g., 10 μm or so). Therefore, the template T_mc includes approximately the following number of data:

(10 + 20 + 30) / 0.1 = 600
Thus, to calculate one coefficient of correlation between the signal Sig.A' and the template T_mc, it is necessary to perform 600 multiplications, corresponding to the number of data included in the template T_mc, and to sum the products thus obtained (this processing will be referred to as "product-sum calculation" hereafter). Further, because a coefficient of correlation has to be calculated for each positional relation between the signal and the template, the product-sum calculation has to be repeated 1,000 times, once for each address within a range of 100 μm in terms of a length on the wafer. Namely, a total of 600 × 1,000 = 600,000 multiply-and-add operations is required for the above purpose.
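The arithmetic behind these figures can be checked with the short sketch below. The function name `correlation_cost` and its parameter names are hypothetical; the search range is taken to equal the wafer positioning accuracy, following the example in the text.

```python
def correlation_cost(d1_um, d2_um, placement_error_um, peak_width_um,
                     resolution_um):
    """Return (signal length, template length, number of shifts,
    total multiply-and-add operations), all counted in addresses."""
    signal_len = round((placement_error_um + d1_um + d2_um) / resolution_um)
    template_len = round((peak_width_um + d1_um + d2_um) / resolution_um)
    # Assumption from the text: one shift per address over the
    # positioning error range.
    shifts = round(placement_error_um / resolution_um)
    return signal_len, template_len, shifts, template_len * shifts


if __name__ == "__main__":
    # D1 = 20 um, D2 = 30 um, 100 um positioning error, 10 um peak spread,
    # 0.1 um per address.
    print(correlation_cost(20, 30, 100, 10, 0.1))
    # prints (1500, 600, 1000, 600000)
```

Halving the resolution step doubles both the template length and the number of shifts, so the total cost grows with the square of the sampling density.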
As mentioned above, the conventional method of correlation requires a correlation to be calculated between the entire area (all data) of the signal of a detected mark and a template including roughly as many data as the mark portion of the detection signal. Therefore, a relatively long time and a high-performance, expensive computing unit are required to process the huge amount of data (about 600,000 multiply-and-add operations in the case shown in FIGS. 4A to 4C) handled by the conventional method of correlation.
Also, even a remarkably high-speed computer will take 0.1 sec or so to complete the above-mentioned 600,000 multiply-and-add operations, which reduces the throughput of an exposure apparatus, particularly one used for mass production of semiconductor devices.