In an exposure apparatus for manufacturing, e.g., semiconductor devices that are increasingly shrinking in their feature sizes, before a reticle pattern is projected onto a wafer by exposure, the wafer and reticle are aligned.
Alignment includes two techniques: pre-alignment and fine alignment. In pre-alignment, a feed shift amount generated when a wafer is loaded from a wafer conveyor apparatus onto a wafer chuck on a stage in a semiconductor exposure apparatus is detected, and the wafer is coarsely aligned within an accuracy with which subsequent fine alignment can be normally processed. In fine alignment, the position of the wafer placed on the wafer chuck on the stage is accurately measured, and the wafer and reticle are precisely aligned such that the alignment error between the wafer and reticle falls within the allowable range. The pre-alignment accuracy is, e.g., about 3 μm. The fine alignment accuracy is, e.g., 80 nm or less for a 64M DRAM, although it changes depending on the required wafer processing accuracy.
Pre-alignment requires detection in a very wide range because the wafer feed shift generated when the conveyor apparatus feeds a wafer onto the chuck is detected, as described above. The detection range is generally about 500 μm square. As a method of detecting the X- and Y-coordinates of one mark and performing pre-alignment, pattern matching is often used.
Pattern matching is roughly classified into two techniques. In one technique, a mark image is binarized, the binary image is matched with a predetermined template, and a position at which the binary image and template have the highest correlation is determined as a mark position. In the other technique, the correlation between a mark image that remains a grayscale image and a template having grayscale information is calculated. As the latter method, normalization correlation is often used.
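The two techniques described above can be sketched as follows. This is only a minimal illustration in Python with NumPy, not an implementation taken from any particular apparatus: the function names `binary_match` and `normalization_correlation`, the exhaustive window scan, and the pixel-agreement score used for the binary case are all assumptions made for the example.

```python
import numpy as np


def binary_match(image, template, threshold):
    """Binarize the image, then find where it best agrees with a binary template.

    The score is the fraction of pixels in the window that equal the
    template (an assumed, simple agreement measure).
    """
    binary = np.asarray(image) > threshold
    template = np.asarray(template, dtype=bool)
    th, tw = template.shape
    best_pos, best_score = (0, 0), -1.0
    for y in range(binary.shape[0] - th + 1):
        for x in range(binary.shape[1] - tw + 1):
            score = float(np.mean(binary[y:y + th, x:x + tw] == template))
            if score > best_score:
                best_pos, best_score = (y, x), score
    return best_pos, best_score


def normalization_correlation(image, template):
    """Grayscale matching: zero-mean normalized correlation in each window.

    The score lies in [-1, 1] and is unaffected by a uniform gain or
    offset applied to the window.
    """
    image = np.asarray(image, dtype=float)
    t = np.asarray(template, dtype=float)
    t = t - t.mean()
    t_norm = np.sqrt((t ** 2).sum())
    th, tw = t.shape
    best_pos, best_score = (0, 0), -1.0
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            w = image[y:y + th, x:x + tw]
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            # A flat window carries no pattern information; score it as 0.
            score = float((w * t).sum() / denom) if denom > 0 else 0.0
            if score > best_score:
                best_pos, best_score = (y, x), score
    return best_pos, best_score


if __name__ == "__main__":
    # Hypothetical 7x7 cross-shaped mark placed in a 20x20 image.
    cross = np.zeros((7, 7))
    cross[3, :] = 1
    cross[:, 3] = 1
    img = np.full((20, 20), 10.0)
    img[5:12, 8:15] += 190 * cross
    print(binary_match(img, cross, threshold=100))
    print(normalization_correlation(img, cross))
```

The binary variant depends on a threshold chosen in advance, whereas the grayscale variant normalizes each window and so needs no threshold, at the cost of more arithmetic per window.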
In pre-alignment, the mark to be used must be small although the detection range is very wide. This is because the mark is a pattern other than a semiconductor element, and it is preferably as small as possible so that the semiconductor element area can be made as large as possible. Hence, the mark is often laid out in a region that is not used as an element, e.g., on a scribing line. The mark size is therefore limited by the scribing line width.
The scribing line width is becoming narrower year by year as semiconductor manufacturing becomes more efficient and processing accuracy improves. Currently, the scribing line width is as small as 100 μm or less, and accordingly, the mark size is also 60 μm or less.
On the other hand, to manufacture a semiconductor device with high density, a wafer is processed through new processes.
A problem associated with pre-alignment mark detection will be described with reference to FIGS. 6A to 6H. FIG. 6A shows a layout in which a semiconductor element pattern lies adjacent to the outside of a cross-shaped mark 100, and in which a portion “win” that is long in the horizontal direction is a signal detection region. FIGS. 6B and 6D show detection signal waveforms, and FIGS. 6C and 6E show the wafer sectional structures corresponding to the signals shown in FIGS. 6B and 6D. FIG. 6F also shows the cross-shaped detection mark 100. FIG. 6G shows the detection signal waveform. FIG. 6H shows the wafer sectional structure corresponding to FIG. 6G.
FIG. 6E shows the sectional structure of the mark after an ultra low step process. FIG. 6H shows the sectional structure of the mark after a CMP process. In these examples, it is difficult to detect the pre-alignment mark.
In pre-alignment, generally, a mark once formed is continuously used for position detection even in the subsequent processes. However, as layers are deposited on the mark, it gradually becomes hard to observe the mark. In the sectional structure shown in FIG. 6E, since a mark having low reflectivity and small step difference is present in a material having high reflectivity and large step difference, the mark can hardly be detected. In addition, since various layers are deposited on the mark, the image obtained by reading the mark may have low contrast and much noise.
The examples shown in FIGS. 6A to 6H suggest that, along with the progress in techniques of manufacturing a semiconductor device with high density, processes have emerged that make it difficult to detect a pre-alignment mark present in a wide detection range by conventional pattern matching, and these processes present problems.
For example, as shown in FIG. 6E, when the mark has a small step difference while the peripheral pattern has a large step and high reflectivity, the image signal shown in FIG. 6D is obtained. The image signal shown in FIG. 6D is a signal in the region “win” shown in FIG. 6A, which is obtained by sensing the pre-alignment mark 100 irradiated by dark field illumination. The ordinate represents a video signal voltage, and the abscissa represents a coordinate. When the signal is binarized using a predetermined threshold value, the mark disappears because the signal level of the mark portion is low. For this reason, the mark cannot be recognized by template matching.
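The disappearance of the mark under a fixed threshold can be illustrated with a small numerical sketch. The signal below is synthetic: the voltage levels, the positions of the peripheral pattern and the mark, and the threshold values are hypothetical numbers chosen only to mimic the situation described above, and are not read from FIG. 6D.

```python
import numpy as np


def binarize(signal, threshold):
    """Return 1 where the signal reaches the threshold, else 0."""
    return (np.asarray(signal, dtype=float) >= threshold).astype(np.uint8)


# Synthetic one-dimensional video signal along the detection region
# (all values hypothetical).
signal = np.full(100, 0.1)   # background level
signal[10:20] = 2.0          # peripheral pattern: large step, high reflectivity
signal[80:90] = 2.0          # peripheral pattern on the other side
signal[45:55] = 0.4          # mark: small step, low signal level

# A threshold chosen in advance for strong signals removes the mark.
fixed = binarize(signal, threshold=1.0)
print("mark pixels kept at threshold 1.0:", int(fixed[45:55].sum()))

# A lower threshold would keep the mark, but the value is not known
# beforehand and changes from process to process.
low = binarize(signal, threshold=0.3)
print("mark pixels kept at threshold 0.3:", int(low[45:55].sum()))
```

With the higher threshold only the peripheral pattern survives binarization, so a template of the mark has nothing to match against.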
Even with normalization correlation, which is known as a detection method for a grayscale image, it is difficult to detect a mark in an image with small step difference and low contrast or in a noisy image. In particular, the detection rate of normalization correlation tends to be low when the influence of noise is large or when mark defects occur in the wafer process. Additionally, the process time is long because of the complex calculation method.
Various approximation calculations have also been examined to solve the above problem. However, the problem of a low detection rate for a low-contrast image remains unsolved.
Another well-known mark detection method is the vector correlation method (Video Information 1992/2). The vector correlation method can obtain a high detection rate even when the mark image has noise or the mark has a defect. In the vector correlation method, attribute information representing the feature of edge information is extracted together with the edge of the mark. With this correlation calculation method, the extracted feature is compared with a template to detect the mark position.
In the vector correlation method, a high-contrast mark and low-contrast mark cannot be detected using the same parameter in extracting the edge information of the marks. Hence, the edge extraction parameter needs to be tuned.
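The dependence on the edge extraction parameter can be sketched in one dimension. The example below is an assumption-laden illustration rather than the vector correlation method itself: the helper names `extract_edges` and `make_mark`, the use of a simple gradient threshold as the extraction parameter, the rising/falling sign as the edge attribute, and a ripple that scales with the signal amplitude are all hypothetical choices made for the example.

```python
import numpy as np


def extract_edges(signal, edge_threshold):
    """Return (position, direction) pairs for gradients at or above the threshold.

    direction is +1 for a rising edge and -1 for a falling edge; it stands
    in for the attribute information attached to each edge.
    """
    grad = np.diff(np.asarray(signal, dtype=float))
    idx = np.nonzero(np.abs(grad) >= edge_threshold)[0]
    return [(int(i), int(np.sign(grad[i]))) for i in idx]


def make_mark(amplitude, length=40, start=15, stop=25):
    """Synthetic mark profile with a ripple proportional to the amplitude."""
    n = np.arange(length)
    s = np.zeros(length)
    s[start:stop] = amplitude
    return s + 0.03 * amplitude * (-1.0) ** n


high = make_mark(1.0)    # high-contrast mark
low = make_mark(0.05)    # low-contrast mark

# A parameter suited to the high-contrast mark finds its two edges
# but finds nothing on the low-contrast mark.
print(extract_edges(high, 0.5))
print(extract_edges(low, 0.5))

# A parameter suited to the low-contrast mark finds its two edges
# but also reports the ripple of the high-contrast mark as edges.
print(extract_edges(low, 0.02))
print(len(extract_edges(high, 0.02)))
```

Under these assumptions no single threshold gives a clean two-edge result for both marks, which is one plausible way to picture why the parameter has to be tuned per mark.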