The present invention relates to a pattern inspection method, and a pattern inspection apparatus, for detecting a pattern defect, by comparing patterns which should be essentially the same, and judging non-matching parts as defects. More particularly, the present invention relates to an inspection method and an inspection apparatus for detecting a defect in a pattern having a line part, which is frequently found in a semiconductor wafer, a photomask, a liquid crystal display panel, and so forth, and in which a line extending in the longitudinal or transverse direction appears repetitively and at a fixed pitch.
A fixed pattern is repetitively formed on a semiconductor wafer, a semiconductor memory photomask, a liquid crystal display panel, and so forth. Therefore a pattern defect is detected currently by capturing the optical image of the pattern and comparing neighboring patterns. If no difference is found between the two patterns by comparison, the patterns are judged to be nondefective and if any difference is found, it is judged that a defect exists in one pattern. Since such an apparatus is generally called an appearance inspection apparatus, this term is used here. In the following description, a semiconductor wafer appearance inspection apparatus, which inspects the patterns formed on a semiconductor wafer for a defect, is described as an example. The present invention, however, is not limited to this case, but it is applicable to an appearance inspection apparatus for a semiconductor memory photomask, a liquid crystal display panel, and so forth, and moreover, applicable to any apparatus as long as it has a configuration in which patterns, which should be essentially the same, are compared for defect inspection.
The manufacture of a semiconductor device includes a great number of processes, and it is important to detect the occurrence of defects in the final and intermediate processes and to feed the result back to the manufacturing process in order to improve the yield. An appearance inspection apparatus is therefore widely used to detect defects. FIG. 1 is a diagram that shows the general configuration of a semiconductor wafer appearance inspection apparatus. As shown in FIG. 1, the semiconductor wafer appearance inspection apparatus comprises an image generation section 1 that generates an image signal of the surface of a semiconductor wafer, a defect candidate detection section 2 that detects a part that has the possibility of being a defect (defect candidate) by converting the image signal into digital data and comparing identical patterns, and an automatic defect classification (ADC) section 3 that analyzes and classifies defect candidates into two groups, that is, a group of fatal defects (killer defects) that affect the yield and a group of non-killer defects that can be ignored.
The image generation section 1 comprises a stage 18 that holds a semiconductor wafer 19, an optical system 11 that generates the surface image of the semiconductor wafer 19, and a control unit 20. The optical system 11 comprises a light source 12, illuminating lenses 13 and 14 that converge the illuminating light from the light source 12, a beam splitter 15 that reflects the illuminating light and transmits the image light, an objective lens 16 that irradiates the surface of the semiconductor wafer 19 with the illuminating light and at the same time projects the optical image of the surface of the semiconductor wafer 19, and an image pickup device 17 that converts the projected optical image of the surface of the semiconductor wafer 19 into an electrical image signal. As the image pickup device 17, a TV camera employing a two-dimensional CCD element or the like can be used, but in most cases a line sensor, such as a one-dimensional CCD, is used to obtain an image signal with high resolution, and images are captured by relatively moving (scanning) the semiconductor wafer 19 using the stage 18. Therefore, if optical images are captured by the line sensor while the semiconductor wafer 19 moves in the direction of the repetitive array of patterns, the image signal of the same part of the pattern is generated at fixed time intervals. As the configuration of the image generation section 1 is widely known, a detailed description is not given here.
The defect candidate detection section 2 comprises an analog-digital converter (A/D) 21 that converts an image signal output from the image pickup device 17 into multi-valued digital image data and a defect detection processing circuit 22 that processes the digital image data, compares the same parts of the pattern, and detects a defect candidate. The process in the defect candidate detection section 2 will be described later.
The ADC 3 analyzes the digital image data of each defect candidate reported from the defect candidate detection section 2 and classifies the defect candidate as a killer defect or a non-killer defect.
Next, the process in the defect detection processing circuit 22 is further described. As described above, a plurality of semiconductor chips (dies) are formed on the semiconductor wafer so as to be regularly arranged. The pattern of each die is identical because the same mask pattern is used. Therefore, the same pattern appears repetitively at a pitch of the die array as shown in FIG. 2A, and a comparison between the same parts of two neighboring dies is made. Such a comparison is called the die-die comparison. If there is no defect, the patterns coincide with each other, but there will be a difference in the comparison results if there exists a defect. Even if there exists a difference, however, it is not possible to determine which one of the two dies is defective by a one-time comparison. Then, the comparison is made twice with the dies on both sides of each die as shown in FIG. 2A, and the part is judged to be non-defective if no difference is found by the two-time comparison and the part is judged to be defective if a difference is found by each comparison of the two-time comparison. This judging method by the two-time comparison is called the double detection. The judging method by the one-time comparison is called the single detection. In either case, both judging methods for identifying a defect, in which a comparison is made between two neighboring patterns, are based on a premise that the occurrence frequency of defects is comparatively small and that there is little possibility of the existence of defects on the same part of patterns to be compared at the same time, and in fact the occurrence frequency of fatal defects in patterns formed on a semiconductor wafer is very low in the manufacturing process and, therefore, such a premise does not lead to any problem.
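The double detection rule described above can be illustrated with a minimal Python sketch. The function names, the four-pixel die images, and the threshold of 20 are hypothetical and chosen only for illustration; the sketch assumes that each die image is an already aligned list of gray levels.

```python
def diff_mask(a, b, threshold):
    """Single detection: flag pixels where |a - b| exceeds the threshold."""
    return [abs(x - y) > threshold for x, y in zip(a, b)]


def double_detection(dies, threshold):
    """For each die that has two neighbors, flag only the pixels that
    differ from BOTH the preceding and the following die."""
    result = {}
    for i in range(1, len(dies) - 1):
        left = diff_mask(dies[i], dies[i - 1], threshold)
        right = diff_mask(dies[i], dies[i + 1], threshold)
        result[i] = [l and r for l, r in zip(left, right)]
    return result


# Hypothetical data: four dies of four pixels each; die 1 has an
# anomalous gray level at pixel 2.
clean = [30, 200, 30, 200]
dies = [clean, [30, 200, 90, 200], clean, clean]

single = diff_mask(dies[1], dies[2], threshold=20)  # cannot tell which die is at fault
flags = double_detection(dies, threshold=20)
print(single)
print(flags)
```

The single comparison between dies 1 and 2 flags pixel 2 without indicating which die is defective, whereas the double detection attributes the difference to die 1 only, because die 2 matches die 3.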
As described above, the semiconductor wafer 19 is scanned by the optical system 11 that has a line sensor 17, and the image data corresponding to the scan width is generated sequentially as the scanning time elapses. Therefore, when double detection is carried out, the image of die A is delayed by one repetition period and compared with the image of die B sequentially, then similarly, the image of die B is delayed by one repetition period and compared with the image of die C sequentially, and thus the double detection process of die B is completed, as shown in FIG. 2A. Similarly, the double detection process is repeated for dies C, D, . . . , until the process is completed for all dies. Although only the single detection process is available for the first die A, it is still effective: because die B has already been inspected, a part of die A that does not match die B can be judged to be a defect. It is also possible to compare die A with another die. The image data of a die for which the two-time comparison has been made can be deleted sequentially, and if the image data of the next die is stored in the part of memory from which the previous data has been deleted, a memory capacity corresponding to the image data of one die is sufficient. In other words, the memory in this case functions as a delay memory that delays image data by one repetition period. It is also possible to provide a memory with a capacity large enough for the image data of all dies on a semiconductor wafer. In this case, an enormous memory capacity is required, but it is no longer necessary to generate the image data again by scanning the semiconductor wafer for the analysis of the defective part in the ADC 3.
A unit pattern called a cell is arranged repetitively at a fixed period in the memory cell array of a semiconductor memory. In such a part, a comparison between cells can be made, and such a comparison is called the cell-cell comparison. FIG. 2B shows the cell-cell comparison, in which the double detection process is sequentially carried out between two neighboring cells among cells P–S in the same manner as in the die-die comparison.
FIG. 3 is a block diagram that shows the configuration of the defect candidate detection section 2. As shown schematically, the defect candidate detection section 2 comprises the A/D 21 that sequentially outputs digitized multi-valued (gray level) image data 100, a one-period delay memory 23 that delays gray level data by an amount of time corresponding to one repetition period (die array pitch in the case of the die-die comparison, and cell array pitch in the case of the cell-cell comparison), a difference calculation section 24 that calculates difference data that is the difference between the gray level data that the A/D 21 outputs and the delayed gray level data that the one-period delay memory 23 outputs, that is, the difference between two pieces of the gray level data that correspond to neighboring parts, a judgment section 25 that judges the part to be a defect candidate when the difference data is found to be larger than a threshold value by comparison, and a threshold setting section 26 that sets a threshold value. With the above-mentioned configuration, the single detection can be carried out. The defect candidate detection section 2 further comprises a one-period delay memory 27 that delays the output of the judgment section 25 by one period and an AND processing section 28 that calculates a logical product (AND) of the output of the judgment section 25 and the output of the one-period delay memory 27, and the double detection is carried out by these parts.
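The data flow of FIG. 3 can be modeled, purely as an illustrative software sketch and not as the actual circuit, with two fixed-length queues standing in for the one-period delay memories 23 and 27. The function name `detect_stream`, the one-dimensional sample stream, the period of four samples, and the threshold of 20 are assumptions made for this example.

```python
from collections import deque


def detect_stream(samples, period, threshold):
    """Return indices of samples flagged by double detection.

    A True judgment at time t means sample t differs from sample t - period.
    When the judgment at t and the judgment at t - period are both True,
    the sample at t - period differs from both of its neighbors.
    """
    gray_delay = deque(maxlen=period)    # stands in for delay memory 23
    judge_delay = deque(maxlen=period)   # stands in for delay memory 27
    candidates = []
    for t, x in enumerate(samples):
        if len(gray_delay) == period:
            diff = abs(x - gray_delay[0])      # difference calculation section 24
            judged = diff > threshold          # judgment section 25
            if len(judge_delay) == period and judged and judge_delay[0]:
                candidates.append(t - period)  # AND processing section 28
            judge_delay.append(judged)
        gray_delay.append(x)
    return candidates


# Hypothetical stream: four repetitions of a four-sample pattern, with an
# anomalous gray level at index 6 (second repetition, third sample).
pattern = [30, 200, 30, 200]
samples = pattern + [30, 200, 90, 200] + pattern + pattern
found = detect_stream(samples, period=4, threshold=20)
print(found)
```

Each queue holds exactly one repetition period, which mirrors the point that the memory only needs to store the data of one period. In this sketch the first repetition never receives a double detection result, because it has no preceding neighbor.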
The double detection process must be performed in synchronization with generation of image data and a very high-speed processing ability is required. Although a circuit to perform such a process can be realized by a configuration in which a delay memory and a comparison circuit are combined, in most cases, it is realized by a configuration containing a pipeline processing data processor and a working memory because it is difficult to align positions for comparison and make the repetition period variable.
It is essential for a semiconductor wafer appearance inspection apparatus to be capable of detecting every part that includes a difference without fail. The apparatus is therefore designed to recognize, as a defect, a part at which the difference between the two images exceeds a fixed threshold value, as described above. Consequently, the number of parts judged to be defect candidates depends considerably on the specified threshold value. As described above, a part judged to be a defect candidate is reported to the ADC 3, and it is determined whether it is a fatal defect (killer defect) that affects the yield of the semiconductor device. A problem arises, however, in that the time required for analysis increases and the throughput is degraded as the number of defect candidates increases, because it is difficult to apply pipeline processing to this part and a considerable time is required for the analysis of each candidate. Therefore, it is preferable that the defect detection processing circuit 22 detects every killer defect as a defect candidate without fail, but detects as few non-killer defects as possible as defect candidates.
There is, however, another problem that it is difficult to meet the demand only with the setting of a threshold value because the part at which the difference between two images is large is not always a killer defect. In the metal process of a semiconductor device, for example, the killer defect that users want to detect is a defect such as a short between patterns and it is preferable if a non-killer defect such as a metal grain is separated from a defect candidate. Generally in most cases, however, the difference in gray level caused by a metal grain is by far larger than that caused by a short between patterns. Therefore, if a threshold value is set to a value so that a metal grain is not at all detected as a defect candidate, a short between patterns, which should be primarily detected, is hardly detected. As a result, a threshold value is set to a value so that a short between patterns is detected without fail, and both a short and a metal grain are detected temporarily as a defect candidate, and then the ADC 3 classifies them according to whether it is a killer defect or not.
Conventionally, the ADC 3 was not provided, and the classification was performed by a visual inspection of each defect candidate after it had been moved again to the stage of a microscope by using an appearance inspection apparatus or another apparatus called a review station. Therefore, an enormous amount of time was required for classification when many metal grains occurred. Recently, the trend is toward automation of classification by providing the ADC 3, but for the automatic classification to be realized, the image of the die that includes the detected defect candidate and at least one of the die images used in the comparison are needed, and it is necessary to obtain these images again, send them to the ADC 3, and detect the defect candidate again before the defect classifying process is performed. If a memory with a capacity large enough to store the image data of all dies on one semiconductor wafer is provided, as described above, it is no longer necessary to obtain image data of the semiconductor wafer again for analysis of defective parts in the ADC 3, but the memory capacity required for the image data of all the dies is tremendously large, resulting in a considerable increase in cost.
As described above, the double detection circuit performs the comparison process at a high speed such as, for example, 1 Gpixel/second, in order to realize a high throughput and, therefore, this part forms a considerable proportion of the cost of the appearance inspection apparatus. In other words, the cost of the apparatus and its throughput are in a trade-off relationship, and the processing performance of the double detection circuit (a pipeline processing data processor and a memory) has been specified with various factors taken into consideration. As a result, for example, an upper limit is set on the number of defect candidates that can be reported per unit processed image, and when the number of detected defect candidates exceeds the upper limit, it is reported that a large defect exists in the unit processed image. As described above, there is a problem that a tremendous cost and processing time are required in order to send two pieces of image data relating to each of the detected defect candidates to the automatic defect classification (ADC) section and to detect the defect candidates again for classification in the ADC.
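The upper limit on reported defect candidates can be sketched as follows. This is only an illustration: the function name `report_candidates`, the dictionary layout of the report, and the limit of five are hypothetical, since the text does not specify how the report is structured.

```python
def report_candidates(candidates, upper_limit):
    """Build a report for one unit processed image.

    If more candidates are detected than the upper limit allows, the
    individual candidates are not reported; instead the image is
    reported as containing a large defect.
    """
    if len(candidates) > upper_limit:
        return {"large_defect": True, "candidates": []}
    return {"large_defect": False, "candidates": list(candidates)}


# Hypothetical candidate coordinates (x, y) within one unit processed image.
few = [(12, 40), (87, 3)]
many = [(i, i) for i in range(10)]

print(report_candidates(few, upper_limit=5))
print(report_candidates(many, upper_limit=5))
```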
An example of the defect detection process is described using a case where the image of a semiconductor memory metal wire layer is captured by a bright field microscope. The metal wire layer, which corresponds to the memory array of a semiconductor memory, has a pattern in which a line extending in one direction appears repetitively at a fixed period. Such a part is called a line part here. FIG. 4A is an example of a pattern of the line part of a semiconductor memory metal layer, and FIG. 4B shows a gray level image of the line part captured by a bright field microscope.
As shown in FIG. 4A, metal wires 51 and spaces 52 therebetween are arranged at fixed intervals. Reference number 53 refers to a pattern short, which is a killer defect, reference number 54 refers to a large metal grain, which is a non-killer defect, and reference number 55 refers to a small metal grain, which is also a non-killer defect. As shown in FIG. 4B, the reflectance of the part of the metal wire 51 is high and the gray level is as high as 200, and that of the space 52 is low and the gray level is as low as 30. As the part at which a defect of a pattern short exists always corresponds to the space part, the gray level of the pattern short part is higher than that of the normal reference part with which a comparison is made. In concrete terms, while the gray level of the space part is 30, which is the reference part, that of the pattern short part is 60.
Generally, a metal grain has a sectional structure as shown in FIG. 5, and illuminating light that enters through an objective lens is scattered by the grain, so that the gray level becomes low. In other words, the gray level of the grain part is lower than that of the normal metal wire part without a grain, which is the reference part. In concrete terms, while the gray level of the metal wire part, which is the reference part, is 200, that of the large grain part 54 is 60 and that of the small grain part 55 is 150.
In this example, that is, in the conventional judging method in which the absolute value of a difference between images is compared with a single threshold value, it is not possible to detect a pattern short between metal wires and at the same time prevent a grain on the metal wire from being detected. This is because the difference in gray level between the pattern short part and the normal space part is only 30, whereas that between the small grain part and the normal metal wire part is 50 and that between the large grain part and the normal metal wire part is 140. Any threshold value low enough to detect the pattern short therefore also detects both grains. In other words, the setting of a threshold value is a serious problem.
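The arithmetic behind this threshold problem can be checked with a short Python sketch that uses only the gray levels quoted above for FIG. 4B. The dictionary and function names are illustrative and are not part of the described apparatus.

```python
# Gray levels quoted in the text; the names are illustrative only.
REFERENCE = {"space": 30, "wire": 200}
PARTS = {
    "pattern short": ("space", 60),   # killer defect
    "large grain": ("wire", 60),      # non-killer defect
    "small grain": ("wire", 150),     # non-killer defect
}


def abs_difference(name):
    """Absolute gray level difference between a part and its reference."""
    ref, level = PARTS[name]
    return abs(level - REFERENCE[ref])


def detected(threshold):
    """Parts whose absolute difference exceeds a single threshold."""
    return {name for name in PARTS if abs_difference(name) > threshold}


diffs = {name: abs_difference(name) for name in PARTS}

# For every 8-bit threshold that detects the short, check whether both
# grains are detected as well.
dilemma = all(
    {"large grain", "small grain"} <= detected(t)
    for t in range(256)
    if "pattern short" in detected(t)
)
print(diffs, dilemma)
```

Because the killer defect produces the smallest absolute difference of the three, no single threshold applied to the absolute difference can separate it from the grains.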
Therefore, various methods for determining a threshold value have been proposed. For example, in Japanese Unexamined Patent Publication (Kokai) No. 4-107946, a method has been disclosed in which differences in gray level are calculated at multiple parts of a pattern and a threshold value is determined based on the statistics of them. In Japanese Unexamined Patent Publication (Kokai) No. 5-47886, a method has been disclosed in which an approximation of a curve is calculated from the relationship between the gray level difference and the frequency, and a gray level difference at which the value of the approximation of the curve becomes zero is taken as an optimum threshold value. In Japanese Unexamined Patent Publication (Kokai) No. 2002-22421, a method has been disclosed in which an error probability conversion is carried out based on the standard deviation.
Moreover, in Japanese Unexamined Patent Publication (Kokai) No. 2000-171404, a method and an apparatus for inspecting a pattern have been disclosed, in which, by using an elongated filter, an average gray level and a range in which the gray level changes within the filter are calculated, thereby a direction in which the line pattern extends is detected and at the same time each pixel is classified into a group, and an optimum threshold value is set.
As described above, the metal wire layer of a memory cell array of a semiconductor memory element has a line part, and because the metal wire line appears repetitively at a very fine pitch in this line part, defects are more likely to occur there, as are grains. Therefore, a pattern inspection method and a pattern inspection apparatus, having a simplified configuration and a high processing speed, and capable of detecting a killer defect in a line part in a highly sensitive manner without detecting a non-killer defect, are desired, even though the target range is limited only to the line part.