1. Technical Field
The disclosure relates generally to integrated circuit (IC) fabrication, and more particularly to rendering a mask as a coarse mask representation.
2. Background Art
Considerable efforts have been made in the field of photolithography to develop and implement methods of mask compensation and verification. For example, one approach implements optical proximity corrections (OPC) based on pixel-based simulations (also known as dense-image simulations). A key aspect of this simulation approach is that the images of the projected mask patterns are calculated on a regular grid, rather than at a sparse set of isolated adjustment (or characterization) points. The regularity of pixel-based imaging enables fast Fourier transform (FFT) based computations (e.g., convolutions). FFT-based convolutions are relatively rapid compared to the classic convolution algorithms used in the earlier sparsely sampled mode of OPC, assuming that the earlier mode requires calculating the partially coherent images on a densely sampled basis. Fast convolutions typically involve both FFTs and inverse FFTs. Since an FFT and an inverse FFT are calculated in almost exactly the same way, the term “FFT” may be used herein to refer to both, unless specifically indicated otherwise. An FFT allows only modest reductions in execution time when the output is not needed on a fully dense grid, e.g., via “pruned FFTs.” This limitation applies both to reducing the density of the output grid and to situations where the density of the input grid needs to be increased.
Nevertheless, pixel-based imaging is still preferable for various reasons. For example, calculation of images on a dense grid enables more robust resist models in simulations, since the full exposure distribution in the neighborhood of each developed edge is available to the models as a complete and physically realistic input. Also, the regularity of pixel-based imaging is better suited to parallel computation than sparse image sampling. This regularity provides predictability during an algorithm flow that a special-purpose computer infrastructure can optionally exploit to improve the efficiency of CPU utilization, e.g., by reducing memory latency. Another advantage of pixel-based image simulation is that it can take advantage of the bandlimited character of the image. Once the image has been computed on a grid of points with a step-size finer than the Nyquist spacing (e.g., λ/[4*NA]), the intensity at an off-grid point can then be calculated by sinc interpolation. Here λ is the wavelength used to form a lithographic image of the chip layer, and NA is the numerical aperture of the projection lens used to form the image. The relative coarseness of the Nyquist grid reduces the size of the FFTs that must be used to carry out the multiple Sum-Of-Coherent-Systems (SOCS) convolutions that are needed within each given mask block, making the FFTs for these repeated convolutions relatively fast. In the SOCS method, the mask is convolved with each of multiple SOCS kernels, typically about 10 in number, and the squared magnitudes of these convolutions are summed to obtain a close approximation of a partially coherent image. The term “Nyquist spacing” refers to the well-known property of continuous bandlimited distributions that they can be fully specified by samples that are taken at a spacing of ½ the reciprocal of the bandlimit. In the context of image formation, an amplitude bandlimit is determined by the NA of the lens, and is more specifically NA/λ.
The intensity bandlimit is twice the amplitude bandlimit, i.e., 2NA/λ. The Nyquist spacing as thus defined is ½ the reciprocal of the intensity bandlimit, i.e., λ/(4NA). In pixel-based simulations, a mask is processed in sections, and each section is referred to as a “mask block.”
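By way of a non-limiting illustration, the relationship between the amplitude bandlimit, the intensity bandlimit, and the Nyquist spacing can be sketched as follows. The function name and the numerical exposure settings (193 nm wavelength, NA of 1.35) are assumptions made for this example and are not taken from the description above.

```python
def nyquist_step(wavelength: float, na: float) -> float:
    """Nyquist sampling step for the image intensity, in the units of `wavelength`."""
    amplitude_bandlimit = na / wavelength            # cutoff of the field amplitude
    intensity_bandlimit = 2.0 * amplitude_bandlimit  # intensity is |amplitude|^2
    return 0.5 / intensity_bandlimit                 # equals wavelength / (4 * NA)


# Hypothetical exposure settings, chosen only for illustration.
step_nm = nyquist_step(wavelength=193.0, na=1.35)
print(f"Nyquist step: {step_nm:.2f} nm")
```

Any coarse grid whose step is no larger than this value satisfies the sampling condition described above.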
In this description, the coarse initial grid on which intensity is calculated may be referred to as a Nyquist grid. However, it is appreciated that the step-size of a coarse initial grid may actually be chosen slightly smaller than the Nyquist limit for reasons of numerical stability, for example, to filter out weak residual content from unused SOCS kernels that may be present between the bandlimit of individual SOCS kernels and the bandlimit of the exact image. It should also be noted that while a coarse grid close to the Nyquist limit is very dense by the standards of conventional OPC, it is still very coarse compared to the precision required in a mask and a printed image.
The sizes of the FFTs used in pixel-based imaging (i.e., the size of the mask block) are somewhat flexible. One factor that should be considered in making an efficient choice is the extended-range resolution of the projection lens, where “extended range” refers to the ability of a feature to have a weak influence on the image of another feature in the same region of the chip even when the two features are spaced sufficiently far apart to be separately resolved. Since two-dimensional (2D) FFT algorithms scale with area slightly worse than linearly (by a log factor), one reasonable configuration for pixel-based imaging carries out the SOCS imaging convolutions on blocks whose sizes are chosen to be only moderately larger than the buffer zones that must be included to accommodate the periodic boundary conditions. Periodic boundary conditions are an automatic consequence of using FFTs, and unless the mask features are truly periodic on the scale of the FFT, there will be distortions in the images of features whose distance from the period border is less than the extended range of the lens resolution. Such features should preferably be considered to reside in a buffer zone whose image is not acceptably accurate. If the buffer zone is therefore subtracted from all sides of the image of one of these blocks (this buffer zone being chosen to be large enough to adequately encompass the tails of the lens impulse response), the remaining central “tile” of the block (chosen to be of similar dimension to the buffer zone in this configuration) represents the unit into which the patterns are divided. The size of the buffer zone is then chosen to be a moderately large multiple, such as 10 or 20, of the conventional lens resolution (i.e., a multiple of the minimum resolvable feature separation), whereas the Nyquist step size will be a moderately small fraction of the minimum resolvable feature separation (no larger than 0.25λ/NA).
For efficiency, the size of the central tile of the block that remains after trimming the buffer zone should not be dramatically larger than the buffer zone; otherwise, the nonlinear scaling of FFT calculation time will degrade efficiency. For these reasons, the Nyquist limit and the extended-range lens resolution together imply that the FFT size should preferably be fairly large, for example in the range of 128² to 1024². Because the log-like departure in the scaling of FFT algorithms from pure area-proportionality is fairly weak, the exact choice is not critical from this point of view.
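The two competing effects described above, namely the area lost to the buffer zone and the log factor in FFT cost, can be illustrated with a simple cost model. The following is a minimal sketch only: the N·log₂N operation count is an idealization that ignores memory and cache behavior, and the 40-pixel buffer width and the function name are assumptions made for this example.

```python
import math


def block_efficiency(fft_size, buffer_px):
    """Return (usable area fraction, modeled FFT cost per usable pixel).

    fft_size  : side length of the square FFT block, in Nyquist pixels
    buffer_px : width of the buffer zone trimmed from each side, in pixels
    """
    tile = fft_size - 2 * buffer_px
    if tile <= 0:
        raise ValueError("buffer zone consumes the whole block")
    usable_fraction = (tile / fft_size) ** 2
    points = fft_size ** 2
    fft_cost = points * math.log2(points)  # idealized 2D FFT operation count
    return usable_fraction, fft_cost / tile ** 2


# Hypothetical buffer width of 40 Nyquist pixels.
results = {n: block_efficiency(n, 40) for n in (128, 256, 512, 1024)}
for n, (fraction, cost) in results.items():
    print(f"{n:5d}: usable fraction {fraction:.2f}, cost per usable pixel {cost:.1f}")
```

Under these assumptions, the usable fraction grows with block size while the modeled cost per usable pixel changes only slowly among the larger blocks, which is consistent with the observation that the exact choice is not critical.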
Computer hardware architecture can also influence the choice of FFT size. Chips or other hardware that are specialized for FFT computations may be particularly efficient in handling FFTs of a particular size, such as 1024×1024 (1K²), or may be efficient at handling FFTs in a particular range of sizes.
Two-dimensional FFTs operate on a discrete array of numbers as the input. When an FFT is taken for a portion of a lithographic mask (to use in a pixel-based image simulation), the input array is essentially a bitmap representation of the amplitude transmission of that portion of the lithographic mask.
Ordinarily, each feature on the physical mask is polygonal with right-angle corners (so-called Manhattan polygons). The rare polygons that are nominally non-Manhattan are often fabricated as Manhattan by the mask-writing tools, and in any case non-Manhattan polygons are sufficiently rare in integrated circuits that a method which significantly speeds the lithographic simulation of pure-Manhattan polygons will significantly speed the lithographic simulation of entire chip levels. In the simplest representation of the mask design, each polygon has a specified uniform amplitude transmission, such as 100%. Here amplitude transmission (‘transmission’ for short) refers to the relative amplitude of the electric field transmitted through a particular point on the exit face of the mask.
The transmission is most often a real number, but may also be a complex number, in cases where the phase of the mask transmission is different from 0° or 180°. However, for purposes of discussion, the mask transmission can always be considered to be 100% inside the polygons, and 0% outside, since other transmission cases can be converted to this form by scaling. Note that an FFT uses discrete arrays as inputs, whereas each pixel in the mask bitmap can be considered as representing a small but finite square region of the mask.
The output of an FFT is also a discrete array of numbers. As noted above, the assumption of discrete Fourier orders in the pupil amounts to an assumption that the mask is periodic. The latter assumption can safely be made even with non-periodic masks, e.g., by employing a buffer zone around the actual working region where the OPC or the verification is implemented.
The use of a discrete array of numbers as an input to the FFT amounts to an assumption of discrete impulse apertures on the mask, whereas the true polygonal mask openings are continuous. In effect, the FFT therefore treats each mask polygon as a sampled array of dots (delta-functions) that fill the polygon's footprint. This bed-of-nails representation of polygons would not be particularly accurate (at reasonable sampling densities) if used in isolation. However, if such a delta-function array is convolved with a small square pixel of a size that matches the delta-function spacing, the original continuous polygon can be accurately recovered. More precisely, the original continuous polygons will be recovered if the polygon edges are laid out on the same grid as the dot array (with the polygon grid offset by half a step in each axis). In the pupil domain, such a convolution is equivalent to multiplying the discrete FFT of the delta-function array by a sine cardinal (sinc) filter. As such, through a simple filtering step, the transform of the continuous mask can be accurately calculated with the (relative) efficiency of discrete FFTs. Note that this sinc filter may be folded into the optical SOCS kernels, either as an explicit step, or inherently when these SOCS kernels are calculated. However, this method works only if the size of each Manhattan mask polygon across each of its horizontal and vertical dimensions is an integer multiple of the FFT grid step size, which, as discussed further herein, will usually not be the case.
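The equivalence described above can be checked numerically in one dimension. The following is a minimal sketch, assuming a periodic mask of 32 pixels with a single opening whose edges lie on pixel boundaries; a direct summation is used in place of an FFT for clarity, and all function names are hypothetical.

```python
import cmath
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def order_from_dots(pixels, k):
    """Fourier order k from the dot array: discrete transform times sinc filter.

    pixels[j] is the transmission of the dot at the center of pixel j, i.e.,
    at x = j + 0.5 in units of the grid step; the period is len(pixels).
    """
    n = len(pixels)
    dft = sum(t * cmath.exp(-2j * math.pi * k * (j + 0.5) / n)
              for j, t in enumerate(pixels))
    return sinc(k / n) * dft / n


def order_exact(a, b, n, k):
    """Fourier order k of a continuous unit-transmission opening on [a, b]."""
    if k == 0:
        return (b - a) / n
    w = -2j * math.pi * k / n
    return (cmath.exp(w * b) - cmath.exp(w * a)) / (w * n)


n = 32
pixels = [1.0 if 10 <= j < 18 else 0.0 for j in range(n)]  # opening on [10, 18]
worst = max(abs(order_from_dots(pixels, k) - order_exact(10, 18, n, k))
            for k in range(-8, 9))
print(f"largest mismatch over orders -8..8: {worst:.2e}")
```

In this sketch the dots sit at pixel centers while the opening's edges sit on pixel boundaries, which corresponds to the half-step offset between the polygon grid and the dot array noted above.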
Another aspect of the mask behavior that must be taken into account in state-of-the-art lithographic simulations is the three-dimensional character of the physical mask structure. For example, mask polygons are typically fabricated as polygon-shaped openings in a film of, for example, opaque chrome that has been deposited on a transparent substrate. The film is typically thin, for example only a few tens of nanometers in thickness, but such a thickness can no longer be considered infinitesimal when the widths of the polygonal openings in the film are as small as, for example, 200 nm. Note that the image of the polygonal features is usually demagnified by a factor of, e.g., 4, when the illuminated mask pattern is projected onto the semiconductor wafer. The imaging theory that underlies conventional algorithms for lithographic simulation was originally developed for masks that could be approximated as ideal transmission screens (“thin masks”). Representation of a lithographic mask by a transmission bitmap would ordinarily reflect this approximation, in that 3D mask topography is neglected when a 2D bitmap is used. The impact of finite mask topography on the lithographic image is referred to as electromagnetic field (“EMF”) effects.
It is no longer entirely adequate to neglect mask topography in state-of-the-art lithography, where EMF effects represent a perturbation on ideal thin-mask behavior that is small but often non-negligible. Deviations of the transmitted field from the ideal thin-mask behavior are largely localized to the edges of features (assuming standard 4× magnified masks). It is generally adequate for purposes of OPC to represent this perturbation as a simple adjustment to the thin-mask behavior that is constant (for a particular illuminating polarization). In other words, one may apply a certain adjustment to the thin-mask patterns in order to approximate the EMF effect. This adjustment need not vary along the edge of an aperture, nor between different aperture edges, as long as the edges represent the same kind of discontinuities in the film stack topography, and they share a common orientation relative to the illumination polarization.
Most often this EMF adjustment is simply a fixed bias, i.e., EMF effects are treated for simulation purposes as a small “outward” offset in the effective positions of edges (where “outward” might refer to a uniform expansion of opaque features). For more accurate calculations, Tirapu-Azpiroz has shown that EMF edge perturbations can be approximated as small strip-like features of essentially fixed transmission (generally a complex transmission) that are assumed for simulation purposes to lie along the aperture boundaries (i.e., along the polygon edges). These boundary layers are artificial polygons of very narrow width that are added to the set of circuit design polygons in order to imitate EMF effects. Since these perturbing strips (known as boundary layers) are considerably narrower than the lens resolution, their width can (in first approximation) be modestly re-adjusted as long as a compensating adjustment is made in the transmission, holding the width-transmission product effectively constant. This must be qualified as “effectively” constant because the width-transmission product is required to include the thin-mask transmission that would otherwise have been present in the strip of mask area that the boundary layer nominally displaces. Typically the width of the boundary layer is only of the order of 10 nanometers or less (in 1× units) when its transmission is assigned a magnitude value close to 1.
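The width-transmission trade-off described above can be sketched as follows. This is one possible reading of the preceding paragraph, not a definitive formulation: it assumes that the quantity held constant is the strip width multiplied by the difference between the boundary-layer transmission and the thin-mask transmission that the strip displaces. The function name and all numerical values are hypothetical.

```python
def rescale_boundary_layer(width, transmission, new_width, displaced_transmission):
    """Return the transmission for a boundary layer redrawn at `new_width`.

    Assumption: the net perturbation
        width * (transmission - displaced_transmission)
    is held constant, where `displaced_transmission` is the thin-mask
    transmission that would otherwise occupy the strip.
    """
    if new_width <= 0:
        raise ValueError("new_width must be positive")
    net = width * (transmission - displaced_transmission)
    return displaced_transmission + net / new_width


# Hypothetical example: a 10 nm strip with complex transmission 0.8+0.3j that
# displaces nominally clear mask area (transmission 1), redrawn 14 nm wide.
t_new = rescale_boundary_layer(10.0, 0.8 + 0.3j, 14.0, 1.0)
print(t_new)
```

Because the strip is much narrower than the lens resolution, such a rescaling is expected to hold only to first approximation, as noted above.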
Thus, current methods can represent the effect of 3D mask topography, but they do so with either a non-negligible level of approximation (bias model), or with a substantial increase in the number of features (boundary layer model), with the added features having widths that are typically smaller than 0.1λ/NA. The boundary layers are positioned precisely along the edges of the polygonal features, which may also be given a small bias (e.g., biased outwards slightly) as part of the process of accounting for EMF effects.
Unfortunately, the high density of the design gridding used in almost all IC masks poses a computational difficulty for pixel-based simulation approaches. Circuit and OPC requirements typically dictate that the edges of mask features be defined on a very fine grid, e.g., of order 0.01 times the width of the smallest feature (as a rough rule of thumb). The minimum feature size is typically about 0.35λ/NA, which makes the polygon design grid much finer than the Nyquist grid (coarse grid). A fine grid is also required to accommodate the narrow, precisely positioned boundary layers that may be included in the design to account for EMF effects.
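The disparity between the two grids can be estimated directly from the rule-of-thumb figures given above. The following short calculation is a sketch only; the constant names are chosen for this example, and the 0.25λ/NA Nyquist step is the upper limit stated earlier.

```python
# All lengths are expressed in units of wavelength / NA.
MIN_FEATURE = 0.35                 # typical minimum feature size
DESIGN_GRID = 0.01 * MIN_FEATURE   # rule-of-thumb design grid step
NYQUIST_GRID = 0.25                # largest permitted Nyquist step

linear_ratio = NYQUIST_GRID / DESIGN_GRID
area_ratio = linear_ratio ** 2
print(f"design grid is about {linear_ratio:.0f}x finer along each axis")
print(f"about {area_ratio:.0f}x more samples per unit area")
```

With these figures the design grid is several tens of times finer than the Nyquist grid along each axis, and thousands of times denser per unit area.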
It is the precision at which edges are positioned on the mask that poses a significant difficulty for pixel-based imaging. It is completely impractical to calculate the SOCS image on a grid as fine as the design grid, which is of the order of 50 times finer than the Nyquist grid along each axis in a typical case. The mask must therefore be rendered on a coarser pixel grid, so that polygon edges generally do not coincide with pixel boundaries and instead overhang them. As a result, the calculated images are inaccurate. The resist and process models used in OPC to calculate the positions of printed feature edges often use as an input the calculated optical image in an extended neighborhood around the printed edge. Error propagation then causes an increased error in the predicted position of the printed edge. Even when this error is within the overall tolerance level allowed for prediction errors, it can still be of concern in OPC. One reason is that different instances of a particular feature will usually be placed differently against the fixed pixel grid. This will cause the overhang conditions at a polygonal feature's edges to vary from instance to instance, resulting in variations in rendering errors. As a consequence, invalid variations may exist in the OPCs assigned to different occurrences of the feature. Often the tolerance on the variations in the printing of a particular feature is more stringent than the tolerance on the mean OPC for the feature. As a result, it is important that OPC not be sensitive to the exact positioning of a mask feature against any grid (other than the design grid).
Since a calculation on the very fine design grid is completely impractical, present approaches instead attempt to make the rendered mask resemble the true mask as closely as possible. It is known in the art that one may reasonably go so far as to subdivide the Nyquist grid by a more modest factor, such as 4, even when the 50× subdivision that would fully represent the design grid is not practical. By making the sampling 4× finer than the Nyquist grid, the rendered mask is made to more closely resemble the true mask. This reduces the error that results from ignoring structure on the design-grid level, though it does not eliminate the error. In combination with the known technique of transmission-matched subpixels, this conventional approach achieves an accuracy that is considered marginally acceptable, in the absence of better alternatives. Likewise, while subdivision of the Nyquist grid by 4 times in each axis will significantly increase processing time, such an increase is usually deemed acceptable, in the absence of better alternatives. The process of subdividing the Nyquist grid is known as “subsampling”; the subdivided grid will be referred to herein as the “sub-Nyquist grid”; and the individual elements of the sub-Nyquist grid will be referred to herein as “subpixels.”
The technique of transmission-matched subpixels is also used in the conventional approaches to make the rendered mask resemble the true mask down to the finest scale available (i.e., at a subpixel scale). Specifically, the rendered mask is made to resemble the true mask as closely as possible at the subpixel level by assigning a gray-level transmission to subpixels whose area is partly covered by polygons. This gray-level transmission (also referred to as a gray-level density) is equal to the fraction of the subpixel area that is covered by polygons. The overall transmission of the subpixel is matched to that of the true mask, but this transmission-matching (also referred to as density-matching) cannot be achieved at a finer than subpixel scale. Transmission-matching has the effect of preserving the area of a polygon, in a weighted sense, when the polygon is converted to bitmap subpixels in a transmission map. Here weighting means that when a subpixel is given a gray-level transmission to represent its partial coverage by the polygon, the weighted area of the subpixel is considered to be reduced in proportion to the gray-level transmission value. However, if this weighting is neglected, the area of all subpixels that are at least partly covered by the polygon will be slightly larger than the polygon's true area, an indication that the representation is slightly distorted.
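Transmission-matching as described above can be illustrated for a single axis-aligned rectangle. The following is a minimal sketch, assuming coordinates expressed in units of the subpixel size and unit transmission inside the rectangle; the function names and the example rectangle are hypothetical.

```python
def overlap(lo, hi, cell):
    """Length of the overlap between [lo, hi] and the unit cell [cell, cell + 1]."""
    return max(0.0, min(hi, cell + 1) - max(lo, cell))


def render_rectangle(x0, x1, y0, y1, nx, ny):
    """Gray-level map: each subpixel receives the fraction of its area covered."""
    return [[overlap(x0, x1, i) * overlap(y0, y1, j) for i in range(nx)]
            for j in range(ny)]


# Hypothetical rectangle whose edges do not all fall on subpixel boundaries.
gray = render_rectangle(1.3, 4.6, 2.0, 3.5, nx=8, ny=6)
true_area = (4.6 - 1.3) * (3.5 - 2.0)
weighted_area = sum(sum(row) for row in gray)
touched_area = sum(1 for row in gray for g in row if g > 0.0)
print(true_area, weighted_area, touched_area)
```

The weighted area of the rendering matches the true area of the rectangle, whereas the unweighted count of touched subpixels exceeds it, as stated above.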
A difficulty of this conventional method is that the region where the rendered mask departs from the true mask is sharply bounded by the subpixel boundaries. This sharp bounding of each gray-level region results in the generation of spurious high frequencies. These high frequencies turn out to cause errors even though they are mostly outside the collection band of the projection lens. Errors arise because the Fourier transform of a mask rendering that has been gridded at the sub-Nyquist sampling rate is mathematically equivalent to a convolution of the true diffracted spectrum with a broadly spaced comb function, thereby superimposing periodic repeats of the spatial frequency spectrum (diffraction pattern). This spectrum will not be bandlimited. The hard edges of polygonal mask features imply an infinite bandwidth, as do the sharp boundaries between adjacent subpixels, which means that neither the spectrum of the sampling grid nor the spectrum of the true mask will be bandlimited. High frequencies in the sampling grid can interfere with (i.e., couple to) high frequencies in the diffraction pattern of the true mask, which produces low frequency artifacts (errors) that are collected by the lens. Gridding at the subsampling/sub-Nyquist rate rather than at the Nyquist rate serves to more fully separate the overlapping virtual repeats of the diffracted spectrum, and therefore to reduce these artifacts. However, gridding at the subsampling/sub-Nyquist rate does not eliminate these artifacts since the very high subsampling needed to fully match the high frequency design grid is impractical. As noted, a subsampling factor of 4× (in each axis) is typical in the conventional approaches. Note that while 4× subsampling achieves an accuracy that is only marginally acceptable even in combination with transmission-matching of subpixels, it nonetheless entails a significant penalty in computation time.
For example, if the inverse FFTs used to compute each SOCS convolution are 1K×1K, the FFT used to calculate the collected mask spatial frequencies will be 4K×4K after 4× subsampling. This will require about 20× longer calculation time with standard FFT algorithms than each 1K×1K inverse FFT for SOCS.
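The factor of about 20 quoted above follows from the usual N·log N operation count for FFTs. The following is a brief sketch under that idealized cost model, which ignores memory effects; the function name is hypothetical.

```python
import math


def fft2_cost(n):
    """Modeled cost of an n x n FFT: proportional to N * log2(N), with N = n * n."""
    points = n * n
    return points * math.log2(points)


ratio = fft2_cost(4096) / fft2_cost(1024)
print(f"4K x 4K versus 1K x 1K cost ratio: {ratio:.1f}")
```

The ratio is 16 (from the area) multiplied by 24/20 (from the log factor), i.e., 19.2 under this model.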
As such, high frequency suppression (used for treating interfering high frequency) will be incomplete because the density-matching takes place within a spatially bounded subpixel. In addition, complete suppression requires that the rendition extends across the full mask block under consideration (e.g., of size 1024² before subsampling), or at least across a block of subpixels that is significantly larger than the lens resolution, which would entail significant computational overhead, since many neighboring subpixels would need to be adjusted to render the overlapping subpixels.
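The residual error left by gray-level rendering, and its dependence on feature placement and on the subsampling factor, can be illustrated in one dimension. The following is a minimal sketch, assuming a periodic mask of 64 coarse pixels containing one opening 10 pixels wide, with the collected band taken as orders −8 to 8; these values and the function names are assumptions made for this example.

```python
import cmath
import math


def exact_order(a, b, period, k):
    """Fourier order k of a unit-transmission opening [a, b] with the given period."""
    if k == 0:
        return (b - a) / period
    w = -2j * math.pi * k / period
    return (cmath.exp(w * b) - cmath.exp(w * a)) / (w * period)


def rendered_order(a, b, period, k, sub):
    """Same order from a gray-level rendering with `sub` subpixels per coarse
    pixel, each subpixel being filled uniformly with its covered fraction."""
    step = 1.0 / sub
    total = 0j
    for j in range(int(period * sub)):
        lo, hi = j * step, (j + 1) * step
        covered = max(0.0, min(b, hi) - max(a, lo))
        if covered > 0.0:
            total += (covered / step) * exact_order(lo, hi, period, k)
    return total


def max_error(a, width, sub, period=64, k_max=8):
    """Largest rendering error over the collected orders -k_max..k_max."""
    return max(abs(rendered_order(a, a + width, period, k, sub)
                   - exact_order(a, a + width, period, k))
               for k in range(-k_max, k_max + 1))


aligned = max_error(5.0, 10.0, sub=1)      # edges on pixel boundaries
shifted_1x = max_error(5.3, 10.0, sub=1)   # same opening, shifted by 0.3 pixel
shifted_4x = max_error(5.3, 10.0, sub=4)   # shifted, with 4x subsampling
print(aligned, shifted_1x, shifted_4x)
```

In this sketch the error is negligible when the opening is aligned with the grid, becomes nonzero when the same opening is shifted by a fraction of a pixel, and is reduced but not eliminated by 4× subsampling.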
The high density of the subsampled grid therefore makes a Fourier transformation the most time-consuming step in the state-of-the-art pixel-based image simulation of a mask, despite the speed advantages provided by FFTs.
An alternative proposal for circumventing the unfavorable accuracy/speed tradeoff in direct mask Fourier transformation is to use the traditional sparse-imaging method of corner-based or edge-based table-lookup for manipulating polygonal mask features. One can consider using the same kind of table-lookup algorithm to calculate the limited set of mask spatial frequencies that are captured by the projection lens. Typically, the polygon convolutions that are stored as tables of pre-convolved corners in today's sparse-imaging routines involve at least one such table for each of the ten or so SOCS kernels used. Unfortunately, the Fourier transforms used in dense imaging might still require a much more substantial memory access overhead, since they would typically involve, e.g., between 128² and 1024² orders, depending on the block size chosen, and most of these would fall within the circular support of the imaging kernels in the case where the pupil fill σ approaches 1. As a consequence, at least one table is required for each of the 128² to 1024² Fourier orders.
In sum, while the present methods of rendering a mask make a reasonably balanced compromise between accuracy and speed, the final result is only marginally acceptable by either criterion.