The well-known “wavefront engineering” approach to improved lithographic performance is based on the following consideration: At a fundamental level, it is often easier to maximize the quality of lithographic images by engineering them in the pupil plane, rather than in the object plane. Put differently, it is often simpler (from a fundamental point of view) to derive an imaging wavefront that is suitable for producing a high quality image than to design the mask that would actually be needed to generate the wavefront which forms the image.
We can identify two reasons for this advantage, one conceptual, the other practical.
First, the finite exit-pupil NA is the basic “bottleneck” that actually limits the resolution of lithographic images. (Resist diffusion has a non-negligible impact, but resist resolution is almost always finer than that of the exposure tool.) Here NA stands for Numerical Aperture, which is defined as the product of two quantities, namely the sine of the half-angular range of the light that is converged to form the image, and the refractive index of the medium in which the image is formed. The highest frequency modulation that the image intensity can contain is given by twice the NA divided by the wavelength. Many practical challenges must be considered in state-of-the-art lithography, but the core problem is that imposed by the limited lens resolution. In order to manage that core challenge one would like to “push” the most effective wavefront possible through the available NA. (As used herein, the term “wavefront” is shorthand for the set of mask spatial frequencies that are actually collected by a projection lens, e.g. a photolithographic lens, considering all illumination directions present in the source.) Thus, it can be advantageous to work in the pupil domain when trying to obtain the best possible image, particularly in the case of small critical cells where intensive optimization is appropriate.
A second advantage of working in the pupil domain is that mask variables are somewhat inflexible to work with, compared to wavefront variables. For example, shape constraints come into play during direct optimization of mask variables that are extraneous to the fundamental issue of maximizing image quality. These constraints involve the basic topology of the mask patterns used, along with issues of feasible mask fabrication (e.g. “when edge A is moved out, it cannot be moved closer than distance d to edge B”). Wavefront variables, on the other hand, are continuously adjustable, without mutual constraint. Wavefront variables are a convenient way to reformulate solutions that are derived from mask patterns whose shapes are costly to fabricate directly, such as gray-level masks formed with multiple transmission levels to produce multi-level images. Wavefront variables have another convenient aspect when periodic boundary conditions are imposed on the object, because in such cases wavefronts can be completely represented by a specific discrete set of diffraction orders, or equivalently by the discrete Fourier transform of these orders, and it is these specific orders that form the image of interest. (Periodic boundary conditions are very frequently imposed in lithographic design simulations, either directly because the object is truly periodic, or indirectly because the numerical simulation code uses a discrete grid in the frequency domain.) In contrast, one may not be able to address all intrinsic degrees of freedom in an image by adjusting the positions of available edges in the mask, except when the mask edges are so heavily fragmented as to produce far more nominal mask variables than there are true degrees of freedom in the image. 
That outcome is not assured, and even when all orders can in principle be independently addressed, certain orders may only be coupled very weakly to available edges, depending on the topology of the mask design chosen, and this increases the likelihood that extraneous shape constraints will unnecessarily limit the quality of the solution obtained.
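As a minimal sketch of the discrete-order representation described above, the following Python snippet lists the diffraction orders of a rectangular periodic cell that fall inside the lens pupil. It assumes a single on-axis illumination direction and wafer-scale (1×) cell dimensions. The function name `collected_orders` is hypothetical, and the numbers are the 1120 nm by 560 nm cell, λ=248 nm and NA=0.68 quoted later in this section for the DRAM example.

```python
import numpy as np

def collected_orders(px, py, wavelength, na):
    """Return the (m, n) indices of diffraction orders inside the pupil.

    Assumes on-axis illumination, so order (m, n) travels with direction
    sines (m * wavelength / px, n * wavelength / py).
    """
    m_max = int(np.floor(na * px / wavelength))
    n_max = int(np.floor(na * py / wavelength))
    pairs = []
    for m in range(-m_max, m_max + 1):
        for n in range(-n_max, n_max + 1):
            sx = m * wavelength / px
            sy = n * wavelength / py
            if sx ** 2 + sy ** 2 <= na ** 2:   # inside the circular pupil
                pairs.append((m, n))
    return pairs

# Illustrative values taken from the DRAM example in this section (nm).
orders = collected_orders(px=1120.0, py=560.0, wavelength=248.0, na=0.68)
print(len(orders))   # number of orders that can contribute to the image
```

Under these assumptions only 17 orders reach the image. Off-axis illumination directions shift the diffraction pattern relative to the pupil and so collect a different set.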
Unfortunately, despite its inherent advantages, lithographic design in the pupil plane has one significant disadvantage: the known technology does not provide a practical method for actually realizing the optimal wavefront, i.e. there is no known method for actually constructing a mask using standard photomask technology that will provide a specified wavefront as its diffraction pattern. The issue of practicality is key here. One can, of course, find a mathematically valid mask solution by taking the Fourier transform of the desired wavefront (after choosing some nominally arbitrary [but actually consequential] shape for the uncollected portion of the wavefront); however, this will produce a “mask” that is continuously varying, and so not manufacturable. Producing a specified wavefront with a manufacturable mask is a non-trivial problem.
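The following Python sketch illustrates why the direct Fourier-transform solution is not manufacturable. It assumes a 1D periodic pattern, a hypothetical set of five collected order amplitudes, and the simplest choice of setting the uncollected orders to zero. None of these values come from the source.

```python
import numpy as np

# Hypothetical collected order amplitudes a_m for m = -2..2 (illustrative only).
orders = {-2: -0.20, -1: 0.50, 0: 0.30, 1: 0.50, 2: -0.20}

n_samples = 64                                # samples across one mask period
spectrum = np.zeros(n_samples, dtype=complex)
for m, amplitude in orders.items():
    spectrum[m % n_samples] = amplitude       # uncollected orders left at zero

# Inverse transform: mask transmission that would diffract exactly these orders.
mask = np.fft.ifft(spectrum) * n_samples

# Count distinct transmission values; a binary or trinary mask allows 2 or 3.
levels = np.unique(np.round(mask.real, 6)).size

# Forward transform recovers the specified orders, so the solution is valid.
recovered = np.fft.fft(mask) / n_samples
print(levels, mask.real.min(), mask.real.max())
```

The resulting transmission takes many distinct values across the period, whereas a manufacturable mask is limited to two or three.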
Manufacturable mask features must take the form of openings in a background film, and these openings must be fairly coarse in size (though they can be smaller [when scaled to “1×”] than the minimum-sized features that can actually be developed in resist; also, the perimeters of mask features can contain fine jogs that are smaller than the smallest mask features). Another limitation is that the transmission of each mask opening is, in the simplest instance, fixed at the transmission level of the substrate. Modern masks allow slightly more flexibility than this, but in general feature transmission should be chosen from one or two allowed values (in addition to the background transmission, which may be nonzero), i.e. masks must generally be binary or trinary in order to meet production-grade feature placement specifications, and to contain fabrication cost. For example, in a so-called Levenson mask, the intensity transmission in any region can only be 0 or 100%, and the transmitted phase can only be 0° or 180°. In general, restriction of the phase shift to 0° or 180° causes the transmission to be real-valued, and the resulting pure-real character of the transmitted wavefront causes critical dimensions in the image to have better stability through focus. For this reason practical mask films conventionally have a transmission phase of either 0° or 180°. So-called grey-level masks whose features have more than two different intensity transmissions generally cannot meet practical feature placement requirements.
Critical features in manufacturable masks must nominally be polygonal, i.e. they must be designed with straight edges (though the limited resolution of mask writing technology will cause significant corner rounding). Also, critical features must usually be “Manhattan”, i.e. their edges can only take right-angle turns, with the edges of all different features being parallel or perpendicular to one another. (However, a limited number of features with non-Manhattan edge orientation is sometimes acceptable, such as features with 45° orientation.)
The finite thickness of the patterned mask films poses another practical problem for mask design, since it causes the transmission to locally deviate from its nominal value, particularly in the vicinity of the feature edge. More specifically, the light transmitted through mask apertures will only match the transmission of the mask blank at positions that are somewhat removed from the aperture edge, and likewise the transmission in unopened regions will deviate from the transmission of the background films at positions that are adjacent to aperture edges. The transmission discontinuity arising at the vertical topographic edges of features will therefore not match the nominal discontinuity as defined by the separation between the basic transmission values supported by the mask technology. Such deviations from the nominal behavior are due to the interaction of the electromagnetic fields with the complex topography of the patterned mask films; these deviations are referred to as “EMF” (for electromagnetic field) effects. Roughly speaking, we can regard EMF effects as being a consequence of the finite thickness of the physical films or trenches that are etched out to form the features that are written on the mask. EMF effects usually become more significant as the film thickness becomes relatively larger in comparison to the feature widths and wavelength. Mask films are very roughly of order 70 to 100 nm in thickness, and printed features have until recently been larger than the exposing wavelength (which today is typically 193 nm). Since lithographic masks are usually 4× enlarged, it has thus been reasonably accurate to neglect their topography, and treat them as ideal two dimensional (2D) masks (the so-called Thin-Mask Approximation, or “TMA”). Even today, it remains true that the basic lowest order behavior of lithographic masks is generally captured by the TMA.
However, while EMF effects can usually be regarded as a perturbation on the TMA behavior, the significance of their impact can be quite substantial in the context of the stringent tolerances of photolithography.
As shown in FIG. 1, the finite thickness of mask topography causes perturbations in the transmitted field. The transmission from points that are appreciably distant from the topographic edge is little changed, but the perturbation can become non-negligible near the perimeter of mask apertures, particularly at the small feature sizes characteristic of modern masks. To lowest order, the in-phase (real valued) transmission change is roughly that produced by a small extension or retraction of the associated edge.
As shown in FIG. 2, the thin mask model is usually able to capture the gross behavior of lithographic images; in this example the printed image size is predicted within ~11.5%. (Feature size is 50 nm at all plotted periodicities.) The prediction error becomes less than 2% if the absorber edges in the Thin Mask Approximation (TMA) model are extended (biased) from the edge in a way that mimics the topography-induced transmission change.
The projection lens is incapable of resolving the fine structure of the EMF-induced discontinuity in the fields, and it is known (J. Tirapu-Azpiroz and E. Yablonovitch, “Incorporating mask topography edge diffraction in photolithography simulations,” J. Opt. Soc. Am. A 23,4 (2006): p. 821) that EMF effects can be approximately reproduced using a TMA model in which the edge fields are rendered as small strip-like features of essentially fixed transmission (generally a complex transmission) that are assumed for simulation purposes to lie along the aperture boundaries. More precisely, since these perturbing strips (known as boundary layers) are considerably narrower than the lens resolution, their width can (in first approximation) be modestly re-adjusted as long as a compensating adjustment is made in their transmission, holding the width-transmission product effectively constant. (We qualify this as “effectively” constant because we require that the width-transmission product include the thin-mask transmission that would otherwise have been present in the strip of mask-area that the boundary layer displaces.) When the boundary layer is scaled to have a transmission of order unity in magnitude, its width will usually be very roughly of order λ/20, i.e. boundary layers are usually strongly sub-resolution.
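The width-transmission trade-off can be checked numerically. The sketch below compares the collected diffraction orders of two sub-resolution strips that share the same width-transmission product. It is a simplified illustration: the strip sits alone on an opaque background, so the displaced thin-mask transmission mentioned above is not modeled. The function `strip_orders` and all numerical values are hypothetical.

```python
import numpy as np

def strip_orders(width, transmission, pitch, m):
    """Thin-mask order amplitudes of one centered strip on an opaque background."""
    # np.sinc(x) is sin(pi x) / (pi x), the transform of a unit-width slit.
    return transmission * (width / pitch) * np.sinc(m * width / pitch)

# Hypothetical imaging conditions (nm); not taken from the source.
wavelength, na, pitch = 193.0, 0.9, 500.0
m_max = int(np.floor(na * pitch / wavelength))
m = np.arange(-m_max, m_max + 1)              # orders collected on axis

# Two strips with the same width * transmission product (10 nm * 1j).
a = strip_orders(10.0, 1.0j, pitch, m)
b = strip_orders(5.0, 2.0j, pitch, m)

relative_difference = np.max(np.abs(a - b) / np.abs(a))
print(relative_difference)
```

For these values the collected orders of the two strips differ by well under one percent, which is why the lens cannot distinguish them.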
Since boundary layers are unresolved, the in-phase part of their image contribution is very similar to that which would be obtained by recessing the aperture edge by a distance that would deliver a matching amplitude contribution (or extending the edge to appropriately occlude the illumination, depending on the sign) in the form of a simple bias.
It is known that the impact of EMF effects on transmitted amplitude can, to a first-order approximation, be corrected by simple biasing for the purpose of carrying out mask design in the basic mode known as Optical Proximity Correction (“OPC”); see FIGS. 1 and 2. OPC involves adjusting the position of the topographic edges of mask features in such a way that the contour of the printed image falls at a specified position. Essentially, the EMF-induced incremental change in delivered intensity at the feature edge causes a change in the contour position, and the mask aperture must be biased in the opposite direction to undo the shift. In many cases the simple opaque bias model allows the intensity change to be calculated both accurately and rapidly, making OPC with topographic masks possible.
However, advanced forms of lithographic optimization that aim to print at the extreme limits of resolution must worry about the process robustness of the printed image, and focus sensitivity is a critical aspect of process robustness. Focus sensitivity is impacted by the phase of the transmitted light, and the in-quadrature component of the vertical edge field perturbation cannot be compensated by a shift in edge position (as shown in FIG. 6). As a result, it is only possible to compensate the degradation in focus robustness that EMF induces in an averaged way when shape adjustment is employed as the compensation method. The in-quadrature (or imaginary) component of the EMF perturbation can therefore be considered more critical than the in-phase (or real) part, and the magnitude of the in-quadrature component is largely a function of the mask topography, which in turn depends on the phase and transmission that are chosen for the mask aperture and background regions.
As shown in FIGS. 3A, 3B and 5, the main impact of the in-quadrature (imaginary valued) component of EMF-induced image changes is a pitch dependent focus shift. The shift of plane of best focus with feature size degrades the common window of the process or “common process window”. The term “Common PW” is short for common process window, and refers to the range of fluctuations in dose and focus over which the fluctuations in a lithographic image remain within tolerance.
As shown in FIGS. 4A-4B, the approximate boundary layer model of EMF effects provides a reasonably accurate calculation of the feature-dependent shifts in focus that are produced by mask topography, with broadly accurate results being obtained down to quite small feature sizes.
In many cases the wavefronts which produce the best-performing images can only be created from masks which have transmitting regions of both 0° and 180° phase, since the availability of both polarities makes it easier to form adjacent bright areas of the image with fields of opposite sign, creating a high contrast dark fringe between the bright features where the field passes through zero amplitude as it changes sign. Such opposite phases can also be produced using the tilt-phase that is generated with off-axis illumination, but this is less flexible than deploying phase-shift on the mask when complex patterns are involved. Unfortunately, topography effects make it hard to maintain the benefits of phase shift imaging as the dimensions of mask features shrink. EMF effects increase as topographic-edge-regions occupy an increasingly large portion of the mask area, and the three-dimensional (3D) topographic step that is present between regions that are phase-shifted tends to be relatively large. As noted above, the field in the vicinity of the step exhibits a phase that is different from the 0° and 180° phases that are attained in the extended open areas on either side of the step. These latter nominal transmittances are pure real (in-phase) even though phase shifters have been employed, but the magnitude of the imaginary (in-quadrature) component that EMF effects induce at vertical topographic edges will tend to be larger with the relatively thick films that phase-shift masks typically employ. This localized quadrature component can cause focus shifts even for opaque binary masks, and in general the mis-phased field will occupy a larger fraction of the transmitted beam when features are small. And as we have seen, this quadrature error also makes it impossible to fully correct the impact of EMF by shape adjustment alone.
FIG. 5 shows focal behavior of printed features when a known mask of finite thickness topography is used. Images from TMA masks have a desirable zone of focal stability that is centered at z=0, since the derivative of image intensity with respect to z will be zero at that focus (assuming the usual symmetric source). However, when the thickness of mask topography is non-negligible, one sees from plots like these of feature size vs focus (so-called Bossung curves) that the positions of best focus (center of the regions of focal stability) are shifted away from z=0 in a non-uniform, feature-dependent way.
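The focal symmetry of thin-mask images, and its loss when an in-quadrature edge field is present, can be reproduced with a small scalar model. The sketch below is an illustration only: it assumes a 1D periodic mask, coherent on-axis illumination, and a boundary layer represented by adding a purely imaginary transmission to 10 nm strips just inside each aperture edge. The function names and every numerical value are hypothetical.

```python
import numpy as np

def aerial_image(mask, pitch, wavelength, na, z):
    """Scalar image intensity of one mask period at defocus z (on-axis source)."""
    n = mask.size
    orders = np.fft.fft(mask) / n                  # diffraction order amplitudes
    m = np.fft.fftfreq(n, d=1.0 / n)               # integer order indices
    s = m * wavelength / pitch                     # direction sine of each order
    passed = np.abs(s) <= na                       # orders inside the pupil
    cos_theta = np.sqrt(1.0 - np.where(passed, s, 0.0) ** 2)
    defocus = np.exp(2j * np.pi * z * (cos_theta - 1.0) / wavelength)
    field = np.fft.ifft(np.where(passed, orders * defocus, 0.0)) * n
    return np.abs(field) ** 2

# Hypothetical conditions (nm): 500 samples of 1 nm across a 500 nm pitch.
wavelength, na, pitch = 193.0, 0.9, 500.0
thin = np.zeros(500, dtype=complex)
thin[200:300] = 1.0                                # real-valued 100 nm aperture

topo = thin.copy()                                 # same aperture plus edge strips
topo[200:210] += 1.0j                              # in-quadrature boundary layer
topo[290:300] += 1.0j

def focus_asymmetry(mask, z=100.0):
    """Largest intensity difference between the +z and -z focal planes."""
    plus = aerial_image(mask, pitch, wavelength, na, +z)
    minus = aerial_image(mask, pitch, wavelength, na, -z)
    return np.max(np.abs(plus - minus))

thin_asymmetry = focus_asymmetry(thin)
topo_asymmetry = focus_asymmetry(topo)
print(thin_asymmetry, topo_asymmetry)
```

In this model the real-valued mask gives identical images at equal and opposite defocus, so z=0 is a stationary point, while the mask with quadrature strips does not. That loss of symmetry is consistent with a shifted plane of best focus.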
As shown in FIGS. 3A-3B, biasing cannot correct focal shifts that are caused by the quadrature component of the EMF perturbation. At a fixed focus position, a TMA calculation using a biased mask is incapable of reproducing the true topographic EMF behavior through the full dose range.
The known technology provides only limited means for dealing with these practical difficulties of wavefront engineering. Consider first the limited flexibility that adjustment of conventional mask shapes provides, and the inability of such adjustments to easily address all degrees of freedom in the image. If one is willing to set aside issues of mask manufacturability, there is a known method for optimization of lithographic images that operates in the mask plane, while managing to capture much of the flexibility of wavefront design; this is the method of image optimization using high density bitmap masks, in which every pixel is independently adjustable, and where the pixels are so small as to provide effectively continuous addressability of the mask. Bitmap masks provide the flexibility needed to achieve optimal images, but they contain far more variables than necessary (which severely slows most optimization algorithms). Also, bitmap masks are not practically manufacturable. State-of-the-art mask technology typically requires that isolated mask openings (e.g. bitmap pixels in the case of bitmap masks) be sized larger than perhaps ¼ the width of the smallest feature that can actually be resolved (i.e. printed) in a single wafer image (except scaled up by the lens magnification). The edges of mask features can contain jogs that are much finer than this, but small jog-like serifs do not remove the practical difficulty in fabricating bitmap masks, for the following reason: Since bitmap pixels represent a large number of independent variables, they will be highly redundant, hence many of the pixel adjustments that improve the objective function are likely to be spatially isolated from other pixels of the same polarity as the particular pixel that is actually adjusted at any given step, and the resulting small isolated pixel apertures are not manufacturable.
This lack of contiguity can be circumvented when the problem is linear, but mask optimization problems are inherently quadratic (at best), since the exposing intensity is a quadratic function of diffracted amplitude. Shape constraints can be included in the optimization procedure to inhibit the use of isolated pixels, but then the algorithm becomes bound once again by topological constraints that are irrelevant to the imaging process itself (where the working solution should be able to represent any imaging wavefront that can be propagated through the bandlimiting lens NA), and in addition the working solution can fall into extraneous local minima that involve non-essential topological constraints arising from happenstance clustering. Often these manufacturability requirements are addressed by adding penalty terms to the objective function, but performance is then penalized when the objective is re-weighted to emphasize manufacturability, and in addition the manufacturability requirements are often incompletely satisfied.
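The quadratic dependence can be written out for the simplest case. As a sketch, assume a 1D object of period $p$, a single illumination direction, and scalar imaging, and let $a_m$ denote the amplitude of the $m$-th collected diffraction order (the symbols $a_m$ and $p$ are illustrative assumptions):

```latex
I(x) = \left| \sum_{m} a_m \, e^{2\pi i m x / p} \right|^{2}
     = \sum_{m} \sum_{m'} a_m \, a_{m'}^{*} \, e^{2\pi i (m - m') x / p}
```

Every term is a product of two order amplitudes, so the intensity is bilinear in whatever mask variables set the $a_m$. With an extended source the intensity is a sum of such expressions over illumination directions.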
Though lithographic design in the pupil plane has been known for many years (e.g. under the rubric of “wavefront engineering”), the above disconnect from mask fabrication has generally restricted wavefront engineering to the role of conceptual aid, rather than full working procedure. One-dimensional patterns are a partial exception to this; known methods for laying out one dimensional (1D) assist features provide a fairly complete link between the desired 1D diffraction patterns and feasible masks. Smith (B. W. Smith, “Mutually Optimizing Resolution Enhancement Techniques: Illumination, APSM, Assist Feature OPC, and Gray Bars”, SPIE v.4346—Optical Microlithography XIV, (2001): p. 471) provides a discussion of pupil-plane optimization and the associated determination of suitable 1D masks.
However, it would be desirable to have a method for producing an arbitrary wavefront within the lens exit pupil, without being restricted to 1D. Such a method could in principle be used to produce any image that a given litho exposure tool is theoretically capable of. This includes images that have been designed using wavefront variables, as well as images which known lithographic methods could only produce using idealized masks whose fabrication would be impractical, such as images from non-manufacturable gray-level masks that employ more than two intensity transmission levels, or images from masks that contain non-manufacturable aperture shapes. Such a method could in addition produce images that are initially designed using impractical idealized mask solutions, and then further refined using wavefront variables. In general, problems of practical mask fabrication would be separated from the core problem of determining the best possible image.
Rosenbluth et al. took an important step towards such a capability with an algorithm described in A. E. Rosenbluth et al., “Optimum Mask and Source Patterns to Print a Given Shape,” JM3, 1, 1 (2002), p. 13. This reference shows how to devise a binary or trinary mask that will reproduce a specified diffraction pattern by solving a single linear programming (LP) problem. Mask features provided by this LP will usually take the form of reasonably large contiguous mask openings, rather than the tiny isolated halftones of bitmap masks. (It should be noted that while the features in the LP solution are usually of practical size, they can also include unrealistically fine “tendrils”, which in the Rosenbluth et al. method are essentially removed by manual intervention.)
However, a drawback to this known method is that the features provided are very far from Manhattan: feature edges not only have arbitrary orientation, but are actually curved in complex ways. FIG. 7 shows an example, namely a binary mask (transmission=±1) that produces an optimized diffraction pattern for a dynamic random access memory (DRAM) isolation level (see FIG. 8), generated using a known method. The width of the cell is about 3λ/NA and its height about 1.5λ/NA, with λ=248 nm and NA=0.68. Unfortunately, these curved mask geometries are not manufacturable, due both to lack of Manhattan (or even polygonal) apertures, and the presence of a few overly fine connections between the generally contiguous apertures.
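As a quick arithmetic check of the cell size in units of λ/NA, the following Python lines use the 1120 nm by 560 nm periodicity given in the FIG. 8 description. The variable names are illustrative.

```python
wavelength_nm = 248.0
na = 0.68
cell_x_nm, cell_y_nm = 1120.0, 560.0     # periodicity quoted for FIG. 8

unit_nm = wavelength_nm / na             # one lambda/NA, about 365 nm
width_units = cell_x_nm / unit_nm        # cell width in units of lambda/NA
height_units = cell_y_nm / unit_nm       # cell height in units of lambda/NA
print(round(width_units, 2), round(height_units, 2))   # prints 3.07 1.54
```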
It is possible with some trial and error to semi-manually derive a Manhattan layout from masks produced by this algorithm (e.g. the above paper by Rosenbluth et al. shows a Manhattan mask that is semi-manually derived from the FIG. 7 solution). To do so one draws on the plotted mask a staircased line that approximately follows the perimeter of each mask region. One then reads the coordinates of the staircase corners from the plot, and enters them into an optimization program which attempts to reproduce the desired diffraction orders by adjusting the corner positions. Convergence is very fast if the staircasing is fine, but the masks then become more difficult to fabricate. On the other hand, the corner optimizer typically fails to converge when the staircasing is coarse. Usually one can find an acceptable compromise after a bit of trial and error.
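One side of this trade-off, namely how closely a staircased Manhattan outline reproduces the low diffraction orders of a sloped edge, can be sketched as follows. This is not the corner optimizer itself: it only evaluates thin-mask orders of rectangle-decomposed shapes and compares coarse and fine staircases against a very fine reference. The functions `rect_orders` and `staircase`, the triangular test shape, and its 400 nm size are all hypothetical.

```python
import numpy as np

def rect_orders(rects, px, py, m_max, n_max):
    """Thin-mask diffraction orders of a mask built from axis-aligned rectangles.

    Each rectangle is (x_center, y_center, width, height, transmission).
    """
    m = np.arange(-m_max, m_max + 1)[:, None]
    n = np.arange(-n_max, n_max + 1)[None, :]
    total = np.zeros((2 * m_max + 1, 2 * n_max + 1), dtype=complex)
    for xc, yc, w, h, t in rects:
        amplitude = t * (w * h) / (px * py) * np.sinc(m * w / px) * np.sinc(n * h / py)
        total += amplitude * np.exp(-2j * np.pi * (m * xc / px + n * yc / py))
    return total

def staircase(side, steps):
    """Equal-width columns approximating a right triangle bounded by y = x."""
    w = side / steps
    return [((k + 0.5) * w, 0.5 * (k + 0.5) * w, w, (k + 0.5) * w, 1.0)
            for k in range(steps)]

px, py = 1120.0, 560.0      # cell periods quoted for FIG. 8 (nm)
side = 400.0                # hypothetical triangle leg length (nm)

# A 512-step staircase stands in for the ideal sloped edge.
reference = rect_orders(staircase(side, 512), px, py, 3, 1)
err4 = np.max(np.abs(rect_orders(staircase(side, 4), px, py, 3, 1) - reference))
err16 = np.max(np.abs(rect_orders(staircase(side, 16), px, py, 3, 1) - reference))
print(err4, err16)
```

The finer staircase matches the reference orders more closely, but at the cost of four times as many jogs. A corner optimizer starting from the coarse staircase has a larger mismatch to remove with fewer adjustable corners.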
However, this method is far from ideal. First, the final mask features usually contain a large number of difficult-to-fabricate jogs and serifs, i.e. protruding features with aspect ratio of order 1 that have two or more edges with length near the limit of fabricability. Fragments that protrude only slightly from a long edge (i.e. having aspect ratios far from 1) are not a significant concern, nor are near-unit-aspect-ratio structures that are relatively large. A limited number of more difficult jogs (of small but acceptable size, and compact aspect ratio) can be handled, and these jogs can be quite a bit smaller than the minimum allowable isolated mask feature (i.e. it is acceptable to have small jogs that merely adjust the perimeter of a larger, fully resolved feature).
FIG. 8 shows a DRAM isolation pattern used as an example to explain the present method. Rectangles should be printed as dark. Periodicity of rectangular optical unit cell is 1120 nm in the x direction, 560 nm in the y direction.
Unfortunately, a hand-staircased solution often contains more such jogs than is desirable, and also more jogs than are fundamentally necessary to reproduce the diffraction pattern. Another disadvantage of the hand-staircasing method is simply that it is a manual procedure, and so is time-consuming and prone to error. Also, very similar patterns may be staircased in appreciably different ways if the human engineer involved does not recognize or recall previously handled cases. Ideally this would not matter, since all solutions will nominally produce the same image; in practice, however, it would tend to increase variation in Critical Dimensions (CDs) across the printed chip level.