1. Field of the Invention
The present invention relates to thermal processing of a substrate, and in particular relates to methods and apparatus for irradiating a substrate in a manner that avoids damaging the substrate.
2. Description of the Prior Art
Laser thermal processing (LTP), also known more generally as rapid thermal processing (RTP), is a technique for manufacturing semiconductor devices such as integrated circuits or “ICs”. LTP involves irradiating a substrate with a beam of intense radiation to bring the substrate surface from a relatively low temperature (e.g., 400° C.) to a relatively high temperature (e.g., 1,200° C.) rapidly enough that the cooler substrate bulk can then pull the surface temperature back down quickly.
One prime application of LTP is dopant activation of the source/drain regions of transistors formed in a silicon wafer. The source/drain regions are typically formed by exposing areas of a silicon wafer to an electrostatically accelerated ion beam containing boron, phosphorus or arsenic ions. After implantation, the dopant atoms are largely interstitial, do not form part of the silicon crystal lattice, and are electrically inactive. Activation of these dopant atoms is achieved by raising the substrate temperature high enough and for a period of time long enough for the crystal lattice to incorporate the impurity atoms. The optimum length of time depends on the maximum temperature. However, during the activation thermal cycle, the impurities tend to diffuse throughout the lattice, causing the distribution to change from one approximating an ideal step profile during implant to a profile having a shallow exponential fall-off.
By employing higher annealing temperatures and shorter annealing times, it is possible to reduce dopant diffusion and retain the abrupt step-shaped dopant distribution achieved after the implant step. The continuous reduction in transistor feature sizes has led to a process called Laser Spike Annealing (LSA), which employs a CO2 laser beam formed into a long, thin image that is raster scanned across the wafer. In a typical configuration, a 0.1 mm wide beam is scanned at 100 mm/s over the wafer surface to produce a 1 millisecond dwell time for the annealing cycle. A typical maximum temperature during this annealing cycle might be ~1350° C. In the 1 millisecond time it takes to bring the wafer surface up to the annealing temperature, only about 100–200 microns of material nearest the upper surface is heated. Consequently, the bulk of the 800 micron thick wafer serves to cool the irradiated surface almost as quickly as it was heated once the laser beam moves past.
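The dwell time quoted above is simply the beam width divided by the scan speed, and the heated depth can be estimated from the thermal diffusion length sqrt(D·t). The following is a minimal sketch of that arithmetic. The function names and the thermal diffusivity value are illustrative assumptions not taken from this disclosure; the diffusivity of silicon falls strongly with temperature, so the single value used here is only a rough figure for a hot surface layer.

```python
import math

# ASSUMPTION: rough effective thermal diffusivity of hot silicon, in cm^2/s.
# This number is illustrative only and is not stated in the text.
ASSUMED_DIFFUSIVITY_CM2_PER_S = 0.2


def dwell_time_s(beam_width_mm: float, scan_speed_mm_per_s: float) -> float:
    """Time a point on the surface spends under the scanned beam."""
    return beam_width_mm / scan_speed_mm_per_s


def diffusion_length_um(diffusivity_cm2_per_s: float, time_s: float) -> float:
    """Order-of-magnitude heated depth, sqrt(D * t), converted to microns."""
    length_cm = math.sqrt(diffusivity_cm2_per_s * time_s)
    return length_cm * 1.0e4  # 1 cm = 10,000 microns


# Values from the text: 0.1 mm wide beam scanned at 100 mm/s.
t = dwell_time_s(0.1, 100.0)
depth = diffusion_length_um(ASSUMED_DIFFUSIVITY_CM2_PER_S, t)
print(f"dwell time: {t * 1e3:.2f} ms, estimated heated depth: {depth:.0f} microns")
```

With the assumed diffusivity, the estimate falls inside the 100–200 micron range stated above, well short of the 800 micron wafer thickness.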
A wafer that has been processed to the point where it is ready for the annealing step typically contains a number of thin film structures such as gates, poly-silicon runners and pads, and oxide isolation trenches. These structures may be superimposed. The distribution of these structures varies from region to region across a circuit, depending on the function of a particular area of the circuit. Typically, the reflectivity of the circuit varies depending on the proportion of the various structures present in a given region. This leads to substantial variations in the proportion of the laser beam energy absorbed in any area and thus uneven heating.
In some cases, even a 5° C. variation in the maximum annealing temperature can lead to observable performance issues for the circuits being annealed. This temperature variation might correspond to less than a 0.5% variation in the absorption coefficient of the product wafer surface. A minimum variation in absorption can be achieved by using P-polarized radiation incident on the substrate at or near the Brewster angle. P-polarized radiation incident on an undoped silicon surface is completely absorbed at the Brewster angle. In the case of a patterned wafer, the Brewster angle refers to the angle of minimum or near-minimum reflectivity of P-polarized light from a surface. Strictly speaking, films on the surface of an object such as a silicon wafer, or electrically active dopants in the silicon, prevent it from having a true Brewster angle. Accordingly, the Brewster angle as used herein for a specular surface formed from a variety of different films stacked on a substrate (as is the case for a product IC wafer) can be thought of as an effective Brewster angle, or the angle at which the reflectivity of P-polarized radiation is at a minimum. This angle of minimum reflectivity typically coincides with or is near the true Brewster angle for the (bare) substrate.
A further reduction in reflectivity variation can be achieved by using wavelengths that are large compared to the device structures on the wafer. This condition is met with the 10.6 micron CO2 laser. The Brewster angle for bare silicon at 10.6 microns is about 75° from normal incidence.
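As a rough check on the angle quoted above, the Brewster angle of an ideal, lossless interface is arctan(n_substrate/n_ambient), and the Fresnel reflectivity for P-polarized light vanishes there. The sketch below assumes a real refractive index of about 3.42 for bare silicon near 10.6 microns, a value that is not stated in this disclosure, and it ignores films, dopants and absorption. The function names are hypothetical.

```python
import math

# ASSUMPTION: real refractive index of bare silicon near 10.6 microns.
ASSUMED_N_SILICON = 3.42


def brewster_angle_deg(n_substrate: float, n_ambient: float = 1.0) -> float:
    """Brewster angle, measured from the surface normal, in degrees."""
    return math.degrees(math.atan2(n_substrate, n_ambient))


def reflectivity_p(theta_deg: float, n_substrate: float,
                   n_ambient: float = 1.0) -> float:
    """Fresnel power reflectivity for P-polarized light at a lossless interface."""
    theta_i = math.radians(theta_deg)
    cos_i = math.cos(theta_i)
    # Snell's law gives the refracted angle inside the substrate.
    sin_t = n_ambient * math.sin(theta_i) / n_substrate
    cos_t = math.sqrt(1.0 - sin_t * sin_t)
    r_p = ((n_substrate * cos_i - n_ambient * cos_t)
           / (n_substrate * cos_i + n_ambient * cos_t))
    return r_p * r_p


theta_b = brewster_angle_deg(ASSUMED_N_SILICON)
print(f"Brewster angle: {theta_b:.1f} degrees from normal")
for angle in (0.0, 60.0, theta_b, 75.0):
    r = reflectivity_p(angle, ASSUMED_N_SILICON)
    print(f"R_p at {angle:5.1f} degrees: {r:.4f}")
```

Under these assumptions the computed angle lands within a couple of degrees of the "about 75°" figure, and the P-polarized reflectivity stays small over a range of angles around the minimum.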
FIG. 1 is a schematic side view of a substrate 10 with an upper surface 12 with an associated surface normal N12. Substrate 10 includes an outer edge 14 with an associated edge normal N14. Unlike surface normal N12, whose direction is the same for points on the upper surface, the edge normal N14 varies in direction as a function of the polar angle φ (see FIG. 4) in the X-Y plane.
Substrate 10 includes a narrow annular exclusion zone 18 of width WE (FIG. 4) that runs around the upper surface 12 adjacent outer edge 14. Exclusion zone 18 is the region between the substrate edge and the process area 19, which is the portion of the substrate where full yield is expected when producing semiconductor devices such as ICs. Substrate 10 is shown being irradiated with a radiation beam 20 that performs LTP of the substrate by scanning the beam over the upper surface. Radiation beam 20 is incident upon substrate upper surface 12 at a surface incident angle θ with respect to surface normal N12. Surface incident angle θ may be, for example, the (effective) Brewster angle for the substrate. The intensity I(θ) of radiation beam 20 at substrate surface 12 is given by I(θ)=I0 cos(θ), wherein I0 is the baseline radiation intensity measured normal to the radiation beam.
When irradiating the substrate at a high incident angle θ (e.g., ~75°) with scanned radiation beam 20, substrate edge 14 on the far side FS (relative to the incident direction of the radiation beam) never sees the incident radiation beam, even when the beam moves from position A to position B. However, substrate edge 14 on near side NS is prone to exposure by radiation beam 20 when the beam is in position A. Further, radiation beam 20 makes an incident angle ψ with respect to edge normal N14, wherein ψ=90°−θ when φ=0°. Thus, if θ=75°, then ψ=15°. Accordingly, if radiation beam 20 is in position A, the intensity I at the near side substrate edge 14 is approximately 3.73 times greater than the intensity at surface 12, since cos(15°)/cos(75°)≈3.73. This can raise the near-side substrate edge temperature to a level sufficient to damage the substrate at the substrate edge (e.g., by forming fractures 30).
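The edge-to-surface intensity ratio follows directly from the cosine relation I(θ)=I0 cos(θ) applied once with the surface incident angle θ and once with the edge incident angle ψ=90°−θ. The sketch below covers only the φ=0° case described above; the function names are hypothetical and are introduced for illustration.

```python
import math


def surface_intensity(i0: float, theta_deg: float) -> float:
    """Intensity on the upper surface: I0 * cos(theta)."""
    return i0 * math.cos(math.radians(theta_deg))


def edge_intensity_phi0(i0: float, theta_deg: float) -> float:
    """Intensity on the near-side edge at phi = 0, where psi = 90 - theta."""
    psi_deg = 90.0 - theta_deg
    return i0 * math.cos(math.radians(psi_deg))


def edge_to_surface_ratio(theta_deg: float) -> float:
    """Ratio of edge to surface intensity at phi = 0 (independent of I0)."""
    return edge_intensity_phi0(1.0, theta_deg) / surface_intensity(1.0, theta_deg)


# Value from the text: theta = 75 degrees gives a ratio of about 3.73.
print(f"edge/surface intensity ratio at 75 degrees: {edge_to_surface_ratio(75.0):.2f}")
```

Because cos(90°−θ)=sin(θ), the ratio reduces to tan(θ), so it grows rapidly as the beam approaches grazing incidence.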
At first glance, it might appear that this problem is easily solved by simply blocking the portion of radiation beam 20 that strikes wafer edge 14. However, when radiation beam 20 is coherent, as is typically the case for LTP or other irradiative processes requiring a high-power beam, a baffle that blocks a portion of the radiation beam before it reaches substrate edge 14 on near side NS also diffracts the radiation beam. The diffracted radiation interferes constructively or destructively with the portion of the beam directly incident on the substrate, depending on the position. This causes some regions of substrate surface 12 to be overexposed while other regions are underexposed. The variation in exposure caused by diffraction can be as high as 20% or more. Thus, any attempt to block the beam from striking the substrate edge results in an unacceptable non-uniformity at the substrate surface that extends well beyond the narrow exclusion zone 18, making this shielding approach an untenable solution.
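The size of such diffraction ripples can be illustrated with the textbook Fresnel pattern behind an opaque straight edge. The sketch below is an idealization and not a model of any particular baffle: it assumes a monochromatic plane wave, a perfectly sharp edge and normal incidence, and it works in the dimensionless Fresnel parameter v rather than in physical distance. The function names are hypothetical, and the Fresnel integrals are evaluated with a simple Simpson's rule so that only the standard library is needed.

```python
import math


def fresnel_integrals(v: float, steps: int = 400) -> tuple[float, float]:
    """Return (C(v), S(v)) using Simpson's rule; steps must be even."""
    if v == 0.0:
        return 0.0, 0.0
    h = v / steps
    c_sum = 0.0
    s_sum = 0.0
    for k in range(steps + 1):
        t = k * h
        # Simpson weights: 1 at the ends, 4 at odd nodes, 2 at even nodes.
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        phase = 0.5 * math.pi * t * t
        c_sum += weight * math.cos(phase)
        s_sum += weight * math.sin(phase)
    return c_sum * h / 3.0, s_sum * h / 3.0


def knife_edge_relative_intensity(v: float) -> float:
    """Intensity relative to the unobstructed beam; v > 0 is the lit side."""
    c, s = fresnel_integrals(v)
    return 0.5 * ((c + 0.5) ** 2 + (s + 0.5) ** 2)


# Scan the illuminated side and report the largest and smallest ripple values.
samples = [knife_edge_relative_intensity(0.01 * k) for k in range(100, 301)]
print(f"max relative intensity on lit side: {max(samples):.3f}")
print(f"min relative intensity on lit side: {min(samples):.3f}")
```

Even in this idealized case the intensity on the illuminated side swings well above and below the unobstructed level, which is consistent in magnitude with the exposure variation described above.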
Simply turning the radiation beam off and on as it approaches and recedes from the wafer edge is also not particularly effective. This is because the wafer edge exclusion band 18 is typically only 3 mm wide, while radiation beam 20 has a typical width of 6 to 10 mm. Thus, a linear scan of substrate surface 12 near edge 14 results in either some of the edge being directly exposed by the beam, or the beam being turned off before all of the desired area of the substrate surface has been exposed.
The '739 patent application solves the edge exposure problem by utilizing an optical system that includes an anamorphic relay, an apodized aperture, and a vignetting edge moving in synchronism with the scan. However, this is a relatively complex and expensive solution to the problem.