Advances in the miniaturization of CMOS devices have been the key driving force behind the explosive growth of various network-centric computing products such as ASIC high-speed microprocessors and memories, low-power hand-held computing devices, cable modems and advanced multi-media audio and video devices. Smaller CMOS devices also mean faster switching times for speedier, better-performing end-user systems.
The process of miniaturization of CMOS device technology involves scaling down the various horizontal and vertical dimensions of the CMOS transistor device structure (see FIG. 1). In particular, the thickness of the ion-implanted source/drain junctions of the P or N transistors is scaled down, with a corresponding scaled increase in the substrate channel doping. In this manner, a constant electric field is maintained in the transistor channel, which results in higher speed performance for the scaled-down CMOS transistors. As shown in FIG. 1(b), which schematically shows a 0.1 .mu.m CMOS device according to the present invention, the source/drain ion-implanted junctions are as shallow as 50 nm, with a channel doping concentration as high as 1E18 /cm.sup.3.
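The scaling rule described above can be illustrated with a minimal sketch. It assumes the classical constant-field rule, in which linear dimensions shrink by a factor 1/k while the channel doping rises by k. The function name and the starting values (125 nm junctions, 4E17 /cm.sup.3 doping, k = 2.5) are hypothetical and are chosen only so that the result lands on the 50 nm and 1E18 /cm.sup.3 figures quoted in the text.

```python
def constant_field_scale(junction_depth_nm, channel_doping_cm3, k):
    """Apply constant-field scaling by a factor k > 1.

    Linear dimensions (here the junction depth) shrink by 1/k while the
    channel doping rises by k, keeping the channel electric field constant.
    """
    if k <= 0:
        raise ValueError("scaling factor k must be positive")
    return junction_depth_nm / k, channel_doping_cm3 * k


# Hypothetical starting point (not from the text): 125 nm junctions and
# 4E17 /cm^3 channel doping, scaled by k = 2.5.
depth_nm, doping_cm3 = constant_field_scale(125.0, 4e17, 2.5)
print(f"junction depth: {depth_nm:.0f} nm, channel doping: {doping_cm3:.1e} /cm^3")
```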
For CMOS devices with a critical gate dimension below 0.25 .mu.m, a shallow junction is not the only requirement. The more important requirement for the source/drain shallow junction implant is the abruptness of the junction dopant profile slope in close proximity to the transistor channel. As shown in FIG. 2(a), more source/drain dopant penetrates into the transistor channel as the junction profile slope becomes less abrupt. This results in poor threshold voltage (V.sub.t) roll-off characteristics for sub-quarter-micron CMOS devices (see, for example, FIG. 2(b)). Thus, for small advanced CMOS devices, it is vital for the source/drain junction profile to be shallow and abrupt, with a high surface concentration: the junction depth is below 50 nm, the profile abruptness at a channel doping concentration of 1E18 /cm.sup.3 is preferably less than 10 nm per order of magnitude change in dopant concentration (more preferably 6 nm per order of magnitude), and the junction surface concentration is higher than 1E20 /cm.sup.3.
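As an illustration of the abruptness figure of merit, the following sketch estimates the profile slope in nm per order of magnitude (nm per decade) at a chosen concentration from a sampled depth profile. The function name and the synthetic profile are assumptions made for this example. The profile is an idealized exponential tail built to fall by one decade every 6 nm, and is not measured data.

```python
import math


def abruptness_nm_per_decade(depth_nm, conc_cm3, at_conc=1e18):
    """Local profile abruptness (nm per decade) where the profile crosses at_conc.

    depth_nm and conc_cm3 are equal-length sequences sampled from the surface
    inward; the concentration is assumed to fall with depth near the crossing.
    """
    for i in range(len(depth_nm) - 1):
        c_hi, c_lo = conc_cm3[i], conc_cm3[i + 1]
        if c_hi >= at_conc >= c_lo and c_hi > c_lo:
            decades = math.log10(c_hi) - math.log10(c_lo)
            return (depth_nm[i + 1] - depth_nm[i]) / decades
    raise ValueError("profile does not cross the requested concentration")


# Synthetic, idealized profile (an assumption, not measured data):
# 1E20 /cm^3 at 10 nm, falling by one decade every 6 nm.
depths = [10 + 2 * i for i in range(16)]
concs = [1e20 * 10 ** (-(x - 10) / 6.0) for x in depths]

slope = abruptness_nm_per_decade(depths, concs, at_conc=1e18)
print(f"abruptness at 1E18 /cm^3: {slope:.2f} nm/decade")
print("meets the preferred 10 nm/decade target:", slope < 10.0)
```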
The formation of source/drain junctions in CMOS devices is commonly done by ion implantation of boron (p-type) or arsenic and phosphorus (n-type) dopants into appropriately masked source/drain regions of the silicon. To minimize ion channelling during the ion implantation, which would broaden the as-implanted profile, the silicon substrate is usually pre-amorphized with heavy ions such as Ge or Si. Although the pre-amorphization process helps to sharpen the as-implanted profile and improves the epitaxial silicon regrowth process during subsequent thermal annealing, it also creates extensive crystal damage and excess Si interstitials at the end of range (EOR) of the pre-amorphization ions. During thermal annealing, the presence of this EOR damage and these excess Si interstitials greatly enhances (10 to 1000 times) the diffusion of dopants through the Si substrate and results in much deeper source/drain (S/D) junctions and poorer junction profiles. The relatively high diffusivity of the small boron dopant, in combination with ion channelling and transient diffusion, makes the fabrication of small CMOS p.sup.+ junctions particularly difficult, and this has become the major hurdle to be overcome for further miniaturization of CMOS device technology.
There are three basic time regimes for an annealing process such as this. The first is "adiabatic" annealing, in which the surface is heated to very high temperatures in a very short time, typically less than 100 ns. This substantially prevents thermal transport into the material and thereby develops an extremely high temperature at the surface of the silicon (close to or above melting). Some defects can arise from the rapid cooling of the silicon. This form of adiabatic heating requires localized application of high-power pulsed excimer laser radiation and results in the melting and recrystallization of the thin top layer of the implanted wafer.
The second regime, "Thermal flux", operates on a time scale of about 1 ms. In this mode, a strong thermal gradient is established through the thickness of the wafer. This can be achieved by rapidly raster scanning a CW laser or high power electron beams across the top surface of the wafers. It is desirable to anneal the entire wafer to prevent lateral thermal diffusion of the dopant at the edge of a single pass and the formation of stress between adjacent regions of the wafer. The total joule energy necessary for this is large, due to convection and black-body (infra-red) losses during the processing. Unfortunately, there is no currently known method for accomplishing this. It is this time regime that the current invention addresses.
The third regime is considered "isothermal", with a time scale of 1 second or more. In this time regime, the thermal energy can be efficiently transported throughout the thickness of the wafer, producing a substantially uniform temperature profile across the thickness of the wafer. The result is substantially independent of the method of heating, since thermal conduction dominates. This mode is widely used today in manufacturing processes and is commonly achieved by illuminating the wafers with a bank of high-power tungsten halogen lamps.
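One way to see why the three time scales behave so differently is to compare the thermal diffusion length, roughly sqrt(alpha * t), with the wafer thickness. The sketch below does this under stated assumptions: the thermal diffusivity of hot silicon is taken as about 0.1 cm.sup.2 /s and the wafer thickness as 725 .mu.m. Both values are illustrative order-of-magnitude figures and do not come from the text.

```python
import math

ALPHA_CM2_PER_S = 0.1       # assumed thermal diffusivity of hot silicon
WAFER_THICKNESS_UM = 725.0  # assumed wafer thickness


def thermal_diffusion_length_um(duration_s, alpha_cm2_per_s=ALPHA_CM2_PER_S):
    """Approximate distance heat travels in duration_s, in micrometres."""
    return math.sqrt(alpha_cm2_per_s * duration_s) * 1e4  # cm -> um


# Representative durations taken from the three regimes described above.
regimes = [("adiabatic", 100e-9), ("thermal flux", 1e-3), ("isothermal", 1.0)]
for name, duration_s in regimes:
    length_um = thermal_diffusion_length_um(duration_s)
    through = length_um >= WAFER_THICKNESS_UM
    print(f"{name:>12}: {length_um:8.1f} um, heat spans wafer thickness: {through}")
```

Under these assumptions the heated depth is on the order of a micron for a 100 ns pulse, a fraction of the wafer thickness at 1 ms, and well beyond the wafer thickness at 1 second, which matches the surface-only, gradient, and uniform-temperature descriptions of the three regimes.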
In early transistor junction formation, the annealing process was performed in a conventional furnace. In this type of process, wafers are kept for long times (a few hours or more) at temperatures below the optimum process temperatures. However, it was realized that, to limit the dopant diffusion and improve defect anneal characteristics during annealing, it is necessary to reduce the time for a given temperature anneal. Recent developments have led to the process known as Rapid Thermal Annealing (RTA). Regardless of the method of applying heat, this process utilizes a heating ramp (over about a minute) to 1050.degree. C., followed by a 5 to 30 second hold time and a rapid cooling time.
During this annealing, silicon and dopant atoms are moved to substitutional sites in the crystalline Si lattice and other defects are removed, but there is also an inadvertent and undesirable diffusion of dopant atoms deeper into the silicon material, driven by the concentration profile of the dopant. The motion of the atoms, both desirable and undesirable, has an associated activation energy, and the critical control elements are the time and temperature.
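The dependence on time and temperature can be sketched with an Arrhenius diffusivity, D = D0 * exp(-Ea / kT), and a characteristic diffusion length sqrt(D * t). The prefactor and activation energy used below (D0 = 0.76 cm.sup.2 /s, Ea = 3.46 eV) are illustrative textbook-order values for boron in silicon and are an assumption, not figures from the text. The optional `enhancement` multiplier is a hypothetical parameter standing in for the 10 to 1000 times damage-enhanced diffusion mentioned earlier.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K

# Illustrative values for boron in silicon (assumed, not from the text).
D0_CM2_PER_S = 0.76
EA_EV = 3.46


def diffusivity_cm2_per_s(temp_c, d0=D0_CM2_PER_S, ea_ev=EA_EV, enhancement=1.0):
    """Arrhenius diffusivity at temp_c, optionally scaled by an enhancement factor."""
    temp_k = temp_c + 273.15
    return enhancement * d0 * math.exp(-ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


def diffusion_length_nm(temp_c, time_s, enhancement=1.0):
    """Characteristic diffusion length sqrt(D * t), in nanometres."""
    d = diffusivity_cm2_per_s(temp_c, enhancement=enhancement)
    return math.sqrt(d * time_s) * 1e7  # cm -> nm


# Compare an RTA-like hold with a sub-second exposure at the same temperature.
for time_s in (10.0, 1.0, 0.1):
    print(f"1050 C, {time_s:5.1f} s: {diffusion_length_nm(1050.0, time_s):.2f} nm")
```

Because the length grows only as the square root of time but exponentially with temperature, shortening the anneal at a fixed temperature reduces dopant motion more gently than lowering the temperature would.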
There are two known examples of using microwave energy for wafer annealing: European patent application PCT/SE94/00190 to Buchta et al., "Cold Wall Reactor For Heating of Silicon Wafers by Microwave energy", and a paper, "Rapid Thermal Processing with Microwave heating", by S. L. Zhang, R. Buchta and D. Sigurd, Thin Solid Films, V246, p151-157 (1994). These describe the use of a microwave generator operating at about 2.45 GHz with a power level of about 1500 W to heat wafers (5" wafers) to about 1050.degree. C. over a period of about 30 seconds or more. Dopant profiles after microwave anneal were not discussed in these publications, although they are expected to be consistent with other rapid thermal anneal (RTA) processing using technology known to the art, since the time-temperature parameters are identical to those of other RTA processing techniques. There is no teaching in the paper or patent of the formation of dopant concentrations or profiles as taught herein. According to the present invention, the substrate is exposed to microwave radiation for less than 1 second.
In another example, very intense microwave radiation was used in a pulse mode, in which a wafer was exposed to a series of pulses of microwave energy with a total power of up to 8 MW. The power density at the wafer surface was about 240 kW/cm.sup.2 and the microwave pulse duration was around 1.1 .mu.s. For the annealing process, up to 2000 pulses were used for a single wafer. Reported results indicate that there was substantial surface damage and melting of the wafer surface. Additionally, the dopant moved a substantial distance into the silicon, up to 300 .mu.m (microns). In contrast, the method of the present invention produces junction depths of less than 50 nm.
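The pulse figures quoted above imply an energy dose per unit area that can be worked out directly. The sketch below assumes rectangular pulses delivered at the stated power density for the full pulse duration, which is a simplification. The function names are hypothetical.

```python
POWER_DENSITY_W_PER_CM2 = 240e3  # about 240 kW/cm^2 at the wafer surface
PULSE_DURATION_S = 1.1e-6        # about 1.1 microseconds per pulse
MAX_PULSES = 2000                # up to 2000 pulses per wafer


def fluence_per_pulse_j_per_cm2(power_density_w_per_cm2, pulse_duration_s):
    """Energy per unit area in one rectangular pulse."""
    return power_density_w_per_cm2 * pulse_duration_s


def total_fluence_j_per_cm2(power_density_w_per_cm2, pulse_duration_s, pulses):
    """Cumulative energy per unit area over a train of identical pulses."""
    return fluence_per_pulse_j_per_cm2(power_density_w_per_cm2, pulse_duration_s) * pulses


per_pulse = fluence_per_pulse_j_per_cm2(POWER_DENSITY_W_PER_CM2, PULSE_DURATION_S)
total = total_fluence_j_per_cm2(POWER_DENSITY_W_PER_CM2, PULSE_DURATION_S, MAX_PULSES)
print(f"per pulse: {per_pulse:.3f} J/cm^2, over {MAX_PULSES} pulses: {total:.0f} J/cm^2")
```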