This invention relates to the field of atomic layer deposition (“ALD”), and more particularly to systems and methods for performing ALD with high throughput and low cost.
Thin film deposition is commonly practiced in the fabrication of semiconductor devices and many other useful devices. Well-known techniques of chemical vapor deposition (“CVD”) utilize chemically reactive molecules that react in a reaction chamber to deposit a desired film on a substrate. Molecular precursors useful for CVD applications comprise elemental (atomic) constituents of the film to be deposited and typically additional elements. CVD precursors are volatile molecules that can be practically delivered, in the gas phase, to react at the substrate.
Conventional CVD is practiced in the art by a variety of techniques. Desired thin film properties and cost-effective operational parameters influence the choice of equipment, precursor composition, pressure range, temperature, and other variables. Many different apparatuses and methods have been successfully implemented. Common to most CVD techniques is the application of a well-controlled flux of one or more molecular precursors into the CVD reactor. A substrate is kept at a well-controlled temperature under well-controlled pressure conditions to promote chemical reaction between the molecular precursors concurrent with efficient desorption of byproducts. The chemical reaction is allowed to proceed to deposit the desired thin film with a desired film thickness.
Optimum CVD performance directly correlates with the ability to achieve and sustain steady-state conditions of flux, temperature, and pressure throughout the process, in which unavoidable transients are suppressed or minimized. CVD has provided uniform and conformal coatings with reproducible thickness and exceptional quality.
Nevertheless, as device density increases and device geometry becomes more complicated in integrated circuit devices, the need for thinner films with superior conformal coating properties has approached the limits of conventional CVD techniques and new techniques are needed. An emerging variant of CVD, atomic layer deposition (“ALD”), offers superior thickness control and conformality for advanced thin film deposition.
ALD is practiced by dividing conventional thin-film deposition processes into single atomic-layer deposition steps that are self-terminating and deposit precisely one atomic layer when conducted up to or beyond self-termination exposure times. An atomic layer typically equals about 0.1 molecular monolayer to 0.5 molecular monolayer. The deposition of an atomic layer is the outcome of a chemical reaction between a reactive molecular precursor and the substrate. In each separate ALD reaction-deposition step, the net reaction deposits the desired atomic layer and eliminates the “extra” atoms originally included in the molecular precursor.
In ALD applications, typically two molecular precursors are introduced into the ALD reactor in separate stages. For example, a metal precursor molecule, MLx, comprises a metal element, M (e.g., M = Al, W, Ta, or Si), that is bonded to atomic or molecular ligands, L. The metal precursor reacts with the substrate. This ALD reaction occurs only if the substrate surface is prepared to react directly with the molecular precursor. For example, the substrate surface typically is prepared to include hydrogen-containing ligands, AH, that are reactive with the metal precursor. The gaseous precursor molecule effectively reacts with all the ligands on the substrate surface, resulting in deposition of an atomic layer of the metal: substrate-AH + MLx → substrate-AML(x-1) + HL, where HL is a reaction by-product. During the reaction, the initial surface ligands, AH, are consumed, and the surface becomes covered with L ligands, which cannot further react with the metal precursor MLx. Therefore, the reaction self-terminates when all the initial AH ligands on the surface are replaced with AML(x-1) species.
The reaction stage is typically followed by an inert-gas purge stage that eliminates the metal precursor from the chamber prior to the separate introduction of the other precursor.
A second molecular precursor is then used to restore the surface reactivity of the substrate towards the metal precursor. This is done, for example, by removing the L ligands and redepositing AH ligands. In this case, the second precursor typically comprises the desired (usually nonmetallic) element A (e.g., O, N, S) and hydrogen (e.g., H2O, NH3, H2S). The reaction substrate-ML + AHy → substrate-M-AH + HL (for the sake of simplicity, the chemical reactions here are not balanced) converts the surface back to being AH-covered. The desired additional element, A, is incorporated into the film, and the undesired ligands, L, are eliminated as volatile by-product. Once again, the reaction consumes the reactive sites (this time, the L-terminated sites) and self-terminates when the reactive sites on the substrate are entirely depleted. The second molecular precursor is then removed from the deposition chamber by flowing inert purge gas in a second purge stage.
This sequence of surface reactions and precursor removal, which restores the substrate surface to its initial reactive state, is a typical ALD deposition cycle. Restoration of the substrate to its initial condition is a key aspect of ALD. It implies that films can be laid down in equal metered sequences that are all identical in chemical kinetics, deposition per cycle, composition, and thickness. Self-saturating surface reactions make ALD insensitive to transport nonuniformity. This transport nonuniformity may arise either from the engineering and limitations of the flow system or from surface topography (e.g., deposition into three-dimensional, high-aspect-ratio structures). Nonuniform flux of chemicals can only result in different completion times at different areas. However, if each of the reactions is allowed to complete on the entire substrate surface, the different completion kinetics bear no penalty. This is because the areas that are first to complete the reaction self-terminate, while the rest of the surface is able to complete the reaction, self-terminate, and essentially catch up.
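The catch-up behavior described above can be illustrated numerically. The following is a minimal sketch, assuming first-order consumption of reactive sites, so that coverage = 1 - exp(-k * flux * t). That kinetic model, the rate constant, the location names, and the flux values are illustrative assumptions and are not taken from the text.

```python
import math

# Assumed model (not from the source): reactive sites are consumed at a rate
# proportional to the local flux and to the fraction of sites still unreacted.
def coverage(flux, exposure, rate_constant=1.0):
    """Fraction of the initial reactive sites consumed after `exposure` time units."""
    return 1.0 - math.exp(-rate_constant * flux * exposure)

def time_to_reach(flux, target=0.99, rate_constant=1.0):
    """Exposure needed for the coverage to reach `target` (0 < target < 1)."""
    return -math.log(1.0 - target) / (rate_constant * flux)

# Hypothetical relative fluxes (arbitrary units) at three substrate locations.
fluxes = {"center": 10.0, "edge": 4.0, "trench bottom": 1.0}

# The reaction stage must last as long as the lowest-flux area requires.
stage_time = max(time_to_reach(f) for f in fluxes.values())

# Compare a short exposure with the full stage time.
for exposure in (0.1 * stage_time, stage_time):
    row = ", ".join(
        f"{name}: {coverage(f, exposure):.3f}" for name, f in fluxes.items()
    )
    print(f"exposure {exposure:.2f} -> {row}")
```

At the short exposure the three locations show clearly different coverages. At the full stage time every location is saturated to within the chosen target, so under this assumed model the flux nonuniformity costs time, not film uniformity.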
Efficient practice of ALD requires an apparatus capable of switching the flux of chemicals from MLx to AHy abruptly and quickly. Furthermore, the apparatus must be able to carry out this sequencing efficiently and reliably for many cycles to facilitate cost-effective coating of many substrates. Typically, an ALD process deposits about 0.1 nm of film per ALD cycle. A useful and economically feasible cycle time must accommodate a film thickness in a range from about 3 nm to about 30 nm for most semiconductor applications, and even thicker films for other applications. Industry throughput standards dictate that substrates be processed in 2 minutes to 3 minutes, which means that ALD cycle times must be in a range from about 0.6 seconds to about 6 seconds. Multiple technical challenges have so far prevented cost-effective implementation of ALD systems and methods for manufacturing of semiconductor devices and other devices.
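The cycle-time range quoted above follows from simple arithmetic, which the short sketch below reproduces. The function names are hypothetical; the numerical inputs (0.1 nm per cycle, 3 nm to 30 nm, 2 minutes to 3 minutes) are the figures given in the text.

```python
def cycles_needed(thickness_nm, growth_per_cycle_nm=0.1):
    """Number of ALD cycles required to reach a given film thickness."""
    return thickness_nm / growth_per_cycle_nm

def max_cycle_time_s(thickness_nm, process_time_s, growth_per_cycle_nm=0.1):
    """Longest cycle time that still fits the film into the process time."""
    return process_time_s / cycles_needed(thickness_nm, growth_per_cycle_nm)

# Tabulate the cycle-time budget for the thickness and throughput bounds.
for process_time_s in (120.0, 180.0):      # 2 minutes and 3 minutes
    for thickness_nm in (3.0, 30.0):       # thinnest and thickest typical films
        budget = max_cycle_time_s(thickness_nm, process_time_s)
        print(f"{thickness_nm:4.0f} nm in {process_time_s:3.0f} s "
              f"-> at most {budget:.1f} s per cycle")
```

A 3 nm film needs about 30 cycles and a 30 nm film about 300 cycles, so the stated range of about 0.6 seconds to about 6 seconds corresponds to the 3-minute bound. Under the same arithmetic, the 2-minute bound would tighten the range to about 0.4 seconds to about 4 seconds.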
Generally, an ALD process requires alternating in sequence the flux of chemicals to the substrate. A representative ALD process, as discussed above, requires four different operational stages:
1. MLx reaction;
2. MLx purge;
3. AHy reaction; and
4. AHy purge.
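The four stages listed above can be sketched as a repeating timeline. This is a minimal illustration only: the `Stage` class, the function names, and the stage durations are hypothetical and do not describe any particular apparatus or controller.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

@dataclass(frozen=True)
class Stage:
    name: str          # stage label, as in the list above
    gas: str           # gas admitted to the reactor during the stage
    duration_s: float  # stage length in seconds (hypothetical values below)

def build_cycle(dose_mlx_s: float, purge_mlx_s: float,
                dose_ahy_s: float, purge_ahy_s: float) -> List[Stage]:
    """One ALD cycle: each precursor dose is followed by an inert-gas purge."""
    return [
        Stage("MLx reaction", "MLx", dose_mlx_s),
        Stage("MLx purge", "inert", purge_mlx_s),
        Stage("AHy reaction", "AHy", dose_ahy_s),
        Stage("AHy purge", "inert", purge_ahy_s),
    ]

def schedule(cycle: List[Stage], n_cycles: int) -> Iterator[Tuple[float, int, Stage]]:
    """Yield (start time, cycle index, stage) for n_cycles repetitions."""
    t = 0.0
    for index in range(n_cycles):
        for stage in cycle:
            yield t, index, stage
            t += stage.duration_s

# Hypothetical durations chosen to sum to a 0.6 s cycle.
cycle = build_cycle(dose_mlx_s=0.1, purge_mlx_s=0.2, dose_ahy_s=0.1, purge_ahy_s=0.2)
cycle_time_s = sum(stage.duration_s for stage in cycle)
timeline = list(schedule(cycle, n_cycles=2))

for start, index, stage in timeline:
    print(f"t={start:.1f} s  cycle {index}  {stage.name} ({stage.gas})")
```

Placing a purge after every dose keeps the two precursors from being admitted back to back, which mirrors the separation of stages described in the text.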
Given the need for short cycle times, chemical delivery systems suitable for use in ALD must be able to alternate incoming molecular precursor flows and purges with sub-second response times. Also, if significant flow nonuniformities exist, these can be overcome through the self-terminating nature of the chemical reactions by increasing the reaction-stage time to the time dictated by areas that are exposed to the smallest flux. Nevertheless, this necessarily degrades throughput since cycle times increase correspondingly.
In order to minimize the time that an ALD reaction needs to reach self-termination, at any given reaction temperature, the flux of chemicals into the ALD reactor must be maximized. In order to maximize the flux of chemicals into the ALD reactor, it is advantageous to introduce the molecular precursors into the ALD reactor with minimum dilution of inert gas and at high pressures. On the other hand, the need to achieve short cycle times requires the rapid removal of these molecular precursors from the ALD reactor. Rapid removal in turn dictates that gas residence time in the ALD reactor be minimized. Gas residence time, τ, is proportional to the volume of the reactor, V, and to the pressure, P, in the ALD reactor, and inversely proportional to the flow, Q: τ = VP/Q. Accordingly, lowering pressure (P) in the ALD reactor facilitates low gas residence times and increases the speed of removal (purge) of chemical precursor from the ALD reactor. In contrast, minimizing the ALD reaction time requires maximizing the flux of chemical precursors into the ALD reactor through the use of a high pressure within the ALD reactor. In addition, both gas residence time and chemical usage efficiency are inversely proportional to the flow. Thus, while lowering flow will increase efficiency, it will also increase gas residence time.
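The relation τ = VP/Q and the trade-off it implies can be made concrete with a few lines of code. In this sketch the flow Q is expressed as a throughput in Torr·L/s, and the sccm conversion assumes standard conditions of 760 Torr and 0 °C. The reactor volume, pressures, and flows are hypothetical values chosen only for illustration.

```python
# Conversion assumes 1 sccm = 1 cm^3/min of gas at 760 Torr and 0 degrees C.
TORR_L_PER_S_PER_SCCM = 760.0 / 1000.0 / 60.0

def residence_time_s(volume_l, pressure_torr, flow_torr_l_per_s):
    """Gas residence time tau = V * P / Q, in seconds."""
    if flow_torr_l_per_s <= 0.0:
        raise ValueError("flow must be positive")
    return volume_l * pressure_torr / flow_torr_l_per_s

# Hypothetical operating points for a 1-liter reactor.
volume_l = 1.0
cases = {
    "high pressure, low flow (favors dose)": (1.0, 100.0),     # Torr, sccm
    "low pressure, high flow (favors purge)": (0.1, 1000.0),
}

for label, (pressure_torr, flow_sccm) in cases.items():
    flow = flow_sccm * TORR_L_PER_S_PER_SCCM
    tau = residence_time_s(volume_l, pressure_torr, flow)
    print(f"{label}: tau = {tau:.4f} s")
```

Lowering the pressure or raising the flow shortens τ, which helps the purge stages, while the dose stages favor the opposite settings. This is the conflict the paragraph above describes.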
Existing ALD apparatuses have struggled with the trade-off between the need to shorten reaction times and improve chemical utilization efficiency, on the one hand, and the need to minimize purge-gas residence and chemical removal times, on the other. Certain ALD systems of the prior art contain chemical delivery manifolds using synchronized actuation of multiple valves. In such systems, satisfactory elimination of flow excursions is impossible because perfectly synchronized valve actuation is itself practically impossible. As a result, the inevitable flow excursions are notorious for generating backflow of gas that leads to adverse chemical mixing.
Thus, a need exists for an ALD apparatus that can achieve short reaction times and good chemical utilization efficiency, and that can minimize purge-gas residence and chemical removal times, while preventing backflow.
As a conventional ALD apparatus is utilized, “memory” effects tend to reduce the efficiency of the ALD reactor. Such memory effects are caused by the tendency of chemicals to adsorb on the walls of the ALD reactor and subsequently release from the walls on a time scale that is dictated by the adsorption energy and the temperature of the walls. This phenomenon tends to increase the residence time of trace amounts of chemicals in the ALD reactor. As a result, memory effects tend to increase the purge time required for removal of chemicals. Thus, a need exists for an ALD apparatus that minimizes memory effects.
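The dependence of wall release time on adsorption energy and wall temperature can be estimated with the Frenkel relation, τ = τ0·exp(E/(kB·T)). The text does not name this relation; it is used here as a common textbook approximation, and the attempt period τ0 of 1e-13 s as well as the energies and temperatures below are assumptions chosen for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def wall_residence_time_s(adsorption_energy_ev, wall_temp_k, attempt_period_s=1e-13):
    """Mean time a molecule stays adsorbed, per the Frenkel relation (assumed model)."""
    return attempt_period_s * math.exp(
        adsorption_energy_ev / (BOLTZMANN_EV_PER_K * wall_temp_k)
    )

# Hypothetical adsorption energies and wall temperatures.
for energy_ev in (0.6, 0.8):
    for temp_k in (300.0, 400.0):
        tau = wall_residence_time_s(energy_ev, temp_k)
        print(f"E = {energy_ev:.1f} eV, T = {temp_k:.0f} K -> tau = {tau:.3e} s")
```

Under this assumed relation, the release time grows exponentially with adsorption energy and falls steeply as the walls are heated, which is consistent with the statement that both quantities set the time scale of the memory effect.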
Films grow on all areas of conventional ALD apparatuses that are exposed to the chemicals. In particular, film growth occurs on exposed chamber walls, as well as on the substrate. Film growth on chamber walls degrades performance of the ALD apparatus to the extent that the growth of film produces an increased surface area on the walls of the ALD chamber. The propensity of films to grow on the chamber walls scales with the surface area of the chamber walls. Likewise, increased surface area further extends chamber memory effects. An increase in surface area may result from the growth of inferior porous film deposits. Film growth that results in porous deposits can extend chamber memory by entrapping chemical molecules inside the pores. Thus, it is essential to the functioning of an ALD apparatus that growth of films and deposits be kept to a minimum, and that any film growth that does occur be controlled to deposit high-quality films that effectively cover the walls without an increase of surface area or the growth of porosity. Thus, a further need exists for an ALD apparatus that minimizes film growth and provides for the control of any film growth that is allowed to occur.
A well-optimized ALD apparatus and method is designed to maintain adequately minimal coexistence of ALD precursors in the reaction space in which ALD deposition on a substrate occurs. In contrast, adverse coexistence of ALD precursors is practically inevitable in the system space downstream from the ALD reaction space, provided that throughput is not significantly compromised. The adverse coexistence could only be avoided by purging a substantially larger volume, thereby significantly sacrificing throughput of the ALD system. Typically, ALD precursors coexisting in a chamber space tend to produce inferior films. As a result, throughput-optimized ALD systems suffer from the tendency to grow inferior solid deposits in the space immediately downstream from the ALD space. Inferior film growth becomes increasingly worse because the inferior films present increased surface area, which enhances precursor coexistence, thereby aggravating the problem. Since some of the chemicals immediately downstream from the ALD space return to the ALD reaction space (e.g., by diffusion), ALD performance deteriorates. In addition, deposition of particles on the substrate results. Accordingly, conventional ALD systems operated at peak throughput are doomed to rapid buildup of contamination and rapid degradation of ALD performance.
Since throughput-optimized ALD systems are characterized by precursor-coexistence immediately downstream from the ALD reaction space, maintaining these systems at peak performance over long and cost-effective maintenance cycles dictates that the unavoidable downstream deposition of films be actively controlled for adequate quality and preferred location. Localized precursor abatement downstream from the ALD space would also substantially reduce wear of downstream components such as pumps, valves, and gauges.
Cold and hot traps have been extensively used to remove undesired constituents from downstream effluents, in the sub-atmospheric pressure range, and are well known to those who are skilled in the art. Other techniques have also been effective for this purpose, such as plasma abatement apparatuses and residence-time extending traps. Many of these abatement solutions are available in the commercial market as “turn-key” equipment that can be adapted for effective use on a variety of different systems. Typically, these abatement apparatuses implement sacrificial abatement surfaces for effectively trapping reactive constituents either permanently (e.g., by chemical reaction to deposit solid films) or temporarily. A majority of these traps can be adapted, in principle, into the downstream of ALD systems. However, considerations of safety and the need to seamlessly integrate abatement into an optimized ALD system considerably restrict the practical feasibility and cost effectiveness of most abatement techniques.
In principle, safety concerns prohibit chemical abatement of ALD precursors by a cold trap. Implementation of hot traps to facilitate reaction between the ALD precursors requires careful design and control of conditions to prevent growth of inferior films. Certain properties of typical ALD precursor combinations make the design of hot trap process conditions specific and difficult to control; an example is the combination of the precursors TMA and H2O, which is used to deposit Al2O3 ALD films. Since abatement under ALD conditions bears an unacceptable throughput penalty, coexistence of the reactants in the abatement space is a given. Accordingly, it is difficult to avoid growth of inferior Al(OH)3 deposits. Suppression of Al(OH)3 growth to promote growth of high-quality Al2O3 deposits requires that H2O levels be kept very low. This task is not trivial, since the low reactivity of H2O dictates dosage of excessive amounts in a high-throughput process. Temperature elevation is limited to below 350° C. to avoid TMA pyrolysis, which promotes growth of carbonized and rather inferior alumina deposits.
Likewise, close inspection of other ALD precursor systems reveals that typically AHy-type precursors must be excessively dosed, thereby creating problematic inferior deposits such as oxychlorides and amine salts. Accordingly, it is a typical observation that ALD precursor combinations can deposit exceptional-quality ALD films but, if allowed to react under CVD conditions, such as typical exhaust conditions where the concentration of AHy precursor is high, create inferior films. In general, the quality of the CVD deposits improves when the temperature is elevated and the concentration of AHy precursors is maintained at very low levels.
A generalized ALD abatement solution should be suitable for many different types of ALD processes. U.S. Patent Application Publication 2002/0187084 describes a method for removing substances in gases discharged from an ALD reaction process that involves directing excess reactant to sacrificial material maintained at substantially the same reaction conditions as at the substrate. However, if optimal ALD throughput is not to be compromised, conditions in the abatement space must, by definition, deviate from conditions in the ALD space. In particular, while the ALD space is optimized to grow high-quality ALD films, coexistence of ALD precursors in the abatement space could promote deposition of inferior films. The practical capacity of abatement surfaces dictates that either the abatement surface be made of a very high-porosity element or the abatement space have a very large volume. Either way, the resulting abatement space will tend to accumulate the non-solid-producing ALD precursors, since these precursors are always used in large excess in throughput-optimized ALD processes. For example, H2O precursor used in an ALD process to deposit Al2O3 from TMA and H2O could accumulate in the abatement space to a substantially high partial pressure, promoting deposition of inferior films. This potential accumulation of H2O would be aggravated if the deposition of inferior films became excessive, and diffusion of accumulated H2O back into the reaction space could lead to deteriorated ALD performance. Accordingly, hot traps, such as the one described in U.S. Patent Application Publication 2002/0187084, are not a good choice for ALD abatement unless means are provided to control accumulation of ALD precursors, typically the ones that must be used in excess. It is also essential for a generic abatement solution to provide generic means of abatement capable of generating quality film deposition under a variety of conditions.
In existing CVD, PECVD, and ALD systems, gas entrapment and gas-flow disturbances in a reaction chamber, and resulting gas-flow and gas-pressure nonuniformities at the substrate surface, commonly cause adverse nonuniformities in the thickness and other characteristics of the deposited thin film. In ALD, gas-flow and gas-pressure nonuniformities during chemical dosage do not necessarily cause film nonuniformities, provided that appropriately long dosage times are implemented. However, gas entrapment and gas-flow disturbances often severely and adversely impact the effectiveness of purge steps. For example, the “dead-leg” space associated with the wafer transport channel in the wall of a single-wafer processing chamber is a known problem in the art of wafer processing such as CVD, etch, ALD, and PVD. In particular, effective ALD purge of this space typically is impossible. The art of single-wafer deposition has produced a variety of effective remedies for this problem. For example, U.S. Pat. No. 5,558,717 issued Sep. 24, 1996 to Zhao et al. teaches the advantageous implementation of an annular flow orifice and an annular pumping channel. This annular design requires a relatively wide process-chamber design. In another example, U.S. Pat. No. 6,174,377 issued Jan. 16, 2001 to Doering et al. describes an ALD chamber designed for wafer loading at a low chuck position, while wafer processing is carried out at a high chuck position, leaving the wafer transport channel, and the flow disturbances associated with it, substantially below the wafer level. Neither these two prior art solutions nor other prior art solutions are well suited to resolve the problems associated with substrate transport mechanisms in ALD systems.
Thus, a need exists in chemical deposition processes, particularly in ALD technology, for an apparatus that provides uniform and symmetrical flux of chemicals to substrate surfaces, and provides smooth flow-path structures without dead-leg wafer loading cavities.