Thin film deposition techniques are widely used in the manufacturing of microfeatures to form a coating on a workpiece that closely conforms to the surface topography. In the context of microelectronic components, for example, the size of the individual components in the devices on a wafer is constantly decreasing, and the number of layers in the devices is increasing. As a result, the density of components and the aspect ratios of depressions (i.e., the ratio of the depth to the size of the opening) are increasing. The size of such wafers is also increasing to provide more real estate for forming more dies (i.e., chips) on a single wafer. Many fabricators are currently transitioning from 200 mm to 300 mm workpieces, and even larger workpieces will likely be used in the future. Thin film deposition techniques accordingly strive to produce highly uniform conformal layers that cover the sidewalls, bottoms, and corners in deep depressions that have very small openings.
One widely used thin film deposition technique is chemical vapor deposition (CVD). In a CVD system, one or more precursors that are capable of reacting to form a solid thin film are mixed in a gas or vapor state, and then the precursor mixture is presented to the surface of the workpiece. The surface of the workpiece catalyzes the reaction between the precursors to form a solid thin film at the workpiece surface. A common way to catalyze the reaction at the surface of the workpiece is to heat the workpiece to a temperature that causes the reaction.
Although CVD techniques are useful in many applications, they also have several drawbacks. For example, if the precursors are not highly reactive, then a high workpiece temperature is needed to achieve a reasonable deposition rate. Such high temperatures are not typically desirable because heating the workpiece can be detrimental to the structures and other materials already formed on the workpiece. Implanted or doped materials, for example, can migrate within silicon workpieces at higher temperatures. On the other hand, if more reactive precursors are used so that the workpiece temperature can be lower, then reactions may occur prematurely in the gas phase before reaching the intended surface of the workpiece. This is undesirable because the film quality and uniformity may suffer, and also because it limits the types of precursors that can be used.
Atomic layer deposition (ALD) is another thin film deposition technique. FIGS. 1A and 1B schematically illustrate the basic operation of ALD processes. Referring to FIG. 1A, a layer of gas molecules A coats the surface of a workpiece W. The layer of A molecules is formed by exposing the workpiece W to a precursor gas containing A molecules, and then purging the chamber with a purge gas to remove excess A molecules. This process can form a monolayer of A molecules on the surface of the workpiece W because the A molecules at the surface are held in place during the purge cycle by physical adsorption forces at moderate temperatures or chemisorption forces at higher temperatures. Referring to FIG. 1B, the layer of A molecules is then exposed to another precursor gas containing B molecules. The A molecules react with the B molecules to form an extremely thin layer of solid material C on the workpiece W. The chamber is then purged again with a purge gas to remove excess B molecules.
FIG. 2 illustrates the stages of one cycle for forming a thin solid layer using ALD techniques. A typical cycle includes (a) exposing the workpiece to the first precursor A, (b) purging excess A molecules, (c) exposing the workpiece to the second precursor B, and then (d) purging excess B molecules. The purge process typically comprises introducing a purge gas, which is substantially non-reactive with either precursor, and exhausting the purge gas and excess precursor from the reaction chamber in a pumping step. In actual processing, several cycles are repeated to build a thin film having the desired thickness on a workpiece. For example, each cycle may form a layer having a thickness of approximately 0.5–1.0 Å, and thus it takes approximately 60–120 cycles to form a solid layer having a thickness of approximately 60 Å.
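The cycle-count arithmetic in the preceding example can be checked with a short sketch. The function name `cycles_needed` and its parameters are hypothetical and used only for illustration; the sketch assumes that every cycle deposits the same thickness, which is a simplification of real ALD growth.

```python
import math


def cycles_needed(target_thickness_angstrom, growth_per_cycle_angstrom):
    """Return the number of A-purge-B-purge cycles needed to reach a target thickness.

    Assumes (for illustration only) that every cycle adds the same thickness.
    """
    if growth_per_cycle_angstrom <= 0:
        raise ValueError("growth per cycle must be positive")
    ratio = target_thickness_angstrom / growth_per_cycle_angstrom
    # Subtract a tiny tolerance so floating-point noise does not add an extra cycle.
    return math.ceil(ratio - 1e-9)


# Growth rates taken from the example in the text: 0.5 Å and 1.0 Å per cycle.
for growth in (1.0, 0.5):
    print(f"{growth} Å/cycle -> {cycles_needed(60.0, growth)} cycles for a 60 Å layer")
```

With the growth rates quoted above, the sketch gives 60 cycles at 1.0 Å per cycle and 120 cycles at 0.5 Å per cycle, matching the stated range.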
One drawback of ALD processing is that it has a relatively low throughput compared to CVD techniques. For example, ALD processing typically takes several seconds to perform each A-purge-B-purge cycle. This results in a total process time of several minutes to form a single thin layer of only 60 Å. In contrast, CVD techniques require only about one minute to form a 60 Å thick layer. In single-wafer processing chambers, ALD processes can be 500%–2000% longer than corresponding single-wafer CVD processes. The low throughput of existing single-wafer ALD techniques limits the utility of the technology in its current state because ALD may be a bottleneck in the overall manufacturing process.
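The throughput comparison can be illustrated with a rough calculation. In the sketch below, the 6-second cycle time is an assumed value chosen to stand in for "several seconds"; it does not come from the text. The 60 to 120 cycle counts and the one-minute CVD time are the figures given above. The function names are hypothetical.

```python
def ald_process_time_s(cycles, seconds_per_cycle):
    """Total ALD time in seconds for a given number of A-purge-B-purge cycles."""
    return cycles * seconds_per_cycle


def percent_longer(ald_time_s, cvd_time_s):
    """How much longer ALD takes than CVD, as a percentage of the CVD time."""
    return (ald_time_s - cvd_time_s) / cvd_time_s * 100.0


ASSUMED_SECONDS_PER_CYCLE = 6.0  # assumption: stands in for "several seconds"
CVD_TIME_S = 60.0                # about one minute for a 60 Å layer, per the text

for cycles in (60, 120):         # cycle counts for a 60 Å layer, per the text
    ald_time = ald_process_time_s(cycles, ASSUMED_SECONDS_PER_CYCLE)
    print(
        f"{cycles} cycles: {ald_time / 60.0:.0f} min ALD, "
        f"{percent_longer(ald_time, CVD_TIME_S):.0f}% longer than CVD"
    )
```

Under this assumed cycle time, the sketch yields ALD times of 6 and 12 minutes, or 500% and 1100% longer than the one-minute CVD process. A different cycle time would shift these figures proportionally.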
One promising solution to increase the throughput of ALD processing is processing a plurality of wafers (e.g., 20–250) simultaneously in a batch process. As suggested in International Publication No. WO 02/095807, the entirety of which is incorporated herein by reference, such batch processes typically stack the plurality of wafers in a wafer holder that is positioned in an enclosure of a processing system. To increase the number of wafers that can be treated at one time and concomitantly increase the throughput of the system, the wafers are typically held in a relatively close spaced-apart relationship. Unfortunately, this close spacing between adjacent wafers hinders the flow of gas adjacent the surface of the wafer, particularly adjacent the center of each wafer.
In conventional single-wafer ALD systems, a gas “showerhead” is typically spaced in relatively close, parallel proximity with substantially the entirety of the wafer surface. This facilitates thorough, effective purging of the excess precursors A and B. In a batch ALD system, however, gas is typically introduced to flow longitudinally alongside the wafer holder. As a consequence, gas exchange between the wafers takes place, in large part, by gas diffusion rather than by a significant flow of gas across the wafer surface. To enhance the removal of excess precursor between the wafers, conventional batch ALD processing typically involves introducing a significant quantity of a purge gas to dilute the remaining precursor, then drawing a vacuum on the enclosure to remove the diluted gas. Unfortunately, this addition of excess purge gas and subsequent pump-down can take a relatively long period of time, further reducing the throughput of the batch ALD processing system.