The present invention relates to thin film deposition at single atomic layer precision for the manufacturing of semiconductor devices. More particularly, this invention describes a variety of apparatus configurations to enable atomic layer chemical vapor deposition of thin films of various materials on the surface of a substrate.
The manufacturing of advanced integrated circuits (ICs) in the microelectronic industry is accomplished through numerous and repetitive steps of deposition, patterning and etching of thin films on the surface of a silicon wafer. An extremely complex, monolithic and three-dimensional structure with complex topography of a variety of thin film materials such as semiconductors, insulators and metals is generated in a typical IC fabrication process. The present trend in ICs, which is going to continue in the foreseeable future, is to increase the wafer size and decrease the critical device dimensions. As an example, the silicon wafer size has progressed in recent years from 150 mm to 200 mm and now to 300 mm, and the next wafer size of 400 mm is on the horizon. Simultaneously, the critical device dimension has decreased from 0.35 micron to 0.25 micron to 0.18 micron. Research and development for future device dimensions at 0.13 micron, and next at 0.10 micron, is being conducted by several leading IC manufacturers. Such steps are necessary to increase the device speed, sophistication, capability and yield. These trends in IC production technology have placed extremely stringent and divergent demands on the performance of semiconductor manufacturing equipment that deposits, patterns or etches progressively smaller device structures on the surface of a silicon wafer. This in turn translates into extremely precise control of the critical process parameters such as film thickness, morphology, conformal step coverage over complex topography, and uniformity over an increasingly large wafer surface area.
Various well-developed and established technologies for thin film deposition are being practiced in the IC industry at present, the most prominent being chemical vapor deposition (CVD) and physical vapor deposition (PVD). Both of these techniques, however, are flux dependent. This means that the number of gaseous species impinging per unit area of the wafer surface must be constant. In a conventional CVD process, the gas mixture is sprayed evenly from a large diameter showerhead, with hundreds of pinholes in it, facing directly opposite the wafer. With increasing wafer diameter, this entails an even larger showerhead with a larger number of pinholes, under the strict condition that each pinhole must receive an equal amount of gas at all times. An even worse situation is encountered when two or more gases that are spontaneously reactive towards each other are required to deposit a thin film. In such a case, the operation of a CVD reactor to deposit large area thin films becomes an extremely difficult task.
Moreover, temperature uniformity of the deposition surface plays an extremely crucial role in affecting the rate of film deposition, and this factor is more crucial in CVD than in PVD. In a practical example, the wafer temperature must be maintained within ±1 degree C. at 500 degrees C. This leads to complex and expensive heater designs and temperature control hardware, and ultimately to added cost and complexity. The average rate of film deposition in CVD mode can be tailored over a wide range. The rate of deposition may be as high as 1000 Å/min or as low as 100 Å/min. However, yet another fundamental shortcoming of CVD (and also of PVD), being a dynamic process, is an extremely low degree of film uniformity below a certain minimum thickness, typically below 200 Å (Angstrom). With complex device topography, this limitation is exacerbated and results in highly non-uniform film deposition.
The PVD process requires a high vacuum apparatus to throw vaporized material in cluster form towards the surface of the wafer. This leads to poor control of the film deposition rate, expensive apparatus, and limitations on the type of materials that can be deposited. Also, PVD, being a line-of-sight process, is much less amenable to conformal film deposition over a complex topography. Such fundamental attributes of these prevalent film deposition technologies place severe constraints on equipment performance and scale-up, and result in deficiencies in process control that are being felt increasingly as the rapid progress towards smaller device dimensions and larger wafer diameters continues.
A variant of CVD called rapid thermal CVD (RTCVD) has recently been employed to achieve precise control of the film deposition rate. In a typical RTCVD process, the wafer is rapidly heated to the desired reaction temperature, or cooled from it, by switching a large bank of high power lamps on and off. Simultaneously, the wafer is exposed to reactive gases. The optimum temperature thus achieved for the desired time duration acts like a reaction switch. Further process control can be achieved by simultaneously switching the gas flow towards the wafer. This technique, though rapidly emerging, has some serious drawbacks. First, rapid heating and cooling may lead to wafer warping, slip and undesirable film stress. Second, RTCVD is invariably susceptible to complexities arising from undesirable deposition on windows, the optical properties of chamber materials, and expensive and complex hardware for optics and radiation control. Also required is a chamber construction material that can withstand rapid and repeated thermal shocks under high vacuum.
Atomic layer chemical vapor deposition (ALCVD or merely ALD) is a simple variant of CVD. It was invented in Finland in the late 1970s to deposit thin and uniform films of compound semiconductors, such as zinc sulfide. There are several attributes of ALD that make it an extremely attractive and highly desirable technique for application in the microelectronic industry. ALD is a flux independent technique and it is based on the principle of self-limiting surface reaction. It is also relatively temperature insensitive. In a typical ALD sequence two highly reactive gases react to form a solid film and a gaseous reaction by-product is formed. It is carried out in discrete steps as follows.
FIG. 1 is a schematic of a conventional ALD process cycle with two inert gas pulses and two reactive gas pulses. First a reactive gas (A) is pulsed over the wafer 10. The gas molecules saturate the wafer 10 surface by chemically reacting with it to conform to the contours of the surface. This process is called chemisorption. Next an inert gas (P) pulse is sent over the surface that sweeps away the excess gas molecules that are loosely attached (physisorbed) to the surface, and thus a monolayer of highly reactive species is formed on the wafer 10 surface. Next the second reactive gas (B) is pulsed over the wafer 10 surface. This reacts rapidly with the monolayer of the first gas already adsorbed, and a desired film is formed with the elimination of the gaseous by-product. Again an inert gas pulse (P) is introduced that sweeps away the by-product and any excess of the second reactive gas. The wafer 10 surface is thus covered by a monolayer of the desired film (AB) that is as thin as a single atomic layer. The surface is left in a reactive state for the complete sequence to start over. The desired film thickness is built by repeating the complete reaction sequence described above a definite number of times.
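By way of illustration only, the four-pulse cycle described above, and the way film thickness is built up by repeating it, may be sketched in Python as follows. The names ALD_CYCLE, cycles_needed and pulse_sequence are hypothetical and do not denote any part of the described apparatus; the growth per cycle of 3 Å and the 100 Å target thickness are assumed example values.

```python
import math

# One ALD cycle as described in the text:
# reactive gas A, inert purge P, reactive gas B, inert purge P.
ALD_CYCLE = ("A", "P", "B", "P")


def cycles_needed(target_thickness_angstrom, growth_per_cycle_angstrom):
    """Smallest whole number of cycles that reaches the target thickness."""
    if growth_per_cycle_angstrom <= 0:
        raise ValueError("growth per cycle must be positive")
    return math.ceil(target_thickness_angstrom / growth_per_cycle_angstrom)


def pulse_sequence(n_cycles):
    """Flat list of gas pulses for n_cycles repetitions of the A-P-B-P cycle."""
    return list(ALD_CYCLE) * n_cycles


# Assumed example values: 100 Angstrom target at 3 Angstrom per cycle.
n = cycles_needed(100.0, 3.0)
pulses = pulse_sequence(n)
print(n, "cycles,", len(pulses), "gas pulses")
```

Because each cycle adds a fixed increment, the thickness is set by the cycle count alone; the sketch rounds up so that the target is met or slightly exceeded.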
There are numerous practical advantages that ALD can offer over state-of-the-art techniques such as CVD and RTCVD. Being a flux independent technique, ALD is transparent to the wafer size. This means that in an ALD reactor a 300 mm wafer can be coated as simply and as precisely as a 150 mm wafer. ALD also considerably simplifies the reactor design. Being a chemically driven process, it is also much less temperature sensitive. ALD usually offers a temperature window that can be as wide as 10-15 degrees C., as opposed to the precise, single numerical value required in CVD. This simplifies the heater design and controls. Due to the surface saturation reaction mechanism of ALD, gas dynamics plays a relatively minor role in the operation of an ALD reactor. All such factors ensure not only tremendous simplification in the design and operation of equipment but also its scalability without much effort. With respect to process parameters, ALD offers an unprecedented level of process control. The film thickness is controlled in a digital fashion at a single atomic level, e.g. ~3 Å/cycle. Also, the ALD process, being surface reaction controlled, offers complete and ideal step coverage over the complex topography of devices all over the wafer. High and spontaneous reactivity of two precursor gases brings extreme complications to the design and operation of a CVD reactor and adversely affects the film uniformity. In an ALD process, high and spontaneous reactivity of precursors is in fact highly desirable and is exploited to advantage. Furthermore, in an ALD sequence, the reaction is carried to completion. This ensures complete removal of undesirable reaction by-products from the film. The completion of reaction thus leads to films that are purer and contain far fewer defects than their CVD counterparts.
The rate of deposition in ALD is almost fixed and is solely dependent upon the speed of completion of a single ALD sequence. For ALD to become acceptable to the microelectronic industry, it must offer competitive throughput. Hence, it is imperative to complete one ALD sequence comprising four gas pulses in as short a time as possible, in practical terms ~1 second. This places an upper limit on the film deposition rate of approximately 100-200 Å/min, but with a precision of ~3 Å. With continuously decreasing device dimensions, such features make ALD highly desirable and applicable for several future device generations and for several future larger wafer diameters. An excellent description of the fundamentals and applications of ALD and the progress it has made so far is offered in a review article written by Tuomo Suntola in the Handbook of Crystal Growth, vol. 3, Thin Films and Epitaxy, Part B, (D. T. J. Hurle, editor), published by Elsevier Science B. V. in 1994.
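The arithmetic behind the stated upper limit may be illustrated by the following Python sketch, which assumes that cycles run back to back with no idle time between them. The function name deposition_rate_per_min is hypothetical; the growth per cycle of 3 Å and the cycle time of 1 second are the approximate figures given in the text.

```python
def deposition_rate_per_min(growth_per_cycle_angstrom, cycle_time_s):
    """Film growth in Angstrom per minute for back-to-back ALD cycles."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    cycles_per_min = 60.0 / cycle_time_s
    return growth_per_cycle_angstrom * cycles_per_min


# About 3 Angstrom per cycle at about 1 second per cycle.
print(deposition_rate_per_min(3.0, 1.0))  # 180.0

# Doubling the cycle time halves the rate.
print(deposition_rate_per_min(3.0, 2.0))  # 90.0
```

With these figures the rate is 180 Å/min, which falls within the approximately 100-200 Å/min range stated above.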
Although, in principle, the technique of ALD offers a variety of advantages over industry-prevalent techniques such as CVD and PVD, it has not been commercialized so far. A currently available ALD system that is capable of depositing thin films on 50 mm × 50 mm square substrates is mostly being used for early process development. As described above, ALD is a slower process than CVD or RTCVD, with a rate of deposition almost 10 times slower than that of the latter. To overcome this disadvantage, an ALD batch processor system has been developed. In a batch process multiple substrates are coated simultaneously to increase the throughput. However, compared to a single wafer processor, batch processors have a few serious disadvantages such as inadequate process control, poor repeatability within the batch and from batch to batch, backside deposition on the wafer, and cross contamination, to name a few. Also, both of these ALD systems are based on the principle of a transverse gas flow configuration above and across the heated substrate, in which a finite amount of reactive and/or inert gas is pulsed sequentially, as shown in FIG. 2.
FIG. 2 shows a compact ALD reactor 12 with a transverse flow configuration in which the wafer 10 lies stationary within a narrow gap in the reactor and gases A, P, and B are pulsed in from one side of the reactor. This type of reactor design has some inherent and serious drawbacks. One drawback is that increasing substrate size requires increasingly longer gas pulse intervals, referred to as pulse widths, because the gas has to traverse the full length (or width) of the substrate before the next pulse can be introduced. This increases the cycle time and further adversely affects throughput. It must be reiterated here that ALD is basically a slower process. Also, such a reactor 12 configuration is inherently susceptible to adverse downstream mixing of reactive gases due to flow instabilities imposed by thermal convection. Moreover, in the transverse gas flow configuration, if the pulse width is shortened the reactive gas can be depleted downstream, leaving the trailing end of the substrate surface without any coating and thus seriously and adversely affecting the ALD process.
Thus, a compact, modular, single-wafer atomic layer deposition chamber that is capable of executing an ALD reaction sequence as fast as possible is highly desirable. The gas residence time t (or pulse width) in an ALD reactor is given by the equation:
t = L/v    (1)
Here, v is the gas velocity and L is the path length of the gas in the ALD reactor, which is closely correlated to the substrate dimension. This relationship calls for the shortest possible path length for gas flow. For efficient operation of the ALD reactor the gas residence time above the substrate must be as small as possible. However, the reactive gas during the pulse must completely and uniformly cover a substrate of any suitably large dimension.
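Equation (1) may be illustrated numerically by the following Python sketch. It assumes, for illustration only, that the path length L equals the wafer diameter, as in a transverse flow reactor, and that the gas velocity is 1 m/s; that velocity is a hypothetical figure and is not taken from this description. The function names are likewise hypothetical. The estimate of a minimum cycle time as four residence times further assumes four pulses of equal width with no gaps between them.

```python
def residence_time_s(path_length_m, gas_velocity_m_per_s):
    """Gas residence time t = L / v, per equation (1)."""
    if gas_velocity_m_per_s <= 0:
        raise ValueError("gas velocity must be positive")
    return path_length_m / gas_velocity_m_per_s


def min_cycle_time_s(path_length_m, gas_velocity_m_per_s, pulses_per_cycle=4):
    """Lower bound on cycle time, assuming equal-width pulses with no gaps."""
    return pulses_per_cycle * residence_time_s(path_length_m, gas_velocity_m_per_s)


ASSUMED_VELOCITY_M_PER_S = 1.0  # hypothetical value, for illustration only

for diameter_mm in (150, 200, 300):
    path_length_m = diameter_mm / 1000.0  # assumption: L equals wafer diameter
    t = residence_time_s(path_length_m, ASSUMED_VELOCITY_M_PER_S)
    cycle = min_cycle_time_s(path_length_m, ASSUMED_VELOCITY_M_PER_S)
    print(f"{diameter_mm} mm wafer: t = {t:.2f} s, minimum cycle = {cycle:.2f} s")
```

Under these assumptions the residence time grows in direct proportion to the path length, which is why a shorter gas path permits a shorter pulse width for a given gas velocity.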
A conventional CVD reactor configuration is of the parallel plate type. The reactive gases or vapors are uniformly injected, through hundreds of small holes in a plate called a showerhead, perpendicularly onto a heated substrate surface that is directly opposite to it. Manifold plates behind the showerhead achieve the difficult task of equally distributing the reactive gas mixture to each of the hundreds of holes. However, this invariably increases the gas path length tremendously. Thus a CVD reactor may be used to perform an ALD task in principle; however, in practice it is highly inefficient and thus unsuitable.
It is quite clear from the foregoing description of the advantages and the state of the art of ALD reactor design that, to become rapidly and successfully adaptable to the microelectronic industry, a unique and novel ALD reactor design must be introduced. Such a novel ALD reactor design must have all the following attributes:
(a) Stable fluid flow above the substrate and within the reactor,
(b) No depletion of reactive gas or vapor over the substrate surface,
(c) Shortest path length with rapid gas pulsing to enable rapid completion of an ALD cycle,
(d) Smallest internal volume for rapid gas exchange,
(e) Reactor configuration that can be maintained and components and hardware serviced without much difficulty to reduce the shut-down time,
(f) Reliability, compactness and conservative tool foot-print, and
(g) Reproducible and repeatable processing.
What is clearly required is a configuration or configurations of an atomic layer deposition chamber that are unique and innovative and that develop a stable fluid flow over the substrate with a minimum path length to cover the complete substrate surface uniformly. The minimum path length, coupled with stability of fluid flow, offers the shortest pulse width and satisfies the throughput requirement with a high degree of reproducibility.
The present invention provides an atomic layer deposition (ALD) reactor that includes a substantially cylindrical chamber and a substrate mounted within the chamber. The ALD reactor further includes at least one injection tube mounted within the chamber having a plurality of apertures along one side that direct gas emanating from the apertures towards the substrate. While gas is pulsed from the injection tube, either the substrate or the injection tube is continuously rotated in a longitudinal plane within the chamber to ensure complete and uniform coverage of the substrate by the gas.
In a preferred embodiment, the ALD reactor covers a wafer substrate with a gas deposition sequence comprising a first reactive gas (A), an inert gas (P), a second reactive gas (B), and the inert gas (P). In one embodiment of the ALD reactor, the wafer substrate is rotated in a horizontal plane in relation to the injection tube. In a second embodiment, the wafer substrate is stationary within the chamber and the injection tube is rotated in relation to the wafer substrate. In another embodiment, the ALD reactor includes three injection tubes mounted within the chamber in parallel, wherein the first injection tube dispenses gas (A), the second injection tube dispenses gas (P), and the third injection tube dispenses gas (B). In yet other embodiments, the at least one injection tube may be configured in a cross injector tube configuration, a radial gas injector configuration, as stacked circumferential O-rings, or as stacked longitudinal injectors.
Accordingly, the present invention improves the efficiency of an atomic layer chemical vapor deposition apparatus. A combination of relative motion of the substrate with one of the various gas injection configurations achieves complete wafer surface coverage without gas depletion in the shortest possible time frame. The gas injection configurations are highly suitable to realize large area, uniform and highly conformal atomic layer deposition with precise process control.