1. Field of the Invention
The present invention relates to the production of thin films by a gas phase deposition method comprising the steps of pulsing the precursors into a reaction chamber containing a substrate, onto which the thin film is grown. In particular, the invention concerns a method for producing thin films by an ALD (Atomic Layer Deposition) type method.
2. Description of the Related Art
Thin film manufacturing is traditionally divided into Chemical Vapor Deposition (CVD) and Physical Vapor Deposition (PVD) techniques. A relatively fast film growth is achieved at low substrate temperatures by Molecular Beam Epitaxy (MBE) or sputtering methods, which belong to the PVD group. In these methods, however, the quality of the film and the controllability of its thickness are insufficient for many present-day microelectronic applications. A method known as Pulsed Laser Deposition (abbreviated PLD) provides better thickness control at the expense of film growth speed, but several layers are still deposited at once in PLD.
In CVD, chemical reactions taking place on the surface of the substrate are utilized to grow a more uniform film. CVD methods require higher substrate temperatures than the PVD methods, but generally offer superior film quality. In PVD and in conventional CVD, including low-pressure, metal-organic and plasma-enhanced CVD methods, besides other process variables, thin-film growth rate is influenced by the concentrations of the starting material inflows. To achieve a uniform thickness of the layers deposited by these methods, the concentrations and reactivities of the starting materials must hence be carefully kept constant over the whole substrate area.
Atomic Layer Deposition (ALD), formerly known as Atomic Layer Epitaxy (ALE), is a promising CVD-derived method for growing highly uniform thin films onto a substrate. The substrate is placed into a high-vacuum reaction space free of impurities, and at least two different volatile precursors are injected in vapor phase alternately and repetitively into the reaction space. The film growth is based on surface reactions that take place on the surface of the substrate to form a solid-state layer of atoms or molecules, because the precursors and the temperature of the substrate are chosen such that the molecules of the alternately injected vapor-phase precursors react only with the surface layer of the substrate. The precursors are injected in sufficiently high doses for the surface to be fully saturated during each injection cycle. Therefore, the process is highly self-regulating and does not depend on the concentration of the starting materials, whereby it is possible to achieve extremely high film uniformity and a thickness accuracy of a single atomic or molecular layer.
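The self-regulating behaviour described above can be illustrated with a minimal sketch. It assumes a simple first-order saturation model for surface coverage, which is an illustrative assumption and not a model taken from this disclosure; the function names, the dose scale and the growth per cycle of 0.1 nm are likewise hypothetical.

```python
import math


def coverage_after_pulse(dose, dose_scale=1.0):
    """Fraction of surface sites occupied after one precursor pulse.

    Assumed first-order saturation: coverage approaches 1 as the dose
    grows. Both dose and dose_scale are in arbitrary, hypothetical units.
    """
    return 1.0 - math.exp(-dose / dose_scale)


def film_thickness_nm(n_cycles, dose, growth_per_cycle_nm=0.1, dose_scale=1.0):
    """Film thickness after n_cycles, with a hypothetical growth per cycle."""
    return n_cycles * growth_per_cycle_nm * coverage_after_pulse(dose, dose_scale)


if __name__ == "__main__":
    # Once the dose is well above the saturation scale, thickness no longer
    # depends on the dose, only on the number of cycles.
    for dose in (0.5, 2.0, 10.0, 20.0):
        print(f"dose={dose:5.1f}  thickness={film_thickness_nm(100, dose):.4f} nm")
```

In this sketch, doubling an already saturating dose leaves the thickness practically unchanged, whereas an undersaturating dose gives a thinner, dose-dependent film.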
The principles of ALD type processes have been presented by the pioneer of the ALD technology, Dr. T. Suntola, e.g. in Handbook of Crystal Growth 3, Thin Films and Epitaxy, Part B: Growth Mechanisms and Dynamics, Chapter 14, “Atomic Layer Epitaxy”, pp. 601-663, Elsevier Science B.V. 1994, the disclosure of which is incorporated herein by reference. The ALD method is described in more detail in FI Patents Nos. 52,359 and 57,975 and in U.S. Pat. Nos. 4,058,430 and 4,389,973, in which some apparatus embodiments suited to implement this method are also disclosed. The contents of these documents are herewith incorporated by reference. Various ALD reactor constructions for growing thin films can also be found in the following publications: Material Science Reports 4(7) (1989), p. 261, and Tyhjiötekniikka (Finnish publication for vacuum techniques), ISBN 951-794-422-5, pp. 253-261, the contents of which are herewith incorporated by reference.
The ALD method can be used for growing both elemental and compound thin films. Of the elemental films, the most common ones are the silicon films, which are widely used in the microelectronic component industry. Typical compound films are, for example, ZnS, Al2O3 and SiO2 films needed in a variety of electronic applications. New precursors are continuously being developed for enabling manufacture of more complex and specialized films.
Growing a film using the ALD method is a slow process due to its step-wise (layer-by-layer) nature. At least two precursor pulses are needed to form one layer of the desired compound, and the pulses have to be kept separated from each other to prevent uncontrolled growth of the film and contamination of the ALD reactor. After each pulse, the gaseous reaction products of the thin-film growth process as well as the excess reactants in vapor phase have to be removed from the reaction space. This can be achieved either by pumping down the reaction space or by purging the reaction space with an inactive gas flow between successive pulses. In the latter method, a column of an inactive gas is formed in the conduits between the precursor pulses. The latter method is more widely employed on production scale because of its efficiency and its capability of forming an effective diffusion barrier between the successive pulses. Typically, the inert purging gas is also used as a carrier gas during precursor pulses, diluting the precursor gas before it is fed into the reaction space. Some ALD reactors employing purging with inert gas are called flow-type or travelling wave reactors. A reactor of this kind is disclosed in FI Patent No. 57,975 and the corresponding U.S. Pat. No. 4,389,973.
There are two general prerequisites for a travelling wave reactor. First, the system should be operated in the viscous flow regime, and second, after the introduction of the precursor pulses, the flow has to be maintained laminar. The first prerequisite stems from the fact that if the reactor is not operated in the viscous flow regime, laminar flow does not exist. This gives a lower pressure limit of about 1 mbar for a travelling wave reactor. The laminarity of the gas inflow may also be disturbed by an overly tight bend in the piping. The second prerequisite sets an upper limit of about 2000 on the Reynolds number RN of the system, RN being defined by formula (1)
$$R_N = \frac{v\,d}{\nu} \qquad (1)$$

wherein v is the velocity of the gas, d is the diameter of the flow channel and ν is the kinematic viscosity of the fluid.
This upper limit for RN is, however, only desirable, not mandatory, because in practice, if the reactor is aerodynamically well designed, it may also work sufficiently well with higher Reynolds numbers. That is, the reactor may, in fact, also be operated in the turbulent flow regime or in the laminar-turbulent flow transition zone (2000<RN<4000). Even when the reactor is designed such that the flow is mainly laminar, this limit on the Reynolds number is regularly exceeded in a short gas mixing area, where some turbulence may exist. However, it is generally preferred to operate the reactor in the laminar flow regime.
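As a minimal sketch of how the Reynolds number criterion can be checked, the following evaluates RN as the gas velocity times the channel diameter divided by the kinematic viscosity, and classifies the flow using the limits of about 2000 and 4000 mentioned above. The function names and the numerical values for velocity, diameter and kinematic viscosity are hypothetical and chosen only for illustration.

```python
def reynolds_number(velocity_m_s, diameter_m, kinematic_viscosity_m2_s):
    """Reynolds number RN = v * d / nu for flow in a channel."""
    if kinematic_viscosity_m2_s <= 0:
        raise ValueError("kinematic viscosity must be positive")
    return velocity_m_s * diameter_m / kinematic_viscosity_m2_s


def flow_regime(rn, laminar_limit=2000.0, turbulent_limit=4000.0):
    """Classify the flow using the approximate limits given in the text."""
    if rn <= laminar_limit:
        return "laminar"
    if rn < turbulent_limit:
        return "transition"
    return "turbulent"


if __name__ == "__main__":
    # Hypothetical operating point, not taken from the disclosure.
    rn = reynolds_number(velocity_m_s=10.0, diameter_m=0.02,
                         kinematic_viscosity_m2_s=1.5e-3)
    print(f"RN = {rn:.1f}, regime: {flow_regime(rn)}")
```

Because the limits are only approximate and, as noted above, not mandatory, the classification should be read as a rough guide rather than a strict design rule.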
Sufficient substrate exposure and good purging of the reaction space are desirable for a successful ALD process. That is, the pulses should be intense enough for the substrate to be saturated and purging should be efficient enough to remove practically all precursor residues and undesired reaction products from the reactor. With the present travelling-wave reactor technology, the purge times required are relatively long with respect to the precursor exposure times.
In order to accelerate the film growth process, there is a demand for methods that enable shortening of the purge periods and, thus, the pulse intervals.
Along with the well-established requirements of substrate exposure and purging of the reaction space, one of the most restrictive factors contributing to the process cycle times is the temporal widening of the precursor pulses. Successive pulses have to be kept sufficiently separated, because, owing to their finite rise and drop times, they mix if fed at too short intervals. The widening of a pulse results from three main phenomena: a pressure gradient formed between the precursor and inert gas flows, diffusion, and the adsorption of gases onto and their desorption from the surfaces of the reactor. All these effects cause mixing of the precursor gas and the inert gas, which generates a need for long purge times to ensure operation under proper ALD conditions. Diffusion is inevitable and sets the theoretical lower limit for the width of the pulse. The “memory effect” caused by adsorption and desorption of gases can be reduced by decreasing the reactor size, i.e., the area of the reactor walls. However, in the traditional travelling-wave pulsing methods the total pressure in the reactor feed line increases at the same time as the precursor partial pressure is increased in the line, which causes the pulses to be widened not only by these factors but also by pressure gradient driven flow.
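The diffusion-limited lower bound on the pulse width can be estimated with a minimal sketch. It assumes the standard one-dimensional diffusion result that a sharp concentration edge spreads by about sqrt(2·D·t) after a time t, and it deliberately ignores the pressure-gradient driven flow and the adsorption and desorption effects discussed above. The function names and the values of the diffusion coefficient, conduit length and gas velocity are hypothetical.

```python
import math


def diffusion_spread_m(diffusion_coeff_m2_s, time_s):
    """Spatial spread (standard deviation) of a pulse edge, sqrt(2*D*t)."""
    return math.sqrt(2.0 * diffusion_coeff_m2_s * time_s)


def temporal_broadening_s(diffusion_coeff_m2_s, conduit_length_m, velocity_m_s):
    """Approximate widening in time of a pulse edge at the conduit outlet.

    The transit time is length / velocity; the spatial spread accumulated
    during the transit is converted to time by dividing by the velocity.
    """
    transit_time_s = conduit_length_m / velocity_m_s
    return diffusion_spread_m(diffusion_coeff_m2_s, transit_time_s) / velocity_m_s


if __name__ == "__main__":
    # Hypothetical values for a low-pressure gas line, for illustration only.
    widening = temporal_broadening_s(diffusion_coeff_m2_s=0.02,
                                     conduit_length_m=0.5,
                                     velocity_m_s=10.0)
    print(f"diffusion-only edge widening: {widening * 1e3:.2f} ms")
```

Any widening observed in a real reactor beyond such a diffusion-only estimate would, under these assumptions, have to be attributed to the other two phenomena named above.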
Models on mass transport in ALD type reactors are based on diffusion and the “memory effect”. These models have neglected the effect generated by the pressure gradient, whereby they have not been able to explain the long purge times needed in real-life reactors.
Prior attempts to increase the efficiency of travelling wave ALD reactors have, in practice, concentrated on improving the exposure of the substrate and the efficiency of purging. WO 04/083485 discloses a method of using a bi-level purge gas flow rate for decreasing the cycle time of the process. A low inert gas flow rate or pressure is used during pulses and a high flow rate or pressure is used for purging. The switching of the flows can be timed so as to break existing turbulence within the reaction chamber, in order to accelerate exhaustion of the particles from the chamber. By this method, both the substrate exposure time and the purging time can be shortened. WO 03/062490 describes a more disadvantageous bi-level purge gas system, wherein a part of the inert gas flow is conducted to a conduit which bypasses the reactor during the feed of reactant pulses in order to increase the share of the precursor in the vicinity of the substrate. Such a method consumes a large amount of gas.
Both publications mentioned above concern methods of intensifying the exposure of the substrate to the precursor flow and minimizing the purge time. However, the technical solutions suggested in the art are far from optimal, because they do not take into account the effect of the flow driven by the pressure gradient on the pulse widening. In addition, the reactor design downstream of the reaction chamber is critical in order to maintain an essentially constant pressure in the reaction space.