Integrated circuit devices are typically formed on substrates, most commonly on semiconductor substrates, by the sequential deposition and etching of conductive, semiconductive, and insulative film layers. As the layers are sequentially deposited and etched, the uppermost surface of the substrate, i.e., the exposed surface of the uppermost layer on the substrate, becomes successively more topographically rugged. This occurs because the height of the uppermost film layer, i.e., the distance between the top surface of that layer and the surface of the underlying substrate, is greatest in regions of the substrate where the least etching has occurred, and least in regions where the greatest etching has occurred.
This non-planar surface presents a problem for the integrated circuit manufacturer. The etching step is typically prepared by placing a resist layer on the exposed surface of the substrate, and then selectively removing portions of the resist to provide the etch pattern on the layer. If the layer is non-planar, photolithographic techniques of patterning the resist layer might not be suitable because the surface of the substrate may be sufficiently non-planar to prevent focusing of the lithography apparatus on the entire layer surface. Therefore, there is a need to periodically planarize the substrate surface to restore a planar layer surface for lithography.
Chemical mechanical polishing or planarizing (CMP) is one accepted method of planarization. This planarization method typically requires that the substrate be mounted in a wafer head, with the surface of the substrate to be polished exposed. The substrate supported by the head is then placed against a rotating polishing pad. The head holding the substrate may also rotate, to provide additional motion between the substrate and the polishing pad surface. Further, a polishing slurry (typically including an abrasive and at least one chemically reactive agent therein, which are selected to enhance the polishing of the topmost film layer of the substrate) is supplied to the pad to provide an abrasive chemical solution at the interface between the pad and the substrate. For polishing of an oxide layer, the slurry is usually composed of silica grit having diameters in the neighborhood of 50 nm. The grit is formed by fuming and is then placed in a basic solution having a pH in the neighborhood of 10.5. The solution is then strongly sheared by blending so that the grit remains in colloidal suspension for long periods. For metal polishing, the grit may be formed from either silica or alumina.
The combination of polishing pad characteristics, the specific slurry mixture, and other polishing parameters can provide specific polishing characteristics. Thus, for any material being polished, the pad and slurry combination is theoretically capable of providing a specified finish and flatness on the polished surface. It must be understood that additional polishing parameters, including the relative speed between the substrate and the pad and the force pressing the substrate against the pad, also affect the polishing rate, finish, and flatness. Therefore, for a given material whose desired finish is known, an optimal pad and slurry combination may be selected. Typically, the actual polishing pad and slurry combination selected for a given material is based on a trade-off between the polishing rate, which determines in large part the throughput of wafers through the apparatus, and the need to provide a particular desired finish and flatness on the surface of the substrate.
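By way of illustration only, the dependence of polishing rate on relative speed and applied force is often approximated by Preston's equation, in which the removal rate is proportional to the product of pressure and relative velocity. That model is not stated in the text above; the sketch below assumes it as a first-order approximation, and the coefficient value and function names are hypothetical.

```python
def preston_removal_rate(preston_coeff, pressure_pa, rel_speed_m_s):
    """Removal rate in m/s under the assumed Preston model: rate = k_p * P * v.

    preston_coeff is an empirical constant (1/Pa) that lumps together the
    pad, slurry, and film properties; it must be measured, not derived.
    """
    return preston_coeff * pressure_pa * rel_speed_m_s


def polish_time_s(thickness_to_remove_m, removal_rate_m_s):
    """Time needed to remove a given film thickness at a constant rate."""
    if removal_rate_m_s <= 0:
        raise ValueError("removal rate must be positive")
    return thickness_to_remove_m / removal_rate_m_s


# Hypothetical numbers chosen only to exercise the functions.
rate = preston_removal_rate(preston_coeff=1e-13, pressure_pa=30e3, rel_speed_m_s=1.0)
time_needed = polish_time_s(600e-9, rate)
print(f"rate = {rate * 1e9:.2f} nm/s, time = {time_needed:.0f} s")
```

Under this assumed model, doubling either the pressure or the relative speed halves the polishing time, which is why those two parameters enter the trade-off between throughput and finish.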
Because the flatness and surface finish of the polished layer are dictated by other processing conditions in subsequent fabrication steps, throughput, insofar as it involves polishing rate, must often be sacrificed in this trade-off. Nonetheless, high throughput is essential in the commercial market since the cost of the polishing equipment must be amortized over the number of wafers being produced. Of course, high throughput must be balanced against the cost and complexity of the machinery being used. Similarly, floor space and operator time required for the operation and maintenance of the polishing equipment incur costs that must be included in the sale price. For all these reasons, a polishing apparatus is needed which has high throughput, is relatively simple and inexpensive, occupies little floor space, and requires minimal operator control and maintenance.
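The amortization argument can be made concrete with simple arithmetic. The following sketch assumes a straight-line amortization of the equipment cost over all wafers processed during the service life of the machine; every figure and name in it is hypothetical and serves only to show how throughput enters the per-wafer cost.

```python
def wafers_processed(wafers_per_hour, operating_hours_per_year, service_years):
    """Total wafers polished over the service life of the apparatus."""
    return wafers_per_hour * operating_hours_per_year * service_years


def equipment_cost_per_wafer(equipment_cost, wafers_per_hour,
                             operating_hours_per_year, service_years):
    """Equipment cost spread evenly over every wafer processed."""
    total = wafers_processed(wafers_per_hour, operating_hours_per_year, service_years)
    if total <= 0:
        raise ValueError("the apparatus must process at least one wafer")
    return equipment_cost / total


# Hypothetical comparison: the same machine at two throughputs.
slow = equipment_cost_per_wafer(1_000_000, 20, 5000, 5)
fast = equipment_cost_per_wafer(1_000_000, 40, 5000, 5)
print(f"cost per wafer: {slow:.2f} at 20 wafers/h, {fast:.2f} at 40 wafers/h")
```

Floor space and operator time could be added as further cost terms in the numerator; they are omitted here to keep the sketch minimal.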
An additional limitation on polishing throughput arises because the pad's surface characteristics change as a function of the polishing usage, and the pad also becomes compressed in the regions where the substrate was pressed against it for polishing. This condition, commonly referred to as “glazing”, causes the polishing surface of the polishing pad to become less abrasive, thereby decreasing the polishing rate over time. Glazing thus tends to increase the polishing time necessary to polish any individual substrate. Therefore, the polishing pad surface must be periodically restored, or conditioned, in order to maintain desired polishing conditions and achieve a high throughput of substrates through the polishing apparatus. Pad conditioning typically involves abrading the polishing surface of the pad both to remove any irregularities and to roughen the surface.
Pad conditioning, although it raises the average polishing rate, introduces its own difficulties. If conditioning is performed manually, its consistency is poor and it incurs operator costs and significant downtime of the machinery, both of which decrease the cost-adjusted throughput. If the pad conditioning is performed by automated machinery, care must be taken to assure that the surface abrading does not also gouge and damage the polishing pad. Furthermore, if the relative motion between the conditioning tool and pad is primarily provided by the pad rotation, the relative velocity and dwell time vary over the radius of the pad, thus introducing a radial non-uniformity into the reconditioned pad.
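The radial dependence noted above follows from elementary kinematics. The sketch below assumes a simplified case that the text does not specify: a conditioning tool held stationary over a pad rotating at constant speed, with a fixed contact length along the direction of travel. The function names and numerical values are hypothetical.

```python
import math


def relative_speed_m_s(pad_rpm, radius_m):
    """Speed of the pad surface past a stationary tool: v = omega * r."""
    omega = 2.0 * math.pi * pad_rpm / 60.0  # angular speed in rad/s
    return omega * radius_m


def dwell_time_per_pass_s(pad_rpm, radius_m, contact_length_m):
    """Time a point on the pad spends under the tool in one revolution."""
    speed = relative_speed_m_s(pad_rpm, radius_m)
    if speed <= 0:
        raise ValueError("radius and pad speed must be positive")
    return contact_length_m / speed


# Hypothetical pad at 60 rpm with a 20 mm contact length.
for radius in (0.05, 0.10, 0.20):
    v = relative_speed_m_s(60, radius)
    t = dwell_time_per_pass_s(60, radius, 0.02)
    print(f"r = {radius:.2f} m: v = {v:.3f} m/s, dwell = {t * 1000:.1f} ms")
```

Under these assumptions the relative speed grows linearly with radius while the dwell time per pass falls as its inverse, so the inner and outer regions of the pad are abraded under different conditions unless the tool supplies motion of its own.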
A further limitation on traditional polishing apparatus throughput arises from the loading and unloading of substrates from the polishing surface. One prior art attempt to increase throughput, as shown by Gill in U.S. Pat. No. 4,141,180, uses multiple polishing surfaces for polishing the substrate to thereby allow optimization of polishing rate and finish with two different pad or slurry combinations. A main polishing surface and a fine polishing surface are provided within the described polishing apparatus at a polishing station. A single polishing head, controlled by a single positioning apparatus, moves a single substrate between the different polishing stations on the apparatus.
Another method of increasing throughput uses a wafer head having a plurality of substrate loading stations therein to simultaneously load a plurality of substrates against a single polishing pad to enable simultaneous polishing of the substrates on the single polishing pad. Although this method would appear to provide substantial throughput increases over the single substrate style of wafer head, several factors militate against the use of such carrier arrangements for planarizing substrates, particularly after deposition layers have been formed thereon. First, the wafer head holding the wafer being polished is complex. To attempt to control the force loading each substrate against the pad, one approach floats the portion of the head holding the wafer. A floating wafer holder necessitates a substantial number of moving parts, and pressure lines must be included in the rotating and moving geometry. Additionally, the ability to control the forces pressing each individual substrate against the pad is limited by the floating nature of such a wafer head assembly, and therefore is a compromise between individual control and ease of controlling the general polishing attributes of the multiple substrates. Finally, if any one substrate develops a problem, for example if a substrate cracks, a broken piece of the substrate may come loose and destroy all of the other substrates being polished on the same pad.
Polishing throughput is yet further limited by the requirement that wafers be washed at the end of polishing and sometimes between stages of polishing. Although washing time has been limited in the past by simultaneously washing multiple wafer heads, insofar as the washing requires additional machine time over that required for polishing, system throughput is adversely affected.
Therefore, there is a need for a polishing apparatus which enables optimization of polishing throughput, flatness, and finish while minimizing the risk of contamination or destruction of the substrates.
The high-speed polishing required for a high-throughput polishing apparatus imposes severe restrictions and requirements on the polishing apparatus. The mechanical forces are large, but minute scratches incurred in polishing are fatal to integrated circuits. Hence, the design must control and minimize mechanical aberrations. The environment of CMP processing is harsh, so the machinery must be carefully designed to lengthen lifetime and reduce maintenance. Also, the slurry, when allowed to dry on the wafer or any part of the apparatus, tends to form a hardened layer that becomes very difficult to remove. In general, a high-throughput apparatus needs to be easy to operate, require little operator intervention, be easily serviced for regular or unscheduled maintenance, and not be prone to failure or degradation of its parts.
If a polishing system is to be commercialized, it must be flexible and adaptable to a number of different polishing processes. Different integrated-circuit manufacturers prefer different polishing processes dependent on their overall chip design. Different layers to be planarized require distinctly different polishing processes, and the chip manufacturer may wish to use the same polishing system for two different polishing processes. Rather than designing a polishing system for each polishing process, it is much preferable that a single design be adaptable to the different processes with minimal changes of machinery.