This invention relates generally to polishing pads used for creating a smooth, flat surface on substrates such as glass, semiconductor device wafers, and/or dielectric/metal composites; more specifically, the composition and methods of the present invention are directed to the polishing surface topography of such pads prior to their use in polishing such substrates. Applications especially adapted for use of the present invention include the polishing/planarization of substrates such as silicon, silicon dioxide, tungsten, and copper encountered in integrated circuit fabrication.
U.S. Pat. No. 5,569,062 describes a cutting means for abrading the surface of a polishing pad during polishing. U.S. Pat. No. 5,081,051 describes an elongated blade having a serrated edge pressing against a pad surface, thereby cutting circumferential grooves into the pad surface.
U.S. Pat. No. 5,990,010 describes a preconditioning mechanism or apparatus for preconditioning a polishing pad. This apparatus is used to generate and re-generate micro-texture during polishing pad use.
In semiconductor wafer polishing processes, initial pre-conditioning of the polishing pad (also referred to as "break-in") is distinguished from the in-process conditioning of a pad that has already undergone pre-conditioning. In-process conditioning can be concurrent with polishing or intermittently performed on a polishing apparatus between polishing cycles. In general, the initial "start-up" period for a polishing pad can be described as the accumulated polish time required for the removal rate of the substrate (or workpiece) material to level off to a stable steady-state removal rate for a particular type of pad. Preconditioning polishing pads addresses the problems associated with the "start-up" period.
In conventional wafer production, chemical-mechanical polishing conditions for subsequent production wafers may be set from the results obtained from the first production wafer. However, a "first wafer effect" is encountered when a new lot of wafers undergoes polishing on a polishing pad that has been idle for a period of time or when a new (previously unused) polishing pad is installed.
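The start-up period described above can be estimated from a logged removal-rate series. The following Python sketch is illustrative only: the function name `break_in_time`, the 5% tolerance, the sample readings, and the choice of the last reading as the steady-state value are all assumptions, not part of the described method.

```python
def break_in_time(times, rates, rel_tol=0.05):
    """Return the earliest time from which every later reading stays
    within rel_tol of the final reading.

    Assumption: the final reading is taken as the steady-state rate.
    """
    steady = rates[-1]
    start = len(rates) - 1
    # Walk backwards from the end until a reading leaves the tolerance band.
    for i in range(len(rates) - 1, -1, -1):
        if abs(rates[i] - steady) <= rel_tol * abs(steady):
            start = i
        else:
            break
    return times[start]


# Hypothetical readings: accumulated polish time (minutes) and removal rate
# (arbitrary units) showing a low start, a rise, and a leveling off.
times = [0, 10, 20, 30, 40, 50]
rates = [100, 150, 185, 198, 201, 200]
print(break_in_time(times, rates))
```

A tighter tolerance lengthens the estimated start-up period, so the tolerance would need to be chosen to match the process variation that is acceptable in production.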
The first wafer effect refers to a difference in the polishing results obtained for the first wafer compared to that obtained for subsequent production wafers. This effect is believed to be due to different polishing conditions encountered by the first wafer. One approach to reduce the first wafer effect is to utilize a blank preconditioning wafer. After preconditioning with such wafers for a certain length of time, the first production wafer is installed in the wafer holder and polished. This on-machine preconditioning procedure is not only cumbersome due to successive loading and unloading of separate cassettes containing preconditioning and production wafers but also leads to increased production costs due to machine downtime associated with preconditioning.
Micro-texture comprises micro-indentations and micro-protrusions. These micro-protrusions typically have a height of less than 50 microns and more preferably less than 10 microns. Micro-indentations have an average depth of less than 50 microns, and more preferably less than 10 microns. Macro-texture comprises both macrogrooves and microgrooves.
Problems associated with in-process conditioning can arise from the need to determine the frequency and duration of conditioning treatment between production polishing runs. This can give rise to further variation and unpredictability due to the variation in surface textures obtained by these techniques. Additionally, in-process conditioning often does not address problems attendant with the initial break-in period for an as-manufactured polishing pad, e.g., a pad fabricated of polyurethane.
In the start-up of a polishing process, new pads tend to exhibit a characteristic "break-in" behavior manifested typically in a low initial rate of removal, followed by a rise in removal rate, and a leveling off to a steady state on a polishing tool. The break-in period may last from 10 minutes to more than one hour, and represents an increasingly significant equipment efficiency loss in the industry. It has been observed that molded pads which have a smooth surface often exhibit an undesirably long and/or inconsistent break-in time from pad to pad or lot to lot of polishing pads. On the other hand, a polishing pad that has been over-conditioned may exhibit an initially high, unstable removal rate before leveling off to a steady-state value. This deviation also contributes to a longer than desired break-in period.
It would be desirable to provide an as-manufactured polishing pad with a shorter and/or more consistent break-in period, with improved predictability in removal rate and/or an increased steady-state removal rate, as compared to manufactured polishing pads of the present state of the art.
A certain degree of texture is generally required for a polishing pad to perform adequately. This surface texture, consisting of peaks (or protrusions) and valleys (or indentations), often aids polishing in the following ways: 1) the valleys act as reservoirs to hold "pools" of polishing slurry so that a constant supply of slurry is available for contact with the surface of the substrate being polished; 2) the peaks come in direct contact with the substrate surface causing "two-body abrasive wear" and/or in conjunction with the slurry particles causing "three-body abrasive wear"; and 3) the texture of the surface acting in conjunction with the shear on the slurry causes eddy currents in the slurry creating wear of the substrate surface by erosion.
It is common practice to use a single number (an "Ra" number) to characterize surface roughness. Ra describes the average deviation of the pad surface from the average amplitude/height of the surface features. Since two drastically different surfaces could have the same Ra values, additional parameters are necessary to better quantify polishing surface micro-texture. Some additional useful parameters are: Average Peak to Valley Roughness ("Rtm"); Peak Density ("Rsa"); Core Roughness Depth ("Rk"); Reduced Peak Height ("Rpk"); and Reduced Valley Height ("Rvk").
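To illustrate why Ra alone can be insufficient, the following Python sketch computes Ra as the mean absolute deviation from the mean height for two hypothetical profiles. The function name and the sample heights are assumptions chosen for illustration, not measured pad data.

```python
def arithmetic_roughness(heights):
    """Ra: mean absolute deviation of the heights from their mean."""
    mean = sum(heights) / len(heights)
    return sum(abs(h - mean) for h in heights) / len(heights)


# Two hypothetical profiles (heights in microns, equally spaced samples).
wavy = [-1, 1, -1, 1, -1, 1, -1, 1]   # many small peaks and valleys
spiky = [0, 0, 0, 4, 0, 0, -4, 0]     # mostly flat, one tall peak, one deep valley

for name, profile in (("wavy", wavy), ("spiky", spiky)):
    ra = arithmetic_roughness(profile)
    height_range = max(profile) - min(profile)
    print(name, "Ra =", ra, "total height range =", height_range)
```

Both profiles give the same Ra even though the total height range differs by a factor of four, which is the kind of difference the additional parameters are meant to capture.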
Peak density indicates how many peaks (protrusions) are available to be in contact with the surface of the substrate being polished. For a given downforce on the pad (the pressure with which the substrate is contacted with the polishing layer of the polishing pad), a low peak density would have fewer contact points and thus each contact point would exert greater pressure on the substrate surface. In contrast, a higher peak density would imply numerous contact points with almost uniform pressure being exerted on the substrate surface. Peak density is characterized through the surface area ratio ("RSA"), which is defined as [Surface Area/(Normal Area−1)], wherein surface area is the measured surface area, and normal area is the area projected on a normal plane.
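A surface area ratio of this kind can be estimated from a gridded height map. The Python sketch below is an illustration under stated assumptions: it reads the ratio as (measured surface area / projected area) − 1, which is dimensionless and equals zero for a perfectly flat surface; the bracket placement in the text is ambiguous, so this reading is an assumption. The function names, the grid spacing, and the triangulation of each grid cell are also hypothetical choices.

```python
import math


def triangle_area(p, q, r):
    """Area of the 3-D triangle with vertices p, q, r given as (x, y, z)."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # Half the magnitude of the cross product of the two edge vectors.
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def surface_area_ratio(z, dx=1.0, dy=1.0):
    """Assumed reading of RSA: (measured area / projected area) - 1.

    z is a list of rows of heights sampled on a regular grid with
    spacing dx along a row and dy between rows.
    """
    rows, cols = len(z), len(z[0])
    measured = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            a = (j * dx, i * dy, z[i][j])
            b = ((j + 1) * dx, i * dy, z[i][j + 1])
            c = (j * dx, (i + 1) * dy, z[i + 1][j])
            d = ((j + 1) * dx, (i + 1) * dy, z[i + 1][j + 1])
            # Split each grid cell into two triangles and sum their areas.
            measured += triangle_area(a, b, c) + triangle_area(d, c, b)
    projected = (rows - 1) * dy * (cols - 1) * dx
    return measured / projected - 1.0


flat = [[0.0, 0.0], [0.0, 0.0]]
tilted = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
print(surface_area_ratio(flat))
print(surface_area_ratio(tilted))
```

Under this reading, a rougher surface with more or steeper peaks has more measured area over the same projected area, so the ratio grows with peak density.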
Average Peak to Valley Roughness ("Rtm") is a measure of the relative number of peaks and valleys. Peak to valley height characterizes both the height of the peaks and the depth of the valleys in the surface texture. The thickness of the slurry layer (and/or depth of a local pool of slurry) influences the dynamics of slurry and particle flow within the slurry, i.e., whether the flow is laminar or turbulent, the aggressiveness of the turbulence, and the nature of eddy currents. The dynamics of slurry flow are important as they relate to the "erosion wear" mechanism of polishing.
Valley size will indicate the ability of the surface to retain "pools" of slurry as well as the quantity of slurry locally available to perform the polishing. As a relatively large wafer (200 to 300 mm in diameter) passes over a polishing pad, it is important to have the slurry available at all points under the wafer to ensure uniformity of polishing. If the polishing pad were featureless, it would be difficult for the slurry to penetrate under the wafer to be available in the interior portions of the wafer. In this scenario, the contact area between the pad and the wafer becomes "slurry starved". This is the motivation for polishing pads with grooves or perforations. Macroscopic features such as grooves enable slurry flow between the polishing layer of the polishing pad and the wafer. As we focus on smaller dimensions on a polishing pad, in the range of 0.5-25 mm (i.e., the land area between grooves or perforations), if the surface of this land area is too smooth (analogous to a featureless pad on a larger size scale), the local area of contact between the pad and wafer can similarly become slurry starved. It is therefore important to have a smaller scale surface texture (i.e., micro-texture) which is capable of locally retaining slurry to make it available on these smaller size scales.
Lastly, in addition to the reasons listed above, peak size is important because it affects the rigidity of that peak; a tall narrow peak will be more flexible than a broader one. The relative rigidity of a peak affects the influence of the abrasive wear component of the polishing. Peak and valley size and shape are cooperatively characterized through Rpk (reduced peak height), Rvk (reduced valley depth), and Rk (core roughness depth). These three values are obtained from the bearing ratio curve, as shown in FIG. 1. The bearing ratio is used in tribological studies. More details may be found in "Tribology: Friction and Wear of Engineering Materials," I. M. Hutchings, page 10, 1992. The relevant text from this textbook is presented here for easy reference: "The bearing ratio curve can be understood by imagining a straight line, representing the profile of the surface under investigation. When the plane first touches the surface at a point, the bearing ratio (defined as the ratio of the contact length to the total length of the profile) is zero. As the line is moved further downwards, the length over which it intersects the surface profile increases, and therefore the bearing ratio increases. Finally, as the line reaches the bottom of the deepest valley in the surface profile, the bearing ratio rises to 100%." The bearing ratio curve is a plot of bearing ratio against surface height, as shown in FIG. 1.
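The construction of a bearing ratio curve can be sketched numerically for a sampled profile. The Python example below is a simplified illustration, assuming equally spaced height samples so that the contact length at a given level is approximated by the fraction of samples at or above that level. The function names and sample heights are hypothetical, and the sketch does not extract Rpk, Rk, or Rvk from the curve.

```python
def bearing_ratio(heights, level):
    """Fraction of equally spaced samples lying at or above the given level."""
    return sum(1 for h in heights if h >= level) / len(heights)


def bearing_ratio_curve(heights, steps=4):
    """Lower a horizontal line from the highest peak to the deepest valley
    and record (level, bearing ratio) pairs along the way."""
    top, bottom = max(heights), min(heights)
    curve = []
    for k in range(steps + 1):
        # Use the exact bottom value on the last step to avoid rounding drift.
        level = bottom if k == steps else top - (top - bottom) * k / steps
        curve.append((level, bearing_ratio(heights, level)))
    return curve


# Hypothetical profile heights in microns.
profile = [0, 1, 2, 3, 4]
for level, ratio in bearing_ratio_curve(profile):
    print(level, ratio)
```

With discrete samples the ratio at the highest peak is one sample out of the total rather than exactly zero; it approaches zero as the sampling becomes finer, and it reaches 100% at the deepest valley.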
The present invention provides a polishing pad having a pre-texturized surface (surface micro-texture or microtopography). The micro-texture on the polishing pad according to the present invention is fabricated prior to polishing, preferably during manufacturing, as distinguished from the in-process conditioning methods discussed in the prior art. The pad surface is comprised of macro-texture (grooves) and micro-texture mechanically produced upon the entire pad working surface (also referred to herein as the surface of the polishing layer). The micro-texture is statistically uniform over the entire pad surface and is described by the following quantitative parameters:
Arithmetic Surface Roughness, Ra, from 0.01 μm to 25 μm;
Average Peak to Valley Roughness, Rtm, from 2 μm to 40 μm;
Core roughness depth, Rk, from 1 to 10;
Reduced Peak Height, Rpk, from 0.1 to 5;
Reduced Valley Height, Rvk, from 0.1 to 10; and
Peak density expressed as a surface area ratio, RSA, ([Surf.Area/(Area−1)]), from 0.001 to 2.0.
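The parameter ranges listed above lend themselves to a simple acceptance check on measured values. The following Python sketch is illustrative only: the range values are copied from the list above, while the dictionary layout, the function name `out_of_range`, and the sample measurement are assumptions. Units are given above only for Ra and Rtm (μm), so the sketch treats all values as plain numbers.

```python
# Ranges copied from the parameter list above (low, high), inclusive.
MICRO_TEXTURE_RANGES = {
    "Ra": (0.01, 25.0),   # microns
    "Rtm": (2.0, 40.0),   # microns
    "Rk": (1.0, 10.0),
    "Rpk": (0.1, 5.0),
    "Rvk": (0.1, 10.0),
    "RSA": (0.001, 2.0),
}


def out_of_range(measured):
    """Return the sorted names of parameters that are missing from the
    measurement or fall outside their listed range."""
    failures = []
    for name, (low, high) in MICRO_TEXTURE_RANGES.items():
        value = measured.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return sorted(failures)


# Hypothetical measurement of a pad surface, not real data.
sample = {"Ra": 3.2, "Rtm": 45.0, "Rk": 6.0, "Rpk": 1.5, "Rvk": 4.0, "RSA": 0.4}
print(out_of_range(sample))
```

An empty result means every listed parameter was measured and lies within its range; here only the hypothetical Rtm value falls outside.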
In one embodiment, the present invention provides a homogeneous or non-homogeneous polymeric polishing pad, conditioned prior to use, which generally exhibits a shorter break-in time compared to many prior art as-manufactured polymeric polishing pads.
In another embodiment, the present invention provides an improved break-in time and removal rate relative to many prior art pads.