Moore's law is perhaps one of the best known trends among a wide range of technologies relating to semiconductor integrated circuits. Moore's law describes a trend in computing hardware where the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years. This trend has continued for more than half a century and is expected to continue for at least the next few years. Moore's law has served the industry well and has even been incorporated for decades into the International Technology Roadmap for Semiconductors, known throughout the world as ITRS, for guiding long-term planning and setting targets for research and development.
The cost of making smaller semiconductor integrated circuits, and smaller dimensions (nodes) on those circuits, has increased dramatically over the past few years during the transition from i-Line to KrF to ArF and now to the newly emerging extreme ultraviolet (EUV) photolithography technologies. In view of this, a few industry experts have contended that the semiconductor industry cannot cost-effectively reduce the dimensions on semiconductor integrated circuits much further within the time frame stated by Moore's law.
However, another mechanism that can be used to improve performance relates to the packaging of the integrated circuits. Once a wafer of integrated circuits is completed and diced, each integrated circuit needs to be packaged to be of use. FIG. 1 (PRIOR ART) is a diagram illustrating how the packaging of integrated circuits has evolved over the years from wire bond 102 through flip chip 104, stacked die 106 and package-on-package 108 to the three-dimensional integrated circuit 110 (3D IC 110).
The three-dimensional integrated circuit 110 (3D IC 110) is a semiconductor circuit in which two or more layers of active electronic components are integrated both vertically and horizontally into a single circuit. 3D IC packaging should not be confused with 3D packaging which has been in use for years and saves space by stacking separate chips in a single package. 3D packaging, known as System in Package (SiP), does not integrate the chips into a single circuit. In particular, the chips in the SiP communicate with off-chip controls much as if they were mounted in separate packages on a normal circuit board.
In contrast, the 3D IC 110 acts as a single chip where all the components on the different layers communicate with on-chip controls, whether vertically or horizontally. There are many advantages associated with 3D IC packaging that can help extend the performance of Moore's law and possibly extend the performance even more than predicted by Moore's law. These advantages can include:
1. Size—3D IC 110 has a much smaller footprint when compared to similar integrated circuits (ICs) packaged with a different technology. FIG. 2 (PRIOR ART) illustrates a 32 Gigabyte (GB) standard eight IC design 202 which is four inches long and located on a traditional circuit board and a side view of a commercially-available 32 GB 3D memory stack 204 having eight IC's each being 55 micrometers (μm) thick and 1 millimeter (mm) wide.
2. Speed—with propagation delay varying with the square of the wire length, a much shorter interconnect distance means much faster speeds for the 3D IC 110 when compared to ICs packaged with different technologies.
3. Power—results of 10× improvement in power consumption due to increased efficiency and shorter path lengths have been seen with 3D IC 110 when compared to ICs made by other packaging technologies.
4. Cost—highly complex (expensive) IC's can be broken into several sections meaning that a defect will affect a much smaller portion of the completed 3D IC 110 when compared to ICs of other packaging technologies.
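As a rough illustration of the speed advantage in item 2, the following Python sketch applies the stated square-law relationship between propagation delay and wire length. The function name `relative_delay` and the example lengths are hypothetical and are not taken from the figures; the sketch assumes the delay depends only on the square of the length, with all other factors held equal.

```python
def relative_delay(short_length, long_length):
    """Return the delay of the short wire as a fraction of the long wire's delay.

    Assumes propagation delay is proportional to the square of wire length,
    as stated in item 2. Both lengths must use the same unit.
    """
    if short_length <= 0 or long_length <= 0:
        raise ValueError("wire lengths must be positive")
    return (short_length / long_length) ** 2


if __name__ == "__main__":
    # Hypothetical lengths: a 1 mm vertical interconnect versus a 10 mm board trace.
    ratio = relative_delay(1.0, 10.0)
    print(f"short-wire delay is about {ratio:.2%} of the long-wire delay")
```

Under this assumption, shortening an interconnect by a factor of ten reduces its delay by roughly a factor of one hundred.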
Therefore, the semiconductor industry has undertaken a well-documented and aggressive approach to develop and implement this emerging 3D IC packaging technology. To implement this technology and stack IC's in a 3D IC package, the silicon wafer needs to be thinned to much less than the standard silicon wafer thickness: from around 700 μm down to around 50 μm to 60 μm (see FIG. 2). This requirement for thinned silicon is clearly documented in the ITRS roadmap as well. Hence, silicon wafers will need to be thinned to tens of micrometers to utilize 3D IC packaging.
To thin the silicon wafer to this thickness, a support wafer or carrier is temporarily bonded to the silicon wafer to provide mechanical integrity while the excess silicon is removed from the silicon wafer. The support wafer can be made of one of two substrates, namely silicon or glass. The glass wafer has emerged as the dominant carrier, not only for cost reasons but also because of the inflexible coefficient of thermal expansion of the silicon carrier, the inability to inspect the quality of the bond between the silicon wafer and the silicon carrier, and the severe restrictions on the form factor of the silicon carrier. With respect to the form factor, the silicon carrier is economically available only in the same diameters as the silicon wafer to be thinned, whereas it is desirable to have carriers with a slightly larger diameter than the silicon wafer to be thinned. The reason for this restriction is that the semiconductor industry is tooled for very precise silicon wafer dimensions so that the wafers can be used in the major semiconductor companies' lithography equipment. Therefore, the supply chain is not configured to supply, at reasonable cost, silicon carriers that are even one mm larger in diameter than the standard silicon wafer. The carrier should have a larger diameter than the silicon wafer to be thinned because, in the thinning (grinding, polishing) process, mechanical support should extend beyond the edge of the silicon wafer being thinned. Furthermore, the most widely used thinning systems in development by the major semiconductor companies require a bonding system which utilizes an ultraviolet (UV) light source to cure the bonding agent located between the carrier and the silicon wafer, and a laser to remove the bonding agent after the thinning process.
Because silicon wafers do not transmit UV light or laser beams, glass wafers will be widely used for many thinning systems.
For a glass wafer, there are at least two physical attributes that have historically been difficult and cost prohibitive to achieve simultaneously. These two physical attributes are:
I. Total Thickness Variation (TTV)—the TTV of the silicon wafer to be thinned can only be as good as the TTV of the glass carrier wafer. As the silicon wafer requirements become thinner, the TTV should be less than about 2.0 μm. Referring to FIG. 3 (PRIOR ART), there is a schematic diagram illustrating an exemplary 3D IC structure 300 which has a poor TTV that resulted in poor interconnects 302 between a top IC layer 304 and a bottom IC layer 306. Referring to FIG. 4 (PRIOR ART), there is a schematic diagram of a glass wafer 402 used to explain TTV which is defined to be the difference between a highest thickness (Tmax) elevation 404 and a lowest thickness (Tmin) elevation 406 on the entire surface 408 of the unclamped (free state) glass wafer 402.
II. Warp (flatness)—warp of the glass wafer is important for performance of the thinned silicon wafer. The warp should be less than about 60 μm. Referring to FIG. 5 (PRIOR ART), there is a schematic diagram of a glass wafer 502 used to explain warp, which is defined as the sum of the absolute values of the maximum distances 504 and 506. Distance 504 is measured between a highest point 508 and a least squares focal plane 510 (dashed line) applied to a shape of the glass wafer 502, and distance 506 is measured between a lowest point 512 and the least squares focal plane 510 (dashed line). The highest point 508 and the lowest point 512 are both with respect to the same surface of the glass wafer 502. The least squares focal plane 510 is applied to the shape of the unclamped (free state) glass wafer 502. The least squares focal plane 510 is determined by the following method. A plane is determined by the equation z=A+Bx+Cy. Then, the least squares planar fit is determined through matrix minimization of the sum of the squares of the deviations of the real data from the plane. This method finds the least squares values A, B, and C. The matrices are determined as follows:

$$
\begin{bmatrix}
n & \sum x_j & \sum y_j \\
\sum x_j & \sum x_j^2 & \sum x_j y_j \\
\sum y_j & \sum x_j y_j & \sum y_j^2
\end{bmatrix}
\begin{bmatrix} A \\ B \\ C \end{bmatrix}
=
\begin{bmatrix} \sum z_j \\ \sum x_j z_j \\ \sum y_j z_j \end{bmatrix}
$$

By solving this equation for A, B, and C, the least squares fit is complete.
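The TTV and warp definitions above can be sketched in Python as follows. This is a minimal illustration, not a metrology procedure: the function names (`ttv`, `fit_plane`, `warp`) are hypothetical, the sign convention z = A + Bx + Cy is assumed so that it matches the all-positive matrix of sums, and deviations are measured along z rather than perpendicular to the plane, which is a reasonable approximation only for a nearly flat wafer.

```python
def ttv(thicknesses):
    """Total thickness variation: Tmax - Tmin over all measured thicknesses."""
    return max(thicknesses) - min(thicknesses)


def _solve3(m, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points do not define a unique plane")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_plane(points):
    """Least-squares plane z = A + B*x + C*y through (x, y, z) samples.

    Builds the normal equations from sums over the samples and solves
    them for A, B and C.
    """
    n = len(points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    matrix = [[n, sx, sy], [sx, sxx, sxy], [sy, sxy, syy]]
    return _solve3(matrix, [sz, sxz, syz])


def warp(points):
    """Sum of the largest deviations above and below the fitted plane.

    Assumption: deviations are taken along z, not normal to the plane.
    """
    a, b, c = fit_plane(points)
    deviations = [z - (a + b * x + c * y) for x, y, z in points]
    return abs(max(deviations)) + abs(min(deviations))


if __name__ == "__main__":
    # Hypothetical surface heights: four corners at 0 and a raised center at 5.
    surface = [(1, 1, 0.0), (1, -1, 0.0), (-1, 1, 0.0), (-1, -1, 0.0), (0, 0, 5.0)]
    print("warp:", warp(surface))
    # Hypothetical thickness readings in micrometers.
    print("TTV:", ttv([700.0, 700.8, 699.5]))
```

Both quantities are meant to be computed from measurements of the unclamped (free state) wafer; the units of the results follow the units of the input samples.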
To date, several different approaches have been tried by the semiconductor industry in an attempt to cost-effectively form a glass wafer that has both the desired TTV and warp attributes. One approach that has been used to meet the TTV and warp attributes is to polish the glass wafer. However, it is difficult to control both warp and TTV when polishing the glass wafer, as they frequently move counter to each other during the polishing process. Referring to FIG. 6 (PRIOR ART), there is a schematic of an exemplary glass wafer 602 that is polished as shown by line 604 to reduce warp; however, reducing the warp in this way also increases the TTV. This schematic is not to scale and has been provided so one can readily see how warp and TTV are inter-related.
In addition, the polishing process creates micro-cracks in the surface of the polished glass wafer, which leads to a reduced recycle rate of the polished glass carrier. Furthermore, the polishing process will not effectively scale to polish glass wafers with a 450 mm outer diameter, which will be needed if the silicon wafers to be thinned increase from the largest current 300 mm outer diameter to the future 450 mm outer diameter as predicted by ITRS. This is because the costs for larger glass wafers will scale geometrically with the size of the silicon wafer to be thinned: the thickness requirement will be the same, but maintaining the same TTV requirements with the polishing approach will be far more difficult. In addition, larger glass wafers mean that fewer glass wafers can be made per polishing run, which will also increase the costs. Accordingly, there is a need to address these shortcomings and other shortcomings to provide a glass wafer that can be effectively used to thin a silicon wafer.