Various processes and structures described in the related art address high-level system integration, such as Hoffman, et al., U.S. Pat. No. 6,033,931, one of a class of so-called “cube patents.” Hoffman, et al. discloses a three-dimensional microchip circuit assembly process that employs a three-layer dry film sandwich to prepare a stacked circuit cube. Bertin, et al., U.S. Pat. No. 5,563,086 discloses an integrated memory cube structure and method of fabrication in which stacked semiconductor memory chips are integrated by a controlling logic chip such that a more powerful memory architecture is defined with the functional appearance of a single higher-level memory chip. Carson, et al., U.S. Pat. No. 5,347,428 describes a computer module in which a stack of glued-together IC memory chips is structurally integrated with an IC microprocessor chip. Go, et al., U.S. Pat. No. 5,104,820 discloses a method of fabricating electronic circuitry units containing stacked IC layers having lead rerouting. Carson, et al., U.S. Pat. No. 4,646,128 discloses high-density electronic processing packages and structures and methods for manufacturing them.
The so-called “cube” structures described in these references are the result of a procedure also known in the art as chip stacking. This approach has several drawbacks, including, inter alia, an edge connection architecture that leads to signal delay, lower input/output (I/O) density, difficulty in powering the system through edge connections, and difficulty in cooling the system for high-power use.
Scaling of complementary metal-oxide-semiconductor (CMOS) transistor devices to smaller and smaller dimensions to enable larger circuit density is running into challenges, in that the performance of such ultra-small devices is not scaling favorably due to short channel effects in the device behavior, the difficulty in scaling channel strain induced mobility enhancements, and the like. Additionally, with the increased logic circuit density, two demands are becoming paramount to achieving peak performance: memory that logic circuits can access with minimal delay, and memory bandwidth sufficient to access a large segment of the memory at a given time. This in turn drives two requirements. First, additional memory needs to be located close to the logic circuitry with fast access time. Second, high bandwidth interconnects are required for the logic circuits to send information to and retrieve information from these memory cells on the chip, thus driving a huge increase in interconnect density and speed.
In this regard, 3D integration (3DI), which represents a process for device integration at a system level, is emerging as an option to bring heterogeneous devices together in close proximity so that they function as a homogeneous device. 3DI differs from the traditional 2D planar back-end-of-line (BEOL) integration in that 3DI adds an additional dimension, (Z) integration, which allows more devices from different sources, functionalities, and types to be integrated in close proximity to form a single assembly that can function as an integrated system.
The 3DI approach allows more device content (memory, for example) and faster access to the various devices (a shorter signal travel distance enabled through connections in the Z-direction) than the traditional 2D planar structure, which is restricted to X-Y wiring only. This is very beneficial for system level performance, since the amount of memory accessible within one clock-cycle distance can be greatly increased by shortening the physical distance between the processor and memory elements of the system. A clock-cycle distance is the distance that a signal can cover within one device clock cycle. For today's devices running at several GHz, this distance is reduced to only several millimeters. In a 2D configuration, more and more device content has to be placed outside this distance. Thus more clock cycles are needed to access that content during complex operations requiring a large amount of memory to be retrieved, processed, and stored back. This in turn translates into slower data processing speed at the system level, although the individual elements of the system (processor and memory) are capable of a higher speed of operation.
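The clock-cycle distance defined above can be estimated as an effective signal velocity divided by the clock frequency. The following is a minimal sketch only. The function name and the effective on-chip signal velocity of 1e7 m/s are assumptions introduced here for illustration and are not taken from the cited references; actual on-chip wiring is RC-limited and its effective velocity depends on the wiring level and design.

```python
def clock_cycle_distance_mm(clock_hz: float, signal_velocity_m_per_s: float) -> float:
    """Return the distance (in mm) a signal covers in one clock period."""
    if clock_hz <= 0 or signal_velocity_m_per_s <= 0:
        raise ValueError("clock frequency and signal velocity must be positive")
    period_s = 1.0 / clock_hz                        # duration of one clock cycle
    return signal_velocity_m_per_s * period_s * 1e3  # metres -> millimetres


# Hypothetical effective on-chip signal velocity (an assumption, not a measured value).
ASSUMED_VELOCITY_M_PER_S = 1.0e7

if __name__ == "__main__":
    for f_ghz in (1.0, 2.0, 4.0):
        d_mm = clock_cycle_distance_mm(f_ghz * 1e9, ASSUMED_VELOCITY_M_PER_S)
        print(f"{f_ghz:.0f} GHz: about {d_mm:.1f} mm per clock cycle")
```

Under this assumed velocity, doubling the clock frequency halves the clock-cycle distance, which is why more device content falls outside the single-cycle zone in a 2D layout as clock speeds rise.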
3D integration places the additional content, such as memory cells, in the third dimension (by Z-stacking), and therefore increases the amount of accessible device functionality within the critical single clock-cycle zone. In addition to more memory content within the clock-cycle zone, 3DI also allows additional and disparate components such as SiGe, III-V devices, optoelectronics, MEMS, and the like to be integrated as part of the system on a single assembly level. As these components are typically fabricated on different substrates using processes which may be incompatible with currently practiced silicon CMOS processes, they cannot be embedded into a silicon chip using 2D process methodologies. Thus such components tend to be integrated with CMOS using chip carriers or circuit boards as a means to interconnect them, which can limit the ability to fully utilize the capability of the components.
In terms of process format, 3DI can be further separated into wire bonded chip stacks and through silicon via (TSV) based chip stacks. Wire bonded 3DI mainly addresses lower input/output (I/O) density and count, typically dozens to hundreds of I/Os, and is used for systems where high content and lower power within a given footprint are the key considerations for the consumer markets. These stacks typically use wire bond connections at the periphery of the stacked wafers to achieve I/O connections. TSV 3DI, on the other hand, tends to focus on high performance systems where the I/O count exceeds several thousand and high speed (>2 GHz) processors are used, so that the system clock-cycle distance becomes a key requirement. In this high performance application space, through-Si via connections become a dominant factor by enabling shorter vertical connections that reduce the distance between the devices.
For most chip-level 3DI, chip stacking is used along with the provision of device I/O fanned out to edge leads. The edge leads are then connected with wire bonds to edge pads on a larger logic chip placed at the bottom of the chip stack. With such a connection scheme, chip-level connection typically enables more content than 2D, but the access time between devices is limited by the inductive and capacitive delays associated with the bonded wire connections and with routing signals to the edges of the chips. It is also difficult to conveniently deliver power to the various chips in the stacked assembly.
3DI with through-Si-via connection allows integration at the wafer level and offers a higher I/O density and a Z-connection with reduced parasitics compared to wire bond connections. Through-Si-via processes for 3DI can be further separated into via-first and via-last approaches. Via-first, as the name implies, comprises embedding the through vias in the parent wafer(s) before devices are fabricated. This normally allows a higher wiring content, since the I/Os do not go through the top device structure directly, thus allowing more area for wiring. The level-to-level Z-connections are typically made between capture pads on the through vias using metal compression bonds (using metals such as Cu-Cu, no solder, no adhesive), micro-C4 joining (solder, no adhesive), or transfer joining (T&J, a metal compression bond supplemented by adhesive joining for strength, referred to as hybrid bonding). Via-first connections typically enable an I/O density with a pitch as small as about 25 to about 50 microns (um). The assembly methods described above can also be used for individual chips, and are not restricted to wafer level 3DI schemes.
For 3DI with the via-last approach, the wiring density is typically reduced relative to the via-first approach, because some of the wiring channels are used by the thru-vias, which need to thread through the entire device stack to connect devices. However, since the thru-vias can be defined lithographically and filled, they are not limited by the 3D layer joining tolerances as in the case of the via-first approach. Via-last therefore normally can have a higher via density (under about 5 um pitch) than the via-first approach (about 25 to about 50 um pitch).
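The pitch figures quoted above can be converted into an approximate areal via density. The following is a minimal sketch, assuming for illustration that the vias sit on a uniform square grid at the stated pitch; the function name is hypothetical, and actual layouts may use other arrangements and reserve keep-out areas.

```python
def vias_per_mm2(pitch_um: float) -> float:
    """Approximate vias per square millimetre for a square grid at the given pitch."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    vias_per_mm = 1000.0 / pitch_um  # vias along one millimetre of a grid row
    return vias_per_mm ** 2          # square grid: rows times columns


if __name__ == "__main__":
    # Pitches quoted in the text: about 5 um (via-last), about 25 to 50 um (via-first).
    for label, pitch_um in (("via-last", 5.0), ("via-first", 25.0), ("via-first", 50.0)):
        print(f"{label:9s} pitch {pitch_um:4.0f} um: {vias_per_mm2(pitch_um):8.0f} vias/mm^2")
```

Because density scales with the inverse square of the pitch, under this grid assumption a 5 um pitch gives 25 times the via density of a 25 um pitch and 100 times that of a 50 um pitch.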
In all the 3DI schemes mentioned above, the cooling of the system is typically a difficult issue to resolve. The tighter stacking of devices generates more heat per unit volume while reducing heat dissipation. Provision of micro-channels for cooling on the bulk silicon substrate of the assembly in the final 3DI stack can provide enhanced cooling, but cannot achieve effective cooling of the upper layers when many device layers are stacked in the 3D system. Thus, for both chip stacking and through-Si connection approaches, the heating power density increases as the number of 3DI devices increases. This limits the number of 3DI devices stackable into a system, as heat dissipation becomes a road block to further increases in 3D content.
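The growth of heating power density with the number of stacked devices can be illustrated with simple arithmetic. The following is a minimal sketch, assuming that all layers share one footprint and that the heat of every layer must leave through that common footprint; the function name and the per-layer power and footprint values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def footprint_power_density_w_per_cm2(layer_powers_w: Sequence[float],
                                      footprint_cm2: float) -> float:
    """Total dissipated power of all stacked layers divided by the shared footprint area."""
    if footprint_cm2 <= 0:
        raise ValueError("footprint area must be positive")
    return sum(layer_powers_w) / footprint_cm2


if __name__ == "__main__":
    ASSUMED_LAYER_POWER_W = 10.0  # hypothetical power per device layer
    ASSUMED_FOOTPRINT_CM2 = 1.0   # hypothetical shared footprint
    for n_layers in (1, 2, 4, 8):
        powers = [ASSUMED_LAYER_POWER_W] * n_layers
        density = footprint_power_density_w_per_cm2(powers, ASSUMED_FOOTPRINT_CM2)
        print(f"{n_layers} layer(s): {density:.0f} W/cm^2")
```

With identical layers, the power density through the footprint grows linearly with the layer count while the area available for heat removal stays fixed.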
Another issue associated with 3DI is the electrostatic discharge (ESD) protection requirement for thru-Si connections. In any device design, ESD protection circuitry is provided and linked to an I/O net. This protects sensitive devices from ESD induced by the manufacturing process. Since each wafer in a 3DI stack needs ESD protection, the final 3DI circuits will have a total ESD circuit allocation as large as the sum of the ESD circuits of all the devices in the 3DI structure. This can become a large load as the number of devices increases, and it requires a large driver to access the 3DI circuits, which could significantly slow them down.
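The accumulation of ESD load across the stack can be sketched numerically. The following is a minimal sketch, assuming that the ESD circuit of each layer appears as a lumped capacitance on a shared I/O net and that the driver behaves as a simple resistance, so that the 50% step-response delay is ln(2)·R·C. The function names and the resistance and capacitance values are hypothetical and are not taken from any cited reference.

```python
import math
from typing import Sequence


def total_esd_capacitance_f(per_layer_esd_f: Sequence[float]) -> float:
    """ESD capacitance seen on a shared I/O net: the sum over all stacked layers."""
    return sum(per_layer_esd_f)


def rc_delay_s(driver_resistance_ohm: float, load_capacitance_f: float) -> float:
    """50% delay of a lumped RC stage driven by a step input: ln(2) * R * C."""
    return math.log(2.0) * driver_resistance_ohm * load_capacitance_f


if __name__ == "__main__":
    ASSUMED_ESD_PER_LAYER_F = 100e-15  # hypothetical 100 fF per layer
    ASSUMED_DRIVER_OHM = 1_000.0       # hypothetical 1 kilohm driver
    for n_layers in (1, 2, 4, 8):
        c_total = total_esd_capacitance_f([ASSUMED_ESD_PER_LAYER_F] * n_layers)
        delay_ps = rc_delay_s(ASSUMED_DRIVER_OHM, c_total) * 1e12
        print(f"{n_layers} layer(s): {c_total * 1e15:.0f} fF, about {delay_ps:.0f} ps")
```

In this simplified model, keeping the delay constant as layers are added requires lowering the driver resistance in proportion, which corresponds to the larger driver noted above.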
In general, the current thru-Si 3D wafer stacking processes and resultant devices present many processing related issues, e.g., thin Si construction (<100 um) requires stacking wafers one at a time to allow thru-Si vias; it is difficult to make vias less than 5 um in size and 10 um in pitch in devices employing Cu; thru-Si vias can be made from W, but W has a higher resistivity than Cu; thru-Si vias pass through the bonding interface, making bonding defects difficult to control; wafer stacks are limited due to bonding thermal cycles; the process is complex and introduces via yield and wafer yield issues; manufacturing involves long process cycles; and wafer level distortions are introduced. Accommodation of thru-Si vias requires significant changes in the layout of processor and memory chips, in addition to leading to a loss of usable silicon area available for device circuits. The chip cube approaches known in the current art avoid thru-Si via related concerns; however, they are limited in their ability to provide a high bandwidth for data communication in and out of the structure, and they have high parasitics because they depend on edge leads or wire bonds formed after assembly.