Two-dimensional (2D) ICs have long been known to limit the amount of bandwidth that can be delivered to a processor. For example, for data-intensive workloads such as key-value stores and distributed graph analytics, the working set is often too large to fit in the central processing unit (CPU) caches. As a result, off-chip memory bandwidth largely dictates the speed of the computation. On the other hand, the planar design of 2D ICs simplifies heat dissipation.
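The bandwidth limitation described above can be illustrated with a rough, roofline-style estimate. The following is a minimal sketch only: the function name `estimated_runtime_s` and every numeric value are hypothetical and are chosen purely for illustration, not taken from any particular processor or workload.

```python
def estimated_runtime_s(bytes_moved, operations, mem_bandwidth_bps, compute_ops_per_s):
    """Return (runtime in seconds, limiting resource) for a simple bound.

    The runtime is taken as the larger of the time needed to move the data
    and the time needed to perform the operations, assuming the two overlap.
    """
    memory_time = bytes_moved / mem_bandwidth_bps
    compute_time = operations / compute_ops_per_s
    if memory_time >= compute_time:
        return memory_time, "memory"
    return compute_time, "compute"


# Hypothetical workload whose working set does not fit in the CPU caches:
# 64 GB streamed from off-chip memory and 16 billion operations, on a
# hypothetical processor with 32 GB/s of bandwidth and 256 Gop/s of compute.
runtime, limiter = estimated_runtime_s(
    bytes_moved=64e9,
    operations=16e9,
    mem_bandwidth_bps=32e9,
    compute_ops_per_s=256e9,
)
print(runtime, limiter)
```

Under these assumed numbers the data movement takes far longer than the arithmetic, so adding compute capability would not shorten the runtime, whereas adding memory bandwidth would.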
Two-and-one-half-dimensional (2.5D) and three-dimensional (3D) ICs are composed of multiple die-bonded layers. In such ICs, memory bandwidth, for example, can be sizably increased at a significantly reduced energy cost. However, one of the principal challenges is that heat generated by the hot components of one layer heats the layers directly above and/or below it. This transfer of heat makes 3D design challenging because it limits the types of circuits that can be stacked on top of one another. For instance, in a processor-in-memory architecture, where the 3D memory is stacked directly on top of a separate logic-layer die, the ideal operating temperatures of the memory technology limit the kind of logic that can be incorporated in the logic layer. In particular, the complexity of the logic can be severely curtailed because a hot arithmetic logic unit (ALU) in the logic layer has the potential to corrupt the bits in the dynamic random-access memory (DRAM) layers stacked above it, which necessitates increasing the refresh rate of the DRAM.
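The relationship between temperature and DRAM refresh rate noted above can be sketched as a simple step model. This is an illustrative assumption rather than a description of any specific memory device: the function names, the 7.8 microsecond base interval, and the 85 degree Celsius threshold are example values of the kind used for commodity DRAM, and the model simply halves the refresh interval once the threshold is exceeded.

```python
def refresh_interval_us(temp_c, base_interval_us=7.8, threshold_c=85.0):
    """Return an assumed average refresh interval in microseconds.

    Illustrative step model: above the threshold temperature the interval
    is halved, which doubles the refresh rate. Values are assumptions.
    """
    if temp_c > threshold_c:
        return base_interval_us / 2
    return base_interval_us


def refresh_commands_per_s(temp_c):
    """Return the number of refresh commands issued per second."""
    return 1e6 / refresh_interval_us(temp_c)


# A DRAM layer heated by a hot logic component below it (hypothetical 90 C)
# compared with the same layer at a cooler 70 C.
cool_rate = refresh_commands_per_s(70.0)
hot_rate = refresh_commands_per_s(90.0)
print(hot_rate / cool_rate)
```

In this sketch the heated DRAM layer issues refresh commands at twice the rate of the cooler one, which consumes memory bandwidth and energy that would otherwise be available to the workload.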
Accordingly, to dissipate heat from 2.5D and 3D ICs, the common practice is to mount an air-cooled heatsink on the IC, to use liquid-cooled loops, or to immerse the IC in an electrically non-conductive liquid such as mineral oil. For example, data centers have recently adopted liquid cooling of servers as well as full-server immersion in a non-conductive fluid such as mineral oil. Although liquid cooling allows all components within a server chassis to be cooled as the fluid passes over them, it does not provide the pinpoint accuracy needed to choreograph the dispersion of heat at the scales present in 3D die-stacked chips.