Several types of devices are limited by the interconnect density available within a single monolithic device. Examples of such devices include switch matrix devices, field programmable gate arrays (FPGAs), traffic managers, etc. Such devices would benefit from a 3-dimensional (3D) implementation. For example, FPGAs typically include a configurable memory, such as an array of logic blocks, input/output (I/O) pads, and routing channels. The configurable memory is typically large in physical size and seldom used. A logic block may include one or more logic cells, each including lookup tables (LUTs) combined with multiplexers and other elements. These can be configured in various multiplexing ratios. FPGAs may also include multiplexers used to reroute wires. Multiplexer capability must be balanced with available wiring to avoid implementations that are unroutable. The cost and yield impact of adding layers to an FPGA is a current limitation on the performance of the FPGA. As such, FPGAs are held back by a lack of additional connection paths.
FPGAs further typically include an embedded memory, which requires high bandwidth. For example, all buffering for all processing that may be occurring in the FPGA may be done using the embedded memory. The embedded memory can be internal to a chip, or can sometimes be off-chip at the expense of increased latency and power consumption.
High bandwidth memory (HBM) is a 3D memory with multiple layers of die bonded together and vias extending through the silicon, allowing for the highest possible bandwidth with a density of memory adequate to address the bandwidth. The die at the bottom of the multiple layers handles all external communication, as well as all address and control communications with the layers above it.
Stacked architectures have evolved from single-chip chip scale packages (CSP) to 2-dimensional (2D) package on package (PoP) and stacked CSP, and then to thinner-profile fan-out wafer-level CSP. However, the ability to stack further has been limited by size and power constraints, as well as cost. Further limits on stacked architectures include bandwidth and latency between devices in these formats. For example, semiconductor devices may be fabricated in different nodes, representing different distances between identical features in the device. Earlier devices included legacy nodes of 32 nm, 28 nm, etc., and later devices include advanced nodes having much smaller critical dimensions. As the node decreases, the number of layers available increases. However, it is only possible to have a limited number of layers, and the layers in advanced nodes are expensive due to the depreciation of the newer equipment in the newer wafer foundries.