Emerging applications such as learning systems (e.g., deep neural networks) often need massive computational and memory capacity to train on different datasets and learn with high accuracy. Moreover, as applications such as high-performance computing and graphics operations become more data- and compute-intensive, energy efficiency and low latency become critical. A technique known as “processing in memory” can address these challenges by scheduling complex operations on the logic dies of a memory (e.g., dynamic random access memory (DRAM)), which provides additional compute capability in a lower-power technology process and closer to where the data resides.
High Bandwidth Memory (HBM) is a high-performance random access memory (RAM) interface for 3D-stacked memories (e.g., DRAM). It is often used in conjunction with high-performance graphics accelerators and network devices that access large datasets. HBM generally achieves higher bandwidth while using less power, in a substantially smaller form factor, than other DRAM technologies (e.g., double data rate fourth-generation synchronous dynamic random-access memory (DDR4) or double data rate type five synchronous graphics random-access memory (GDDR5)). This is often achieved by stacking a number (e.g., eight) of memory dies together. Frequently, this stack also includes an optional base die with a memory controller. The dies may be interconnected by through-silicon vias (TSVs) and microbumps.
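To make the bandwidth comparison concrete, the following minimal sketch computes peak interface bandwidth as bus width times per-pin data rate. The function name and the specific widths and data rates are illustrative assumptions that do not come from the text above; they are chosen to resemble a first-generation HBM stack, a single GDDR5 device, and a DDR4-3200 channel, and actual parts vary.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, gbit_per_s_per_pin: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) * per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * gbit_per_s_per_pin / 8.0


# Assumed, illustrative interface parameters: (bus width in bits, Gbit/s per pin).
INTERFACES = {
    "HBM stack (1024-bit, 1.0 Gbit/s/pin)": (1024, 1.0),
    "GDDR5 device (32-bit, 8.0 Gbit/s/pin)": (32, 8.0),
    "DDR4-3200 channel (64-bit, 3.2 Gbit/s/pin)": (64, 3.2),
}

for name, (width, rate) in INTERFACES.items():
    print(f"{name}: {peak_bandwidth_gb_per_s(width, rate):.1f} GB/s")
```

Under these assumptions, the wide but relatively slow stacked interface delivers several times the peak bandwidth of a narrow, fast one, which is consistent with the lower per-pin signaling rate (and hence lower power) described above.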