The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for latency-tolerant three-dimensional on-chip memory organization.
Three-dimensional (3D) chip stacking technology allows multiple layers of dynamic random access memory (DRAM) to be integrated into a processor chip. In 3D chip stacking, the fabrication process includes stacking integrated circuits (ICs) and connecting them with through-silicon vias (TSVs) for communication between layers.
Due to physical limitations and constraints, different DRAM layers may have different access latencies from the logic layer. Most modern microprocessors support cache lines much wider than the on-chip data bus. For instance, the data bus of a Power7™ processor is 16 bytes wide, while its cache line size is 128 bytes. Traditional memory organization therefore uses multiple cycles to read a cache line from a set of DRAM banks that are all at the same distance from the logic layer.
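By way of illustration only, the relationship between cache line size and data bus width can be sketched as a short calculation. The function name bus_transfers_per_line is hypothetical and is not part of any described embodiment, and the sketch assumes that each transfer moves exactly one bus width of data. With the figures given above (a 128-byte cache line and a 16-byte bus), a full cache line requires eight bus transfers.

```python
def bus_transfers_per_line(cache_line_bytes: int, bus_width_bytes: int) -> int:
    """Return how many bus-width transfers are needed to move one cache line.

    Illustrative assumption: each transfer carries exactly one bus width of data.
    """
    if cache_line_bytes <= 0 or bus_width_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partially filled final transfer still occupies the bus.
    return -(-cache_line_bytes // bus_width_bytes)


# Figures taken from the example above: 128-byte line, 16-byte bus.
transfers = bus_transfers_per_line(128, 16)
print(transfers)  # 8
```

If one transfer completes per cycle, which is an assumption of this sketch rather than a stated property of any particular processor, the eight transfers correspond to the multiple cycles noted above.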