Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
There is a trend toward large-scale chip multiprocessors that include a relatively large number of processor cores, with core counts in the hundreds or thousands envisioned in the near future. Such processors can greatly reduce processing time for applications that have high levels of concurrency, i.e., applications in which multiple computations can be executed simultaneously or in parallel with each other. However, as this trend continues, efficient use of all processor cores in high core-count chip multiprocessors may become more difficult, since threshold voltage can no longer be scaled down without exponentially increasing the static power consumption incurred due to leakage current in the chip multiprocessor. As a result, the power budget available per core in high core-count chip multiprocessors may decrease with each future technology generation. This situation may result in a phenomenon referred to as the “power wall,” “utility wall,” or “dark silicon,” in which an increasing fraction of a high core-count chip multiprocessor cannot be powered at full frequency, or cannot be powered on at all. Thus, performance improvements in such chip multiprocessors may be strongly contingent on energy efficiency, e.g., performance/watt or operations/joule.
Higher-capacity on-chip cache has also been explored as a way to improve chip performance. For example, the last-level cache on a multicore die has been implemented in dynamic random access memory (DRAM) rather than static random access memory (SRAM). DRAM may be six to eight times denser than SRAM, and a DRAM array can therefore have significantly greater capacity than a similarly sized SRAM array. This may be particularly advantageous in server chips, in which 50% or more of the die area can be dedicated to on-chip cache. Furthermore, three-dimensional stacking of DRAM chips in a processor chip package may allow one or more separate DRAM dies to be stacked on a logic processor die, thereby placing a very large amount of DRAM storage near the processor. Another technology that can provide high-capacity on-chip cache is magneto-resistive RAM (MRAM). DRAM is a volatile memory, whereas MRAM, which is ordinarily non-volatile, may in some cases be designed to be semi-volatile in order to lower write latency and energy.