I. Field of the Disclosure
The technology of the disclosure relates generally to dynamic random access memory (DRAM) management, and, in particular, to management of DRAM caches.
II. Background
The advent of die-stacked integrated circuits (ICs) composed of multiple stacked dies that are vertically interconnected has enabled the development of die-stacked DRAM. Die-stacked DRAMs may be used to implement what is referred to herein as “high-bandwidth memory.” High-bandwidth memory provides greater bandwidth than conventional system memory DRAM, while providing similar access latency. In some implementations, high-bandwidth memory may also be “near” memory, or memory that is physically located closer to a memory interface than other system memory DRAM. High-bandwidth memory may be used to implement a DRAM cache to store frequently accessed data that was previously read from a system memory DRAM and evicted from a higher level cache, such as a Level 3 (L3) cache, as a non-limiting example. Providing a DRAM cache in high-bandwidth memory may reduce memory contention on the system memory DRAM, and thus, in effect, increase overall memory bandwidth.
However, management of a DRAM cache in a high-bandwidth memory can pose challenges. The DRAM cache may be orders of magnitude smaller in size than system memory DRAM. Thus, because the DRAM cache can only store a subset of the data in the system memory DRAM, efficient use of the DRAM cache depends on intelligent selection of memory addresses to be stored. Accordingly, a DRAM cache management mechanism should be capable of determining which memory addresses should be selectively installed in the DRAM cache, and should be further capable of determining when the memory addresses should be installed in and/or evicted from the DRAM cache. It may also be desirable for a DRAM cache management mechanism to minimize impact on access latency for the DRAM cache, and to be scalable with respect to the DRAM cache size and/or the system memory DRAM size.
Some approaches to DRAM cache management utilize a cache for storing tags corresponding to cached memory addresses, similar to how conventional caches may be managed. Under one such approach, all of the tags associated with a DRAM cache are stored in static random access memory (SRAM) on a compute die separate from the high-bandwidth memory. However, this approach may not scale sufficiently with the DRAM cache size, as a larger DRAM cache requires more tag storage, which may occupy more SRAM area than is desirable and/or may be too large to store in SRAM at all. Another approach involves locating the tags within the DRAM cache itself, instead of within the SRAM on the compute die, and using a hit/miss predictor to determine whether a given memory address is stored within the DRAM cache. While this latter approach minimizes the usage of SRAM in the compute die, any incorrect prediction results in data being read from the system memory DRAM. For example, if the hit/miss predictor incorrectly predicts that the memory address is located in the DRAM cache, a latency penalty is incurred from an unnecessary read to the DRAM cache before the memory address is read from the system memory DRAM. Conversely, if the hit/miss predictor incorrectly predicts that the memory address is not located in the DRAM cache, an opportunity to avoid an unnecessary read to the system memory DRAM may be wasted. Unnecessary additional reads incur additional access latency, which may negate any performance improvements resulting from using the DRAM cache.
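As a non-limiting illustration of the two tradeoffs described above, the following minimal Python sketch estimates the SRAM needed to hold one tag per cache line, and the latency of a single read under a hit/miss predictor when the tags reside in the DRAM cache. The function names, the 64-byte line size, the 32-bit tag entry, and the 50 ns latencies are all hypothetical values assumed for illustration only; they are not taken from the disclosure or from any particular memory device.

```python
def sram_tag_bytes(cache_bytes, line_bytes=64, tag_bits=32):
    """Estimate SRAM needed to keep one tag entry per DRAM cache line.

    line_bytes and tag_bits are assumed, illustrative values.
    """
    lines = cache_bytes // line_bytes
    return lines * tag_bits // 8


def access_latency_ns(in_cache, predicted_hit, cache_ns=50, memory_ns=50):
    """Latency of one read when a hit/miss predictor decides where to look first.

    cache_ns and memory_ns are assumed, illustrative latencies. They are set
    equal because high-bandwidth memory is described as having access latency
    similar to system memory DRAM.
    """
    if predicted_hit:
        # The DRAM cache is read first. On a wrong prediction, the system
        # memory DRAM read is added on top of the wasted DRAM cache read.
        return cache_ns if in_cache else cache_ns + memory_ns
    # Predicted miss: the read goes to system memory DRAM directly, even if
    # the data was actually present in the DRAM cache.
    return memory_ns


if __name__ == "__main__":
    one_gib = 1 << 30
    print(sram_tag_bytes(one_gib) // (1 << 20), "MiB of tags for a 1 GiB cache")
    print(access_latency_ns(in_cache=False, predicted_hit=True), "ns on a false hit")
```

Under these assumed values, a 1 GiB DRAM cache would need 64 MiB of tag storage, and the tag storage grows linearly with the cache size. A false hit prediction costs the sum of both reads, whereas a false miss prediction costs no extra latency in this simplified model but sends a read to the system memory DRAM that the DRAM cache could have served.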
Thus, it is desirable to provide scalable DRAM cache management to improve memory bandwidth while minimizing SRAM consumption and latency penalties.