I. Field of the Disclosure
The technology of the disclosure relates generally to computer memory systems, and, in particular, to memory controllers in computer memory systems for providing central processing units (CPUs) with a memory access interface to memory.
II. Background
Microprocessors perform computational tasks for a wide variety of applications. A typical microprocessor application includes one or more central processing units (CPUs) that execute software instructions. The software instructions may instruct a CPU to fetch data from a location in memory, perform one or more CPU operations using the fetched data, and generate a result. The result may then be stored in memory. As non-limiting examples, this memory can be a cache local to the CPU, a shared local cache among CPUs in a CPU block, a shared cache among multiple CPU blocks, or a main memory of the microprocessor.
In this regard, FIG. 1 is a schematic diagram of an exemplary system-on-a-chip (SoC) 100 that includes a CPU-based system 102. The CPU-based system 102 includes a plurality of CPU blocks 104(0)-104(N) in this example, wherein ‘N’ may be any number, depending on how many CPU blocks 104(0)-104(N) are desired. In the example of FIG. 1, each of the CPU blocks 104(0)-104(N) contains two (2) CPUs 106(0), 106(1). The CPU blocks 104(0)-104(N) further contain shared Level 2 (L2) caches 108(0)-108(N), respectively. A system cache 110 (e.g., a Level 3 (L3) cache) is also provided for storing cached data that is used by any of, or shared among, the CPU blocks 104(0)-104(N). An internal system bus 112 is provided to enable each of the CPU blocks 104(0)-104(N) to access the system cache 110, as well as other shared resources. Other shared resources accessed by the CPU blocks 104(0)-104(N) through the internal system bus 112 may include a memory controller 114 for accessing a main, external memory (double data rate (DDR) dynamic random access memory (DRAM), as a non-limiting example), peripherals 116, other storage 118, a peripheral component interconnect (PCI) express (PCI-e) interface 120, a direct memory access (DMA) controller 122, and/or an integrated memory controller (IMC) 124.
As CPU-based applications executing in the CPU-based system 102 in FIG. 1 increase in complexity and performance, limitations on memory bandwidth may impose a constraint on the CPU-based system 102. If accesses to external memory reach memory bandwidth limits, the memory controller 114 of the CPU-based system 102 may be forced to queue memory access requests. Such queueing of memory access requests may increase the latency of memory accesses, which in turn may decrease the performance of the CPU-based system 102.
Memory bandwidth savings may be realized by employing memory bandwidth compression schemes to potentially reduce the bandwidth consumed by a given memory access. In particular, some memory bandwidth compression schemes may make use of a master directory to track a compression status for each memory line of a system memory. However, the master directory used by such memory bandwidth compression schemes may consume an unacceptably large portion of the system memory, may require implementation of additional logic, and/or may incur additional latency for any memory access request. Other memory bandwidth compression schemes may use spare bits in an error correcting code (ECC) field to indicate a compression status for each memory line of the system memory. While such mechanisms may avoid the need for a master directory, they may sacrifice some memory access performance, as the compression status for each memory line of the system memory may not be determined until after an initial read of the memory line is complete. Thus, it is desirable to provide a memory bandwidth compression mechanism that avoids the drawbacks of using a master directory while providing improved memory access performance.
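As a non-limiting illustration, the trade-off between the two conventional approaches described above may be sketched as a toy model. The sketch is not taken from any particular implementation. The 128-byte memory line size, the two status bits per memory line, the names Memory, Line, and directory_bytes, and the assumption that a read relying on ECC spare bits fetches a full memory line before the compression status is known are all hypothetical and are chosen only for illustration.

```python
from dataclasses import dataclass, field

LINE_BYTES = 128  # assumed memory line size (hypothetical)


def directory_bytes(memory_bytes, line_bytes, status_bits):
    """System memory consumed by a master directory holding
    status_bits of compression status per memory line."""
    lines = memory_bytes // line_bytes
    return (lines * status_bits + 7) // 8  # round up to whole bytes


@dataclass
class Line:
    payload: bytes    # stored bytes (possibly compressed)
    compressed: bool  # status as it might be kept in spare ECC bits


@dataclass
class Memory:
    lines: dict = field(default_factory=dict)      # address -> Line
    directory: dict = field(default_factory=dict)  # address -> compressed?
    accesses: int = 0                              # memory accesses issued

    def write(self, addr, payload, compressed):
        self.lines[addr] = Line(payload, compressed)
        self.directory[addr] = compressed

    def read_with_master_directory(self, addr):
        # Access 1: look up the compression status in the master directory.
        self.accesses += 1
        compressed = self.directory[addr]
        # Access 2: read the line; the status is already known, so only
        # the compressed size needs to be transferred.
        self.accesses += 1
        size = len(self.lines[addr].payload) if compressed else LINE_BYTES
        return compressed, size

    def read_with_ecc_bits(self, addr):
        # Single access, but the status arrives with the line itself, so
        # this sketch assumes the full line is transferred up front.
        self.accesses += 1
        return self.lines[addr].compressed, LINE_BYTES


mem = Memory()
mem.write(0x1000, b"\x00" * 32, compressed=True)
dir_result = mem.read_with_master_directory(0x1000)
dir_accesses = mem.accesses
ecc_result = mem.read_with_ecc_bits(0x1000)
ecc_accesses = mem.accesses - dir_accesses
```

In this model, the master directory read issues two accesses but transfers only the compressed bytes, whereas the ECC-based read issues one access but learns the compression status only after the memory line has been read. The directory_bytes helper shows how the storage cost of a master directory grows linearly with the number of memory lines tracked.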