1. Technical Field
The present invention relates to data processing and, in particular, to a data processing system having an improved shared cache system.
2. Description of the Related Art
Computer systems generally include one or more processors and system memory, which may be implemented, for example, with Dynamic Random Access Memory (DRAM). Because of the disparate operating frequencies of the processor(s) and DRAM, computer systems commonly implement between the processor(s) and system memory one or more levels of high-speed cache memory, which may be implemented, for example, in Static Random Access Memory (SRAM). The cache memory holds copies of instructions or data previously fetched from system memory and provides access to them at significantly lower latency than the system memory. Consequently, when a processor needs to access data or instructions, the processor first checks whether the data or instructions are present in the cache memory. If so, the processor accesses the data or instructions from the cache rather than from system memory, thus reducing access latency and improving throughput.
Modern cache memories can serve multiple processor cores or hardware threads of execution and may have to handle many access requests at a given time. To ensure proper operation, the access requests cannot be permitted to interfere with one another by, for example, requesting the same memory address and, hence, the same cache entry. To prevent this, prior cache systems have compared incoming request addresses with those of in-flight requests being processed. In particular, each in-flight request is assigned a dedicated bank of latches, and each incoming request address is compared against each in-flight address held in the latches.
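The prior-art comparison scheme described above can be illustrated with a minimal software sketch. It is a model under stated assumptions, not a description of any particular cache design: the names InFlightTracker, try_accept, retire and num_banks are hypothetical, addresses are treated as plain integers, and the sequential loop stands in for comparators that hardware would operate in parallel.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InFlightTracker:
    """Hypothetical model: one bank of address latches per in-flight request."""
    num_banks: int
    banks: List[Optional[int]] = field(init=False)

    def __post_init__(self) -> None:
        # None marks a free bank; an int is the latched request address.
        self.banks = [None] * self.num_banks

    def collides(self, addr: int) -> bool:
        # The incoming address is compared against every occupied bank,
        # so the number of comparisons grows with the number of banks.
        return any(held == addr for held in self.banks if held is not None)

    def try_accept(self, addr: int) -> Optional[int]:
        # Reject the request if it targets an address already in flight.
        if self.collides(addr):
            return None
        # Otherwise latch the address in the first free bank, if any.
        for index, held in enumerate(self.banks):
            if held is None:
                self.banks[index] = addr
                return index
        # All banks are occupied (assumption: the request is simply refused).
        return None

    def retire(self, bank: int) -> None:
        # Free the bank once its request has completed.
        self.banks[bank] = None
```

In this sketch, a tracker with two banks accepts a request for address 0x100, refuses a second request for 0x100 while the first is in flight, and accepts it again once the first is retired. Each additional in-flight request adds one bank of latches and one comparison per incoming request.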
Next-generation shared caches will be required to process hundreds or even thousands of concurrently executing transactions. Current cache designs, however, cannot scale to such large numbers of concurrent requests. That is, extension of current practice to handle such large numbers of concurrent requests requires too many latches and comparators and too much die space to be practical for high-throughput shared memory systems.