1. Field of the Invention
The present invention relates generally to the field of cache systems and, more specifically, to using a data cache array as a DRAM load/store buffer.
2. Description of the Related Art
One element of a memory subsystem within certain processing units is a Level 2 Cache memory (referred to herein as “L2 cache”). The L2 cache is a large on-chip cache memory that serves as an intermediate point between an external memory (e.g., frame buffer memory) and the internal clients of the memory subsystem (referred to herein as the “clients”). The L2 cache temporarily stores data that the clients are reading from and writing to the external memory, which is often a DRAM.
In such a system, coherency has to be maintained between the data present in the L2 cache and the data stored in the external memory. “Dirty data,” that is, data transferred from a client to the L2 cache during a write operation, needs to remain on-chip until the data has been “cleaned” by replicating the data in the external memory. During a read operation, memory space is allocated on-chip to receive the result data from the external memory. Applications that require high data throughput, such as graphics processing, require considerable amounts of storage space for dirty data and read returns. If a system lacks sufficient storage for these operations, then overall performance is degraded.
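The dirty-data lifecycle described above can be sketched as follows. This is a minimal illustrative model, not the claimed implementation; the names `CacheLine`, `L2Cache`, and `ExternalMemory` are assumptions introduced for exposition.

```python
from dataclasses import dataclass

@dataclass
class CacheLine:
    tag: int
    data: bytes = b""
    dirty: bool = False  # set on a client write, cleared once replicated

class ExternalMemory:
    """Stands in for the DRAM backing store (hypothetical)."""
    def __init__(self):
        self.store = {}

    def write(self, tag, data):
        self.store[tag] = data

class L2Cache:
    def __init__(self, external):
        self.lines = {}
        self.external = external

    def client_write(self, tag, data):
        # Dirty data must stay on-chip until it has been cleaned.
        self.lines[tag] = CacheLine(tag, data, dirty=True)

    def clean(self, tag):
        # "Cleaning": replicate the dirty data in the external memory,
        # after which the line may be evicted without data loss.
        line = self.lines[tag]
        if line.dirty:
            self.external.write(tag, line.data)
            line.dirty = False

ext = ExternalMemory()
l2 = L2Cache(ext)
l2.client_write(0x40, b"pixels")
assert l2.lines[0x40].dirty and 0x40 not in ext.store  # still only on-chip
l2.clean(0x40)
assert not l2.lines[0x40].dirty and ext.store[0x40] == b"pixels"
```

The sketch captures only the invariant at issue: a line written by a client is unsafe to discard until its data has been replicated in the external memory.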
One approach to address these problems is to use distinct load/store data buffers, separate from the main L2 cache, that act as holding areas for data being transmitted to or received from the external memory. These data buffers are typically FIFO (first-in-first-out) stores and service data reads and writes in the order the operations are received from the L2 cache or the external memory. When the L2 cache receives a read request, the L2 cache allocates memory space in the load data buffer to receive the result data from the external memory. The load data buffer stores the result data until the L2 cache is ready to receive the result data. In the case of a write operation, the L2 cache receives data from a write client and some time later copies the data to the store buffer in preparation for transfer to the backing store. For a write-through cache, this copy happens immediately; for a write-back cache, it happens upon eviction. In either case, the write data buffer holds the dirty data until the external memory has stored the data.
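The separate FIFO load and store buffers described above can be modeled in a few lines. This is a hedged sketch of the general technique, not the patented design; the class name, entry format, and buffer depth are assumptions.

```python
from collections import deque

class LoadStoreBuffers:
    """Illustrative FIFO load/store buffers sitting between an L2 cache
    and external DRAM (hypothetical names and structure)."""
    def __init__(self):
        self.load_buffer = deque()   # read returns arriving from DRAM
        self.store_buffer = deque()  # dirty data headed to DRAM

    # Read path: the L2 cache allocates a slot before the DRAM responds,
    # the DRAM fills it, and the L2 cache drains results strictly in order.
    def allocate_load(self, tag):
        self.load_buffer.append({"tag": tag, "data": None})

    def dram_fill(self, tag, data):
        for entry in self.load_buffer:
            if entry["tag"] == tag and entry["data"] is None:
                entry["data"] = data
                return

    def drain_load(self):
        # FIFO: only the oldest entry may leave, and only once filled.
        if self.load_buffer and self.load_buffer[0]["data"] is not None:
            return self.load_buffer.popleft()
        return None

    # Write path: the L2 cache copies dirty data in; the DRAM drains it
    # in arrival order, after which the data is considered clean.
    def push_store(self, tag, data):
        self.store_buffer.append({"tag": tag, "data": data})

    def drain_store(self):
        return self.store_buffer.popleft() if self.store_buffer else None

bufs = LoadStoreBuffers()
bufs.allocate_load(0x10)          # space reserved ahead of the read return
assert bufs.drain_load() is None  # nothing to drain until DRAM fills it
bufs.dram_fill(0x10, b"abcd")
entry = bufs.drain_load()
assert entry["tag"] == 0x10 and entry["data"] == b"abcd"
```

Because each buffer is a dedicated structure, its capacity must be provisioned for peak traffic, which is the sizing drawback discussed next.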
One drawback to this approach is that the amount of dedicated space allocated to the read and write data buffers is proportional to the throughput of data in the system. Since many systems, like graphics processing systems, require very high throughput, implementing intermediate read and write buffers in such systems consumes a large amount of memory space, making such a solution undesirable.
As the foregoing illustrates, what is needed in the art is an effective data caching mechanism for loading and storing data from and to external memory.