Computer sub-systems such as three-dimensional (3D) graphics processors often include their own memory storage capacity implemented in one or more local memory caches (“local caches”). The local caches may store data such as the color and depth (Z) of a pixel or texture element (texel) and additionally may provide a storage queue for memory requests. Data stored in the local cache may be obtained via read requests, which are requests to extract data from a particular location in system memory. Data may also be expelled from the local cache back to system memory via write requests or write back requests, which transfer the data to a particular location in system memory. Because the storage capacity of the local caches is generally limited, a request to read data from system memory (or a request for a future write) triggers a complementary request to evict (write back) data from the cache to make room for the data to be imported from system memory. Typically, the evicted data contains updated information that must be sent to system memory for storage. Therefore, in such systems, read request cycles and requests for future writes are normally paired consecutively with write back cycles in the memory system bus traffic.
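By way of illustration only, the pairing of read cycles with write back cycles described above may be sketched as follows. The class name, the capacity of two entries, the oldest-first eviction order, and the address values are hypothetical choices made for the sketch and are not taken from any particular system.

```python
from collections import OrderedDict


class TinyWriteBackCache:
    """Hypothetical fixed-capacity cache that logs the bus traffic it generates."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> dirty flag, least recently used first
        self.bus_log = []           # ("R" or "W", address) in issue order

    def access(self, address, write=False):
        if address in self.lines:
            # Hit: no bus traffic; update the dirty flag and the recency order.
            self.lines[address] = self.lines[address] or write
            self.lines.move_to_end(address)
            return
        if len(self.lines) >= self.capacity:
            # Cache is full: evict the least recently used entry to make room.
            victim, dirty = self.lines.popitem(last=False)
            if dirty:
                # The evicted data holds updates, so it is written back.
                self.bus_log.append(("W", victim))
        # Import the requested data from system memory.
        self.bus_log.append(("R", address))
        self.lines[address] = write


cache = TinyWriteBackCache(capacity=2)
for addr in (0, 1, 2, 3, 4):
    cache.access(addr, write=True)  # each access is a request for a future write

kinds = "".join(kind for kind, _ in cache.bus_log)
print(kinds)  # RRWRWRWR
```

Once the two entries are filled, every further miss in this sketch produces a write back immediately followed by a read, which is the consecutive pairing seen in the bus traffic.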
The sequential flow of read and write pairings causes sub-optimal performance because the system bus that delivers the read/write requests to memory incurs a period of down time whenever it switches between a read cycle and a write cycle. This down time is known as a read-to-write bubble or a write-to-read bubble, depending on the direction of the switch. FIG. 1 illustrates the bubbles between an exemplary series of read and write requests. The performance penalty for a bubble, measured in memory cycles, varies depending on the type of dynamic random access memory (DRAM) implemented on the system bus, such as RDRAM (Rambus DRAM) or SDRAM (synchronous DRAM).
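The cost of the bubbles may be estimated, as a rough sketch, by counting the direction changes in a series of requests. The function names and the figures of four cycles per request and two cycles per bubble are assumptions chosen for illustration; the actual penalty depends on the memory type, as noted above.

```python
def count_bubbles(sequence):
    """Count the read-to-write and write-to-read switches in a string of
    bus cycles, where "R" is a read cycle and "W" is a write cycle."""
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)


def total_cycles(sequence, cycles_per_request=4, bubble_penalty=2):
    """Total memory cycles for the sequence, including bubble down time.

    Both default values are hypothetical and chosen only for illustration.
    """
    return len(sequence) * cycles_per_request + count_bubbles(sequence) * bubble_penalty


paired = "RWRWRWRW"  # reads consecutively paired with write backs
print(count_bubbles(paired))  # 7
print(total_cycles(paired))   # 46, of which 14 cycles are bubble down time
```

Under these assumed figures, eight paired requests spend 14 of 46 cycles in bubbles, since every adjacent pair of cycles switches the bus direction.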
An additional time lag is introduced by the local cache when it is forced to wait for a write back eviction to execute before issuing a read request. FIG. 2 illustrates a conventional sequence for a read request in a memory request allocation system 2. A read allocator 10 sends a read allocation request (step 1) to local cache 20 to initiate a write back eviction process. The local cache 20, in turn, prepares an entry for the data that will be retrieved upon fulfillment of the read request by evicting cached data and sending a write request to memory (step 2). The local cache 20 then waits for acknowledgment that the write request has been executed. Upon receiving that acknowledgment (step 3), the local cache 20 issues the read request to memory (step 4).
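The four steps of the conventional sequence may be traced with the minimal sketch below. The function name and the write back latency of ten cycles are hypothetical; the sketch only shows that the read cannot issue until the acknowledgment of step 3 arrives.

```python
def conventional_read_sequence(writeback_latency):
    """Trace steps 1-4 of the conventional sequence for one read request.

    writeback_latency is a hypothetical number of cycles between sending the
    write request (step 2) and receiving its acknowledgment (step 3).
    Returns the event timeline and the cycle at which the read issues.
    """
    now = 0
    timeline = [
        (now, "step 1: read allocator 10 sends read allocation request to local cache 20"),
        (now, "step 2: local cache 20 evicts cached data and sends write request to memory"),
    ]
    # The local cache waits here; no read may issue during this interval.
    now += writeback_latency
    timeline.append((now, "step 3: local cache 20 receives acknowledgment of the write request"))
    timeline.append((now, "step 4: read request issues to memory"))
    return timeline, now


timeline, read_issued_at = conventional_read_sequence(writeback_latency=10)
for cycle, event in timeline:
    print(f"cycle {cycle:2d}: {event}")
print("read delayed by", read_issued_at, "cycles")
```

In this sketch the delay before the read issues equals the assumed write back latency, which is the additional time lag attributed to the local cache.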