Conventional data processing systems typically rely on multi-level memory architectures to optimize access to data by microprocessors and other processing units. With a multi-level memory architecture, multiple levels of memories are provided, with each successive level of memory typically providing greater storage space but increased access latency. The highest level memories, which provide the lowest access latency but the smallest amount of storage space, are commonly referred to as cache memories, and are often integrated directly into a processing unit or disposed on the same integrated circuit device, or chip. One common memory architecture used in a data processing system that includes multiple processing units, for example, supplements a single shared main memory with one or more L1 caches and an L2 cache dedicated to each processing unit, and an L3 cache that is shared by multiple processing units.
A multi-level memory architecture relies on spatial and temporal locality of data to minimize memory access latencies. Put another way, processing units typically and repeatedly access data located in similar regions of a memory address space. Therefore, by maintaining data that a particular processing unit needs to use in the highest level cache accessible by that processing unit, the amount of time required to retrieve that data is minimized. In contrast, whenever a processing unit attempts to access data that is not located in a cache memory, that access attempt is considered to “miss” the cache memory, and a performance penalty is incurred as the data is retrieved from a lower level of memory. Subsequent accesses to that data will then typically “hit” the cache memory, and the amount of time required to access the data will be reduced.
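By way of illustration only, the hit and miss behavior described above may be sketched as follows. The class name TinyCache, the 64-byte block size, the least-recently-used replacement, and the latency figures are hypothetical values assumed for this sketch and do not describe any particular processing unit.

```python
from collections import OrderedDict

# Hypothetical latencies in cycles, assumed only for illustration.
CACHE_HIT_CYCLES = 4
MEMORY_CYCLES = 200


class TinyCache:
    """A small fully associative cache that counts hits and misses."""

    def __init__(self, capacity_lines, line_size=64):
        self.capacity_lines = capacity_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # block number -> True, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return the assumed cost, in cycles, of accessing a byte address."""
        line = address // self.line_size
        if line in self.lines:
            # Hit: the block is already resident, so only cache latency is paid.
            self.lines.move_to_end(line)
            self.hits += 1
            return CACHE_HIT_CYCLES
        # Miss: the block is fetched from the lower level of memory.
        self.misses += 1
        if len(self.lines) >= self.capacity_lines:
            self.lines.popitem(last=False)  # drop the least recently used block
        self.lines[line] = True
        return CACHE_HIT_CYCLES + MEMORY_CYCLES


cache = TinyCache(capacity_lines=4)
first = cache.access(0x1000)   # miss: data not yet in the cache
second = cache.access(0x1008)  # hit: nearby address (spatial locality)
third = cache.access(0x1000)   # hit: repeated address (temporal locality)
```

In this sketch only the first access pays the assumed memory penalty; the two subsequent accesses to the same region are served at cache latency.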
To facilitate the movement of data between different levels of memory, the data is typically organized into fixed size segments referred to as “cache lines.” Given the limited storage space in a cache memory, a cache memory desirably only stores cache lines that are currently being used, or likely to be used in the near future, by a processing unit. Moreover, whenever a new cache line needs to be stored in a cache memory that is already full, a cache line that is already stored in the cache memory will need to be replaced (“evicted”). Generally, an evicted cache line is written back to a main memory and/or one or more lower levels of cache memory if the cache line has been modified by the processing unit. Otherwise, if the evicted cache line has not been modified, the evicted cache line may simply be discarded.
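The replacement behavior described above may be sketched, under stated assumptions, as follows. The class name WriteBackCache and the least-recently-used victim selection are assumptions made for this sketch; no particular replacement policy is implied by the foregoing description.

```python
from collections import OrderedDict


class WriteBackCache:
    """Tracks which evicted cache lines are written back and which are discarded."""

    def __init__(self, capacity_lines):
        self.capacity_lines = capacity_lines
        self.lines = OrderedDict()  # cache line address -> modified flag
        self.written_back = []      # evicted lines that had been modified
        self.discarded = []         # evicted lines that were unmodified

    def _evict_one(self):
        # Assumed policy: evict the least recently used cache line.
        victim, modified = self.lines.popitem(last=False)
        if modified:
            self.written_back.append(victim)
        else:
            self.discarded.append(victim)

    def access(self, line, write=False):
        if line in self.lines:
            # Already resident: keep any earlier modification, refresh recency.
            modified = self.lines.pop(line) or write
        else:
            if len(self.lines) >= self.capacity_lines:
                self._evict_one()
            modified = write
        self.lines[line] = modified


cache = WriteBackCache(capacity_lines=2)
cache.access(0xA, write=True)  # line A is modified by the processing unit
cache.access(0xB)              # line B is only read
cache.access(0xC)              # cache full: A is evicted and written back
cache.access(0xD)              # cache full: B is evicted and simply discarded
```

Only the modified line incurs a write to a lower level of memory on eviction; the unmodified line is dropped because an identical copy already exists at that lower level.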
Conventional data processing systems often support an ability to stream data from an I/O device or other such hardware resource. This streamed data is typically communicated from the hardware device to a main memory of the data processing system over a shared bus, and one or more processing units then retrieve the data from the memory as it is needed. Particularly when the processing units are also coupled to the same shared bus, the communication of data from the I/O device to the main memory, and then from the main memory to the processing units, can occupy excessive bandwidth on the bus, and lead to decreased performance.
In an effort to alleviate these concerns, some conventional data processing systems support operations referred to as cache inject operations, where data that is being communicated from a hardware resource to a main memory is concurrently “injected” into a cache memory for immediate use by an associated processing unit. By doing so, the incoming data may be accessed and processed more quickly by the processing unit, while often reducing traffic on the shared bus.
However, while injecting data into a cache memory of a processing unit may increase processing efficiency in some implementations, cache inject operations are often somewhat speculative in nature, since the associated cache lines are not specifically requested by a processing unit as part of an attempt to access them. Given the limited amount of storage space in cache memories, and competition for this limited storage space, it has been found that in some instances injected cache lines may be prematurely evicted from the cache memory before they are processed. As a result, processing units that later attempt to access such prematurely evicted cache lines will experience cache misses, and thus lower performance, as the cache lines are re-retrieved from the main memory.
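The premature eviction scenario described above may be illustrated with the following minimal sketch. The class name InjectableCache, the line labels, the four-line capacity, and the least-recently-used replacement are all hypothetical and are assumed only to make the effect visible.

```python
from collections import OrderedDict


class InjectableCache:
    """A small cache that accepts both demand loads and injected cache lines."""

    def __init__(self, capacity_lines):
        self.capacity_lines = capacity_lines
        self.lines = OrderedDict()  # cache line label -> True, oldest first
        self.misses = 0

    def _install(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.capacity_lines:
            self.lines.popitem(last=False)  # assumed LRU victim selection
        self.lines[line] = True

    def inject(self, line):
        """Place a line in the cache without a request from the processing unit."""
        self._install(line)

    def load(self, line):
        """Demand access by the processing unit; return True on a hit."""
        hit = line in self.lines
        if not hit:
            self.misses += 1
        self._install(line)
        return hit


cache = InjectableCache(capacity_lines=4)
cache.inject("io-0")  # injected ahead of use
cache.inject("io-1")
for line in ["w0", "w1", "w2", "w3"]:
    cache.load(line)  # unrelated working set competes for the same space
injected_hit = cache.load("io-0")  # the injected line has already been evicted
```

Because the injected lines are the oldest entries when the competing working set arrives, they are selected for eviction first in this sketch, and the later access to an injected line misses and must be served from the main memory.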
Therefore, a continuing need exists in the art for a manner of managing cache injection in cache memory of a processing unit to reduce the likelihood of premature eviction of injected cache lines.