This application relates generally to processing systems, and, more particularly, to controlling the aggressiveness of prefetchers in processing systems.
Many processing devices utilize caches to reduce the average time required to access information stored in a memory. A cache is a smaller and faster memory that stores copies of instructions or data that are expected to be used relatively frequently. For example, central processing units (CPUs) are generally associated with a cache or a hierarchy of cache memory elements. Other processors, such as graphics processing units or accelerated processing units, can also implement cache systems. Instructions or data that are expected to be used by the CPU are moved from (relatively large and slow) main memory into the cache. When the CPU needs to read or write a location in the main memory, it first checks to see whether a copy of the desired memory location is included in the cache memory. If this location is included in the cache (a cache hit), then the CPU can perform the read or write operation on the copy in the cache memory location. If this location is not included in the cache (a cache miss), then the CPU needs to access the information stored in the main memory and, in some cases, the information can be copied from the main memory and added to the cache. Proper configuration and operation of the cache can reduce the average latency of memory accesses to a value below the main memory latency and close to the cache access latency.
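For illustration only, the hit and miss behavior described above can be sketched with a toy direct-mapped cache. The class name `DirectMappedCache`, the 64-byte line size, the function `average_latency`, and the assumption that a miss costs the cache latency plus the main memory latency are all hypothetical choices made for this sketch and are not taken from the disclosure.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks hits and misses (illustrative only)."""

    def __init__(self, num_lines, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size          # assumed line size in bytes
        self.tags = [None] * num_lines      # tag held by each line, None if empty
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a cache hit; on a miss, fill the line and return False."""
        line_addr = address // self.line_size
        index = line_addr % self.num_lines
        tag = line_addr // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.misses += 1
        self.tags[index] = tag              # model copying the line from main memory
        return False


def average_latency(hits, misses, cache_latency, memory_latency):
    """Average access latency, assuming a miss costs cache_latency + memory_latency."""
    total = hits + misses
    return (hits * cache_latency + misses * (cache_latency + memory_latency)) / total
```

Under these assumptions, a high hit count pulls the average latency toward the cache latency, while each miss adds the full main memory latency to one access.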
A prefetcher can be used to populate the lines in the cache before the information in these lines has been requested from the cache. The prefetcher can monitor memory requests associated with applications running in the CPU and use the monitored requests to determine or predict that the CPU is likely to access a particular sequence of memory addresses in the main memory. For example, the prefetcher may detect sequential memory accesses by the CPU by monitoring a miss address buffer that stores addresses of previous cache misses. The prefetcher then fetches the information from locations in the main memory in a sequence (and direction) determined by the sequential memory accesses in the miss address buffer and stores this information in the cache so that the information is available before it is requested by the CPU. Prefetchers can keep track of multiple streams and independently prefetch data for the different streams.
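As a minimal sketch of the sequential-access detection described above, the following code inspects a list standing in for the miss address buffer and, if the most recent misses fall on consecutive cache lines, returns the next addresses to prefetch in the same direction. The function names, the 64-byte line size, the `min_run` requirement of three consecutive misses, and the `depth` parameter are assumptions made for illustration; an actual prefetcher may use different criteria.

```python
def detect_stream(miss_addresses, line_size=64, min_run=3):
    """Return +1 or -1 if the last min_run misses hit consecutive lines, else 0."""
    if len(miss_addresses) < min_run:
        return 0
    lines = [addr // line_size for addr in miss_addresses[-min_run:]]
    deltas = [b - a for a, b in zip(lines, lines[1:])]
    if all(d == 1 for d in deltas):
        return 1                            # ascending stream
    if all(d == -1 for d in deltas):
        return -1                           # descending stream
    return 0


def prefetch_addresses(miss_addresses, depth, line_size=64):
    """Return up to `depth` line addresses to prefetch along the detected stream."""
    direction = detect_stream(miss_addresses, line_size)
    if direction == 0:
        return []
    last_line = miss_addresses[-1] // line_size
    candidates = [last_line + direction * i for i in range(1, depth + 1)]
    # Discard candidates that would fall below address zero.
    return [line * line_size for line in candidates if line >= 0]
```

In this sketch, `depth` plays the role of the prefetcher's aggressiveness: a larger value fetches further ahead of the demand stream.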
SUMMARY OF EMBODIMENTS
The following presents a simplified summary of the disclosed subject matter in order to provide a basic understanding of some aspects of the disclosed subject matter. This summary is not an exhaustive overview of the disclosed subject matter. It is not intended to identify key or critical elements of the disclosed subject matter or to delineate the scope of the disclosed subject matter. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is discussed later.
A cache is a smaller and faster memory that stores copies of instructions or data that are expected to be used relatively frequently. Many processing devices utilize caches to reduce the average time required to access information stored in a memory. Lines can be retrieved from the memory and stored in the cache in response to a cache miss. A prefetcher can also be used to populate the lines in the cache before the information in these lines has been requested from the cache. Thus, at least two different types of processes, fetching and prefetching, can be used to populate the lines in a cache. Fetching typically refers to the process of retrieving a cache line from memory in response to a cache miss. Prefetching typically refers to the process of retrieving cache lines from memory that are expected to be requested in the future, e.g., based on a pattern of previous cache misses. The two types of processes may conflict with each other in some circumstances. The disclosed subject matter is directed to addressing the effects of one or more of the problems set forth above.
In some embodiments, a method is provided for controlling the aggressiveness of a prefetcher based upon thrash events. Some embodiments of the method include controlling an aggressiveness of a prefetcher for a cache based upon a number of thrashed cache lines that are replaced by a prefetched cache line and subsequently written back into the cache before the prefetched cache line has been accessed.
In some embodiments, an apparatus is provided for controlling the aggressiveness of a prefetcher based upon thrash events. Some embodiments of the apparatus include a thrash detector configurable to control an aggressiveness of a prefetcher for a cache based upon a number of thrashed cache lines that are replaced by a prefetched cache line and subsequently written back into the cache before the prefetched cache line has been accessed.
In some embodiments, a computer readable medium is provided that includes instructions that, when executed, can configure a manufacturing process used to manufacture a semiconductor device configurable to control the aggressiveness of a prefetcher based upon thrash events. Some embodiments of the semiconductor device include a thrash detector configurable to control an aggressiveness of a prefetcher for a cache based upon a number of thrashed cache lines that are replaced by a prefetched cache line and subsequently written back into the cache before the prefetched cache line has been accessed.
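The thrash-based control summarized in the embodiments above might be sketched as follows. This is an illustrative software model only, not the disclosed implementation: the class name `ThrashDetector`, the event-handler names, the fixed `threshold`, and the policy of halving a prefetch `depth` when the threshold is reached are all assumptions introduced for this example.

```python
class ThrashDetector:
    """Counts thrashed lines and throttles a prefetch depth (hypothetical policy)."""

    def __init__(self, threshold=4, max_depth=8):
        self.threshold = threshold          # assumed thrash count that triggers throttling
        self.depth = max_depth              # stand-in for prefetcher aggressiveness
        self.thrash_count = 0
        self.victims = {}                   # victim line -> prefetched line that replaced it
        self.unused_prefetches = set()      # prefetched lines not yet accessed

    def on_prefetch_fill(self, prefetched_line, victim_line=None):
        """Record a prefetched line and the line it replaced, if any."""
        self.unused_prefetches.add(prefetched_line)
        if victim_line is not None:
            self.victims[victim_line] = prefetched_line

    def on_demand_access(self, line):
        """A demand access to a prefetched line means the prefetch was useful."""
        self.unused_prefetches.discard(line)

    def on_demand_fill(self, line):
        """A line returns to the cache; count a thrash if its replacer is still unused."""
        prefetched = self.victims.pop(line, None)
        if prefetched is not None and prefetched in self.unused_prefetches:
            self.thrash_count += 1
            if self.thrash_count >= self.threshold:
                self.depth = max(1, self.depth // 2)   # assumed throttling step
                self.thrash_count = 0
```

In this model, a line counts as thrashed only when it comes back into the cache while the prefetched line that displaced it is still unaccessed, which matches the condition stated in the embodiments; how the count maps to a change in aggressiveness is left open by the summary and is chosen arbitrarily here.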
While the disclosed subject matter may be modified and may take alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the description herein of specific embodiments is not intended to limit the disclosed subject matter to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the appended claims.