Caches are used to improve processor core performance in systems where the data accessed by the processor core resides in slow or distant memory. A common cache strategy is to fetch a line of data into the cache on any data request from the processor core that causes a cache miss. Servicing cache misses degrades the cycle count of an application, because the processor core spends cycles waiting while the cache line is brought from the memory into the cache. A standard approach to mitigating the problem is to place a software prefetch instruction in the code ahead of the memory access instructions that could cause a cache miss. The software prefetch instruction allows the data to be brought into the cache in the background. A disadvantage of the software prefetch approach is that the programmer must place the prefetch instructions at possible cache miss locations in the code, which both increases the code size and causes uncontrolled cache pollution. Another standard approach is to use a hardware prefetch circuit that brings the next line from memory into the cache after any cache access, whether a hit or a miss. The hardware approach is problematic for complex (i.e., nonsequential) patterns of processed data or programs.
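
By way of illustration only, the software prefetch approach described above may be sketched in C as follows. The sketch assumes a GCC- or Clang-compatible compiler that provides the `__builtin_prefetch` builtin; the function name `gather_sum` and the constant `PREFETCH_DISTANCE` are hypothetical and chosen for this example, and a suitable distance would depend on the memory latency of the particular system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning value: how many iterations ahead to prefetch. */
#define PREFETCH_DISTANCE 8

/*
 * Sum data[idx[i]] for i in [0, n). The index pattern is nonsequential,
 * so each access to data[] is a potential cache miss.
 */
int64_t gather_sum(const int32_t *data, const size_t *idx, size_t n)
{
    assert(data != NULL && idx != NULL);

    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Hint only: request the line needed PREFETCH_DISTANCE
             * iterations from now (0 = read access, 1 = low temporal
             * locality). The compiler or hardware may ignore it. */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
        sum += data[idx[i]];
    }
    return sum;
}
```

The sketch also exhibits the disadvantages noted above: the loop body carries an additional instruction and a bounds check on every iteration, and each prefetched line may evict data that is still in use.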