Modern computing platforms include data processors (e.g., CPUs) that are integrated with caching subsystems. Such caching subsystems serve to reduce the memory access latency experienced by the processor when fetching contents of off-chip memory. Some platforms include data processors that prospectively prefetch instructions and data in quanta, and place the prefetched instructions and data into cache (e.g., an instruction cache, a data cache, or a mixed instruction and data cache). In the case of instruction prefetch, the underlying philosophy is that it is more likely than not that the next instruction to be executed will be found at the next higher address. This instruction look-ahead prefetch philosophy proves to be empirically true; that is, if a processor instruction at real memory address “A000” is currently being executed by the processor, then it is more likely than not that the next instruction to be executed will be at address “A000” plus 1. However, regarding data prefetching, the legacy look-ahead prefetch philosophy often fails in situations where real memory is dynamically allocated (e.g., using a dynamic memory allocation call such as malloc( )). Legacy memory allocation schemes operate under a best-fit philosophy and merely allocate an area of real memory without regard to whether or not the allocated area is contiguous to any previously allocated area of real memory.
In many applications (e.g., databases, networking, etc.) large areas of real memory are allocated dynamically during processing of the application, and in many such applications the application processing proceeds sequentially through the allocated memory. Unfortunately, since legacy memory allocation schemes operate without regard to whether or not the allocated area of real memory is contiguous to any previously allocated area of real memory, the processor's caching subsystem often prefetches data that is not likely to be used. This has two undesirable effects: (1) prefetched data may evict data that is frequently accessed by the application, thus at least potentially incurring undesirable memory latency; and (2) during prospective prefetch, data other than the next-to-be-accessed data is prefetched, which at least potentially means that memory fetch cycles are wasted, and might also mean that the processor has to incur further memory fetches to retrieve the data that is in fact the next-to-be-processed data.
Some legacy prefetchers prefetch data found at the physical memory addresses corresponding to the ‘next’ virtual memory segment on the assumption that a prefetch to retrieve data corresponding to the ‘next’ virtual memory segment is going to prefetch data that is likely to be used ‘next’. This assumption is sometimes true and sometimes false. What is needed is a way to improve the likelihood that the data found at the physical memory addresses corresponding to the ‘next’ virtual memory segment is indeed the ‘next’ to be accessed data. One way to increase the likelihood that the prefetched data is in fact the ‘next’ to be accessed data is (for example) to recognize what is the next segment in a virtual memory space, and then manipulate memory pointers accordingly. Unfortunately, legacy techniques fail to recognize what constitutes a ‘next’ segment in a virtual memory space, and thus those legacy techniques exhibit lower than desired actual use of the prefetched data.
What is needed is a technique or techniques to improve over legacy approaches.