1. Technical Field
The present invention relates generally to data processing systems and more particularly to fetching data for utilization during data processing. Still more particularly, the present invention relates to data prefetching operations in a data processing system.
2. Description of Related Art
Conventional computer systems are designed with a memory hierarchy comprising different memory devices, with access latency increasing the further a device is from the processor. The processors typically operate at a very high speed and are capable of executing instructions at such a fast rate that it is necessary to prefetch a sufficient number of cache lines of data from lower level caches (and/or from system memory) to avoid the long latencies incurred when a cache miss occurs. This prefetching helps ensure that the data is ready and available when needed for utilization by the processor.
Conventional prefetch operations involve a prefetch engine that monitors accesses to the L1 cache and, based on the observed patterns, issues requests for data that is likely to be referenced in the future. If the prefetch succeeds, the processor's demand request for the data is satisfied from the L1 cache, rather than the processor stalling while the data is fetched/returned from lower level memory.
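The pattern-monitoring behavior described above can be sketched as a simple stride detector. The following is a minimal, illustrative sketch only (the class name, confirmation threshold, and line size are assumptions, not part of any specific prefetch engine design):

```python
LINE_SIZE = 128  # assumed cache-line size in bytes (illustrative)

class PrefetchEngine:
    """Hypothetical stride-based prefetch engine watching L1 access addresses."""

    def __init__(self, depth=4):
        self.last_addr = None       # previous observed access address
        self.stride = None          # candidate stride between accesses
        self.confirmations = 0      # times the candidate stride repeated
        self.depth = depth          # how many lines ahead to prefetch

    def observe(self, addr):
        """Record one L1 access; return a list of addresses to prefetch."""
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride == self.stride and stride != 0:
                self.confirmations += 1
            else:
                # New candidate stride: restart confirmation count.
                self.stride = stride
                self.confirmations = 0
        self.last_addr = addr
        # Once the same stride has repeated twice, assume a stream and
        # request the next few blocks along that stride.
        if self.confirmations >= 2:
            return [addr + self.stride * i for i in range(1, self.depth + 1)]
        return []
```

For example, after observing accesses at 0, 128, 256, and 384, the sketch above would issue prefetches for the next `depth` lines at stride 128.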
In conventional processor configurations, the effective address of a prefetch instruction (or of a memory access instruction, such as a demand load) passes through a translation mechanism, such as a translation lookaside buffer (TLB), which translates the effective address into a corresponding real address. The TLB then passes the real address to the prefetch engine, which executes the prefetch against lower level memory.
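The translation step can be illustrated as follows. This is a simplified sketch assuming a 4 KiB page size; the dict-backed TLB and page-table walk are placeholders for the hardware structures:

```python
PAGE_SIZE = 4096  # assumed page size (illustrative)

class TLB:
    """Toy effective-to-real address translator with a cache of translations."""

    def __init__(self, page_table):
        self.entries = {}           # effective page number -> real page number
        self.page_table = page_table

    def translate(self, effective_addr):
        epn, offset = divmod(effective_addr, PAGE_SIZE)
        if epn not in self.entries:
            # TLB miss: consult the page table and cache the translation.
            self.entries[epn] = self.page_table[epn]
        rpn = self.entries[epn]
        # The real address is what is handed to the prefetch engine.
        return rpn * PAGE_SIZE + offset
```

Note that only the page number is translated; the offset within the page is carried through unchanged.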
Within lower level memory, data are stored in memory blocks and addressed by real addresses. Sequential data are typically stored in sequential memory blocks, which are accessed by their corresponding sequential real addresses. Also, a configurable number of these sequential memory blocks make up a memory page, and pages are separated by known address boundaries. While sequentially adjacent pages have sequential real address assignments from page to page, an executing program's allocation of effective addresses (for processor operations) does not necessarily follow the same sequential allocation. Programs that have sequential streams of data typically access the data in a linear manner in the effective address space. Thus, it is quite common for a pair of sequential effective addresses at a page boundary to correspond to real addresses on pages that are not sequentially adjacent to each other (i.e., the real addresses are not sequential).
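The mismatch at a page boundary can be made concrete with a small numeric example. The effective-to-real page mapping below is entirely made up for illustration (effective pages 5 and 6 happen to map to real pages 12 and 40):

```python
PAGE_SIZE = 4096
page_map = {5: 12, 6: 40}   # hypothetical effective page -> real page mapping

def real_addr(effective_addr):
    """Translate an effective address using the made-up page map."""
    epn, offset = divmod(effective_addr, PAGE_SIZE)
    return page_map[epn] * PAGE_SIZE + offset

last_of_page5 = 5 * PAGE_SIZE + (PAGE_SIZE - 1)   # last byte of effective page 5
first_of_page6 = 6 * PAGE_SIZE                    # first byte of effective page 6

# The effective addresses are consecutive...
assert first_of_page6 - last_of_page5 == 1
# ...but the corresponding real addresses are far apart.
gap = real_addr(first_of_page6) - real_addr(last_of_page5)
```

Here `gap` is much larger than 1, even though the two effective addresses are adjacent, which is precisely the situation a prefetch engine cannot detect from real addresses alone.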
Typically, when prefetching data, the prefetch engine utilizes some set sequence to identify a stream of cache lines to be fetched and a stride pattern. A “prefetch stream” refers to a stream of addresses (and the blocks associated with those addresses) that are prefetched into the cache as a result of a detected prefetch pattern. When prefetching data using prefetch streams, the memory controller sources the data sequentially from a memory page using sequential real addresses. The sequential real addresses may, however, cross page boundaries, causing the prefetch engine to stop the stream. The prefetch engine stops the stream because it has no way of determining whether the next data block found sequentially in the physical address space is mapped to a correspondingly sequential block in the effective address space. To avoid potentially polluting the cache with non-sequential prefetches, the prefetcher stops issuing prefetch requests at each physical page boundary.
When the real addresses within a stream cross the boundary of a physical page of memory, the prefetch engine stops the stream at the boundary because the effective addresses that target adjacent memory pages are not necessarily assigned in sequence. If the prefetch engine were to continue across the boundary, following the sequential real addresses, the prefetch engine might begin prefetching data that does not actually belong to the current stream. Thus, with conventional implementations of prefetch engines, the prefetch engine stops a stream when the stream crosses a page boundary. The prefetch engine may later detect/initiate a new, different stream to prefetch the remaining data that will be demanded by the processor.
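The conventional page-boundary stop described above amounts to a simple check before each prefetch. A minimal sketch, assuming a 4 KiB page and 128-byte lines (both illustrative):

```python
PAGE_SIZE = 4096
LINE_SIZE = 128

def next_prefetch(current_real_addr, stride=LINE_SIZE):
    """Return the next real address to prefetch, or None to stop the stream.

    The stream is terminated whenever the next address would fall on a
    different physical page than the current one, since the engine cannot
    tell whether that page belongs to the same effective-address stream.
    """
    next_addr = current_real_addr + stride
    if next_addr // PAGE_SIZE != current_real_addr // PAGE_SIZE:
        return None   # crossing a physical page boundary: stop the stream
    return next_addr
```

Under this check, a stream prefetching the last line of a page returns `None`, and any further data on the following page must wait for a new stream to be detected.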