The present invention relates to a prefetch queue provided for an external cache memory in a processor.
Prefetching is a known technique implemented in processor devices. Prefetching causes data or instructions to be read into the processor before it is called for by the processor's core execution unit (“core”). By having the data available within the processor when the core is ready for it, the core need not wait for the data to be read from slower external memories. Instead, the data is available to the core at the relatively higher data rates of internal buses within the processor. Because prefetching can free a core from having to wait for an external bus transaction to be completed before the core can use the requested data, prefetching can improve processor performance.
If implemented incorrectly, however, prefetching can impair processor performance. By reading data from external memories into the processor, prefetch operations occupy resources on the external bus. Due to the limited size of the core cache, prefetching may write data over other data that the processor may use. Further, prefetching may read data into the processor that may never be used. Thus, prefetching is useful only if it improves processor performance more often than it impairs such performance. Instruction streaming, a type of prefetching, occurs when a core causes data to be read sequentially from several adjacent positions in external memory. Instruction streaming suffers from the above disadvantages.
It is known that prefetching may provide significant performance improvements when a processor either executes instructions or manipulates data held in adjacent memory locations. However, no known prefetching scheme adequately distinguishes programs that perform sequential memory reads from those that perform non-sequential memory reads. Further, many processors, particularly out-of-order superscalar machines, tend to perform several interlaced sequential reads “in parallel.” They may read data from sequential memory positions in a first area of memory interspersed with reads from sequential memory positions in a second area of memory. Traditional prefetching techniques do not recognize multiple streams of sequential memory reads as appropriate for prefetching.
Accordingly, there is a need in the art for a prefetch scheme that prefetches only when there exists a pattern demonstrating that performance improvements are to be obtained by prefetching. There is a need in the art for a prefetch scheme that incurs low performance costs for erroneous prefetches. Further, there is a need in the art for a prefetch scheme that detects and observes parallel prefetch operations.
Embodiments of the present invention provide a prefetch queue for an agent that can detect request patterns in both an ascending direction in memory and a descending direction in memory. Having detected a request pattern and a direction, the prefetch queue requests data from a next memory location in the direction.
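By way of illustration only, the behavior summarized above can be modeled in software. The following Python sketch is not the claimed hardware; it is a minimal model under stated assumptions. The names `PrefetchQueue`, `Entry`, and `observe`, the use of cache-line addresses, the queue depth of four entries, and the threshold of two consecutive sequential requests are all hypothetical choices that do not come from this description. Each entry tracks one candidate stream, so interlaced ascending and descending streams may be detected independently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    last_line: int      # most recent cache-line address seen for this stream
    direction: int = 0  # +1 ascending, -1 descending, 0 not yet known
    hits: int = 0       # consecutive requests that continued the stream


class PrefetchQueue:
    """Illustrative model only; sizes and thresholds are assumptions."""

    def __init__(self, num_entries: int = 4, threshold: int = 2):
        self.num_entries = num_entries  # assumed queue depth
        self.threshold = threshold      # assumed hits needed before prefetching
        self.entries: List[Entry] = []

    def observe(self, line: int) -> Optional[int]:
        """Record a core request; return a line to prefetch, or None."""
        for e in self.entries:
            step = line - e.last_line
            # The request continues this stream if it is adjacent to the
            # last line and agrees with any direction already established.
            if step in (1, -1) and e.direction in (0, step):
                e.direction = step
                e.last_line = line
                e.hits += 1
                if e.hits >= self.threshold:
                    return line + step  # next location in the same direction
                return None
        # No entry matched: start tracking a new candidate stream,
        # discarding the oldest entry if the queue is full.
        if len(self.entries) >= self.num_entries:
            self.entries.pop(0)
        self.entries.append(Entry(last_line=line))
        return None
```

In this model, feeding the interlaced requests 100, 500, 101, 499, 102, 498 to a fresh queue yields no prefetch for the first four requests, then a prefetch of line 103 (ascending stream) and line 497 (descending stream). Requests with no adjacent predecessor never trigger a prefetch, which reflects the goal of prefetching only when a pattern has been demonstrated.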