1. Field of the Invention
This invention relates to the field of data processing. More particularly, the present invention relates to prefetching in a data processing apparatus.
2. Description of the Prior Art
In a data processing apparatus comprising processing circuitry and a memory, it may be time consuming for the processing circuitry to execute a memory access instruction. In particular, the processing circuitry must send a request to a memory device to access a particular memory address and retrieve the data located at that address. The memory device must then access the memory, retrieve the requested data and forward it back to the processing circuitry. This may take several processing cycles, during which the processing circuitry may be paused or unable to proceed further. If the processing circuitry executes a number of memory access instructions, these delays may accumulate while the memory access instructions are handled. In order to help alleviate this problem, the data processing apparatus may use what is known as prefetching, in which a prediction is made regarding which memory addresses are most likely to be requested next. Data at those memory addresses is then fetched before being explicitly requested, and the fetching may be carried out when parts of the system are otherwise unoccupied. Accordingly, if those memory addresses are subsequently requested, it may be possible for the processing circuitry to obtain the necessary data more quickly than if no prefetching had taken place. The need for the processing circuitry to pause while requested data is accessed may therefore be reduced, thereby increasing the efficiency of the processing circuitry.
The prediction may use the principle of spatial locality. That is, if a thread accesses a particular memory address, then it is likely that the same thread will issue a subsequent memory access instruction to a nearby memory address. One particular example of this is when a thread accesses a number of data elements, each of a fixed length, in an array. By examining the memory addresses to which two or more memory access instructions are issued, it may be possible to deduce a pattern in the memory addresses being requested. For example, given two memory addresses, it may be possible to determine a stride length, which represents the difference in memory address between two adjacent data elements. It may therefore be logical to assume that a future memory access instruction will be directed to the next adjacent data element, whose address is the sum of the most recently accessed memory address and the stride length.
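By way of illustration only, the stride-based prediction described above might be sketched as follows. This is a minimal sketch under stated assumptions and not a description of any particular known prefetcher: the function name `predict_next_address`, the use of a simple list of observed addresses, and the requirement that all observed differences agree are hypothetical choices made for the example.

```python
from typing import Optional, Sequence


def predict_next_address(addresses: Sequence[int]) -> Optional[int]:
    """Predict the next memory address from previously observed addresses.

    Hypothetical helper for illustration. Returns None when fewer than
    two addresses have been observed or when the observed differences
    do not agree on a single stride length.
    """
    if len(addresses) < 2:
        # A stride cannot be determined from a single access.
        return None

    # Stride length: difference between the two most recent addresses.
    stride = addresses[-1] - addresses[-2]

    # Assumption: only predict if every adjacent pair shows the same stride.
    for previous, current in zip(addresses, addresses[1:]):
        if current - previous != stride:
            return None

    # Predicted address = most recently accessed address + stride length.
    return addresses[-1] + stride


if __name__ == "__main__":
    # Elements of an array with 8-byte entries, starting at 0x1000.
    observed = [0x1000, 0x1008]
    prediction = predict_next_address(observed)
    print(hex(prediction) if prediction is not None else "no prediction")
```

In this sketch, two accesses to 0x1000 and 0x1008 give a stride length of 8, so the predicted next address is 0x1010. A single observed access, or a sequence whose differences disagree, yields no prediction.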
It will be appreciated that known prefetching approaches rely on memory being accessed in a predictable manner or pattern. If an insufficient number of memory access instructions are issued, for example in the case of a short-lived thread, or if memory access instructions are issued in a complicated or unpredictable manner, it may be very difficult or even impossible to perform prefetching effectively.