1. Field of the Invention
The present invention relates to compilers for computer systems. More specifically, the present invention relates to a method and an apparatus for performing locked prefetch scheduling in general cyclic regions of a computer program within an optimizing compiler.
2. Related Art
Advances in semiconductor fabrication technology have given rise to dramatic increases in microprocessor clock speeds. These increases have not been matched by corresponding increases in memory access speeds. Hence, the disparity between microprocessor clock speeds and memory access speeds continues to grow, which can limit overall system performance. Execution profiles for fast microprocessor systems show that a large fraction of execution time is spent not within the microprocessor core, but within memory structures outside of the microprocessor core. This means that these systems spend a large fraction of time waiting for memory references to complete instead of performing computational operations.
In order to remedy this problem, some microprocessors provide hardware structures to facilitate prefetching of data and/or instructions from memory in advance of where the data and/or instructions are needed. Unfortunately, existing prefetch scheduling techniques typically target only structured program loop constructs and rely on the computation of an “ahead” distance, which controls how far ahead of the target loads or stores the prefetches are issued.
This technique works well in many cases, but it lacks the coverage and the precise control over prefetch timing that are needed to maximize performance. In many systems, a prefetch that is issued too early may be dropped because too many prefetches are already outstanding and the system cannot accept a new prefetch request. Conversely, a prefetch that is issued too late may not complete before the target load or store executes, which may lead to under-utilization of the central processing unit.
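By way of illustration only, the conventional “ahead” distance approach described above can be sketched in C as follows. The sketch assumes a GCC- or Clang-compatible compiler that provides the `__builtin_prefetch` intrinsic; the function name `sum_with_prefetch` and the parameter `ahead` are hypothetical names chosen for this example and do not come from any particular compiler implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative sketch only: sums n elements of a[] in a structured loop,
 * issuing a prefetch for the element `ahead` iterations in front of the
 * element currently being loaded. `ahead` is the hypothetical "ahead"
 * distance; choosing it well is left to the caller.
 */
long sum_with_prefetch(const long *a, size_t n, size_t ahead)
{
    assert(a != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Prefetch only while the target element is inside the array.
         * Written as ahead < n - i to avoid overflow in i + ahead. */
        if (ahead < n - i) {
            /* Second argument 0 = prefetch for read;
             * third argument 3 = high temporal locality.
             * The prefetch is a hint and never changes the result. */
            __builtin_prefetch(&a[i + ahead], 0, 3);
        }
        total += a[i]; /* the target load */
    }
    return total;
}
```

In this sketch, a small value of `ahead` corresponds to a prefetch that may be issued too late to hide memory latency, while a large value corresponds to a prefetch that may be issued too early. The single fixed distance also applies only to this structured loop form.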
Hence, what is needed is a method and an apparatus for scheduling prefetches that is not limited to structured program loop constructs and that avoids issuing prefetches either too early or too late.