1. Field of the Invention
The present invention relates generally to cache memories in a data processing system and more particularly to the use of stored commands which optimize the content and sequence of data and instruction transfers from main memory to a cache memory.
2. Description of the Prior Art
High performance processors have used cache memory systems as an integral component of overall system design for many years. A cache memory typically has a much faster access time than main storage. For example, cache may make use of a relatively small number of high-speed data storage elements, located in close proximity to an associated processor, while main storage typically uses larger numbers of storage elements and is located at some distance from the processor. Cache memory systems have been designed to overcome the access speed limitation of main storage by providing rapid access to a relatively small set of data which is likely to be used during a relatively short time interval.
These cache memory systems have been designed to take advantage of two properties which have been observed empirically in data processing systems. The first of these properties is known as the spatial locality of reference. This property refers to the tendency of a program, during any relatively small time interval, to access data or instructions which have addresses in the main storage that differ by a relatively small value. Stated another way, this property holds that when a specific target word or datum is used by the processor, it is likely that the immediately adjacent data in the address space of main memory will be used close in time to the use of the target data.
The second property is known as the temporal locality of reference. This property refers to the tendency of a program, during a small time interval, to access the same data or instructions repeatedly. When a specific target datum is accessed by the processor, it is likely that this target will be accessed again within a relatively short time interval. Combining these two properties provides the basic rule for the use of cache memory: a cache system should contain the most recently used data plus the contents of memory addresses neighboring this data.
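The effect of the two locality properties can be illustrated with a minimal sketch. It assumes a small, fully associative cache with least-recently-used replacement. The function name, the line size, and the number of lines are hypothetical values chosen for illustration and are not taken from any of the systems discussed in this section.

```python
from collections import OrderedDict


def simulate_hits(addresses, line_words=4, num_lines=2):
    """Count cache hits for a sequence of word addresses.

    Assumed model: a fully associative cache holding `num_lines` lines of
    `line_words` words each, with least-recently-used replacement.
    """
    cache = OrderedDict()  # line number -> None, ordered oldest to newest
    hits = 0
    for addr in addresses:
        line = addr // line_words  # line containing this word
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            if len(cache) >= num_lines:
                cache.popitem(last=False)  # evict least recently used line
            cache[line] = None
    return hits


# Neighboring addresses (spatial) followed by repeated addresses (temporal).
trace = [0, 1, 2, 3, 0, 1]
print(simulate_hits(trace, line_words=4))  # 5: only the first access misses
print(simulate_hits(trace, line_words=1))  # 0: one-word lines gain nothing here
```

With four-word lines, the first access brings in the neighboring words, so the remaining five accesses hit. With one-word lines and only two lines, every access in this trace misses.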
Transfers from memory to cache are made more efficient by fetching segments containing multiple data words rather than single target words. Up to a limit, the larger the segment fetched, the greater the likelihood that the next reference to the cache memory will succeed. While the method of fetching data in segments has the advantage of increasing the likelihood of finding data in the cache memory when needed, it also causes a large increase in the total volume of data transferred from main memory to the cache memory. This volume of data can be a source of additional delay in the system if it is not properly managed. Larger segments increase the length of time needed to complete a memory-to-cache transfer. A longer transfer time means more delay before the processor can access the end of the segment. It may also delay the beginning of subsequent transfers between the main and cache memories.
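The trade-off between segment size and transfer time can be sketched with a simple timing model. The model assumes a fixed start-up delay per transfer followed by a fixed number of cycles per word. The function names and the cycle counts are hypothetical and serve only to illustrate the reasoning above.

```python
def transfer_cycles(segment_words, startup_cycles=10, cycles_per_word=1):
    """Cycles needed to move one whole segment from main memory to cache.

    Assumed model: a fixed start-up delay, then one word every
    `cycles_per_word` cycles.
    """
    return startup_cycles + segment_words * cycles_per_word


def misses_for_scan(total_words, segment_words):
    """Misses incurred by one sequential pass over `total_words` words."""
    return -(-total_words // segment_words)  # ceiling division


# Larger segments mean fewer misses on a sequential scan of 64 words,
# but each transfer occupies the memory-to-cache path for longer.
for size in (4, 16):
    print(size, misses_for_scan(64, size), transfer_cycles(size))
# 4 16 14
# 16 4 26
```

Under these assumed figures, moving from 4-word to 16-word segments cuts the number of misses from 16 to 4, while each transfer grows from 14 to 26 cycles.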
In addition to the above, more elaborate methods have been developed to ensure that most data will be available in cache memory when accesses are requested. For example, U.S. Pat. No. 4,435,759 to Baum et al. relates to a hardware monitor which associates the address of an instruction with the address of the cache operand line miss generated by the instruction. Using this method in a pipelined data processing system, the monitoring system can look ahead and determine which line is likely to be accessed during the next execution of the same instruction. U.S. Pat. No. 4,441,155 to Fletcher et al. relates to a means of reducing the number of cache misses by making the congruence class activity more uniform, focusing on reduction of miss rate. U.S. Pat. No. 4,463,424 to Matteson et al. relates to means for reducing the cache miss rate by subdividing the cache and allocating appropriate sized sections to concurrently executing processes.
Other methods for managing cache utilization involve limits on storing data and instructions in cache so that some information passes directly from the processor to memory or from memory to the processor, without cache storage. Cache status or observed writing activity may be used to selectively inhibit data promotion from memory to cache. U.S. Pat. No. 4,429,363 to Duke et al. relates to a method of restricting the storage of data in a cache memory system based upon cache status and the most recent memory reference. A memory hierarchy is controlled by monitoring a series of requests for access to memory and indicating when certain events occur. For example, when a record from a Direct Access Storage Device (DASD), i.e., a disk drive, is not modified in the last portion that was referenced, the contents of this portion may be promoted to cache. Accessing data in a series of requests of this type tends to reduce data promotions. U.S. Pat. No. 4,463,420 to Fletcher relates to a method for selecting lines of data to be replaced in a cache memory system based on Task Identifiers and a method for the early cast-out of lines from the cache memory. U.S. Pat. No. 4,189,770 to Gannon and Liptay teaches a means of passing sequential portions of the cache line to instruction buffers (I-buffers) during an instruction miss.
The above-referenced patents employ criteria for determining when whole segments of data or instructions are to be stored into or excluded from a cache memory. The decision to fetch an entire segment in response to a cache miss results in cache activity which may extend beyond the time needed to transfer the data whose absence caused the cache miss. This extended cache activity may result in a decrease in potential cache system performance. The additional delays caused by the transfer of contiguous data from memory to cache may negate a part of the performance improvement gained through the use of cache memory to exploit spatial and temporal locality of reference.
One method to improve system performance is to ensure that the line may be accessed, for example from a line buffer, while a line transfer from memory to cache is in progress. This makes the data transferred at the beginning of the segment available to the processor before the whole segment transfer is complete. U.S. Pat. No. 4,370,710 to Kroft relates to a cache memory organization in which cache memory is not locked up and continues responding to the flow of requests made upon it while awaiting the transfer of data from the main memory.
In that system, the data transferred includes the data which caused the miss and the block of data surrounding it. The block of data is transferred in the same order in which it is stored in main memory and portions of the block which have been transferred may be accessed by the processor prior to completion of the block transfer. The Kroft system does not, however, improve performance in the case where the data referenced immediately after the target which caused the miss is located in memory near or at the end of the segment which is transferred. This data will be delayed regardless of the ability to access data in the line buffer.
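The limitation described above can be illustrated with a minimal sketch of an in-order block transfer. It assumes that words arrive one at a time in memory order, starting from the first word of the block, after a fixed start-up delay. The function name and the cycle counts are hypothetical and are not taken from the Kroft patent.

```python
def word_ready_cycle(offset, startup_cycles=10, cycles_per_word=1):
    """Cycle at which the word at `offset` within the block has arrived.

    Assumed model: the block is transferred in memory order beginning at
    offset 0, so a word can be read from the line buffer only after all
    preceding words of the block have been transferred.
    """
    return startup_cycles + (offset + 1) * cycles_per_word


# Eight-word block: the miss target is at offset 0, and the next
# reference is to the last word of the block at offset 7.
print(word_ready_cycle(0))  # 11
print(word_ready_cycle(7))  # 18
```

Under these assumptions, access to the line buffer during the transfer makes the first word available at cycle 11, but a following reference to the last word of the block still waits until cycle 18, when the entire block has arrived.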