1. Field of the Invention
The present invention relates to a computer system, a preload controller and method for controlling a preload access of data to a temporary memory, and a program.
2. Description of the Related Art
The performance of processors that execute arithmetic processing has improved rapidly with advances in internal structures such as pipeline processing and with advances in semiconductor technology. By contrast, the performance of the main memory that stores the data used in the arithmetic processing has lagged behind that of the processor, and the speed at which data is supplied to the processor has not kept up with the arithmetic processing speed. For this reason, many computer systems comprise a temporary memory for data, called a “cache”, to absorb the speed difference between the processor and the main memory.
The cache is a device that costs more to manufacture per unit of circuit scale than the main memory, but it supplies data faster than the main memory and is indispensable for bringing out the arithmetic processing performance of the processor.
When an access (read request) to data stored in the main memory occurs, the cache is referred to in place of the main memory. If the requested data is cached, the data is read out from the cache and sent to the processor. If the requested data is not cached, it is first transferred from the main memory to the cache via a system bus and is then supplied to the processor. The data is transferred to the cache first in order to prepare for the next access to the same data. The proportion of accesses to data stored in the main memory for which the requested data is found in the cache is called the cache hit ratio. The cache hit ratio is one of the parameters that reflect the performance of the cache.
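The read path and the hit ratio described above can be sketched as follows. This is a minimal illustrative model, not part of the described system: the class name `SimpleCache`, the capacity of two entries, and the least-recently-used replacement policy are all assumptions made for the example, since the text does not specify a replacement policy.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy cache model. LRU replacement is an assumption of this sketch."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity          # number of entries the cache can hold
        self.main_memory = main_memory    # dict: address -> data
        self.lines = OrderedDict()        # cached entries, oldest first
        self.hits = 0
        self.accesses = 0

    def read(self, address):
        """Serve a read request, consulting the cache before main memory."""
        self.accesses += 1
        if address in self.lines:
            # Cache hit: supply the data directly from the cache.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Cache miss: transfer the data to the cache first, then supply it,
        # so that a later access to the same address can hit.
        value = self.main_memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used entry
        return value

    def hit_ratio(self):
        """Fraction of read requests that were served from the cache."""
        return self.hits / self.accesses if self.accesses else 0.0


memory = {addr: addr * 10 for addr in range(8)}
cache = SimpleCache(capacity=2, main_memory=memory)
values = [cache.read(a) for a in (0, 1, 0, 2, 0, 3)]
```

In this run, only the two repeated accesses to address 0 hit, so two of the six accesses are served from the cache.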
In recent years, requirements for cache performance have become stricter. This trend is especially conspicuous in application fields that process large quantities of data at high speed, such as scientific and engineering calculations and multimedia processing. As described above, since the cache is costly, its storage size must be kept small. Inevitably, the cache can be used only for temporary storage of data, and the required data is therefore not always cached. If the data associated with access requests is frequently not cached, i.e., if the cache hit ratio is too low, the data supply speed drops to the same level as when no cache is provided.
Conventionally, cache performance has been improved by adding a scheme for a so-called “preload” process, which reads data required for a process into the cache in advance. Several methods of implementing the preload process are known, for example, (1) a method that uses a prefetch command, and (2) a method that predicts access patterns.
In the former method, a prefetch command that specifies the address of the data to be accessed is inserted into a program. The prefetch command is a command for reading the data at the designated address into the cache in advance. When the prefetch command is executed ahead of a command that uses the data, the data is fetched before it is actually used and is already held in the cache when it is needed.
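The effect of inserting a prefetch command ahead of the command that uses the data can be illustrated with a small model. The sketch below is hypothetical: `PrefetchingCache`, `prefetch`, and `sum_array` are names invented for the example, the cache is unbounded, and timing is ignored, so it shows only that prefetched data is present in the cache when it is used, not how far ahead a real prefetch must be issued.

```python
class PrefetchingCache:
    """Toy unbounded cache that also accepts explicit prefetch requests."""

    def __init__(self, main_memory):
        self.main_memory = main_memory    # dict: address -> data
        self.lines = {}                   # cached entries
        self.hits = 0
        self.misses = 0

    def prefetch(self, address):
        """Model of a prefetch command: load data into the cache only."""
        if address in self.main_memory:   # addresses outside memory are ignored
            self.lines[address] = self.main_memory[address]

    def read(self, address):
        """Demand read issued by the command that actually uses the data."""
        if address in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self.lines[address] = self.main_memory[address]
        return self.lines[address]


def sum_array(cache, length, use_prefetch):
    """Sum addresses 0..length-1, optionally prefetching one element ahead."""
    total = 0
    for i in range(length):
        if use_prefetch:
            cache.prefetch(i + 1)         # inserted prefetch command
        total += cache.read(i)            # command that uses the data
    return total


memory = {i: i for i in range(8)}

plain = PrefetchingCache(memory)
plain_total = sum_array(plain, 8, use_prefetch=False)

prefetched = PrefetchingCache(memory)
prefetched_total = sum_array(prefetched, 8, use_prefetch=True)
```

Without the prefetch command every demand read misses; with it, only the first read misses, because each later element was loaded one iteration earlier.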
In the latter method, a future access pattern is predicted on the basis of past data access patterns (history), and the preload process is executed accordingly. When data accesses are made to consecutive addresses (e.g., accesses to sequentially stored data), the addresses of the accessed data are monitored to predict the address of the data to be accessed next. According to this prediction result, data that will be required in the future is preloaded. As in the former method, the data is already held in the cache when it is actually used.
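One common way to predict the next address from the access history is to detect a constant stride between successive addresses. The sketch below assumes that technique for illustration; the text does not state which prediction rule conventional systems use, and the class name `StridePredictor` and the address values are hypothetical.

```python
class StridePredictor:
    """Predicts the next address when two successive strides are equal."""

    def __init__(self):
        self.last_address = None   # most recently observed address
        self.last_stride = None    # difference between the last two addresses

    def observe(self, address):
        """Record an access and return a predicted next address, or None."""
        prediction = None
        if self.last_address is not None:
            stride = address - self.last_address
            # Predict only when the same non-zero stride is seen twice in a row.
            if stride != 0 and stride == self.last_stride:
                prediction = address + stride
            self.last_stride = stride
        self.last_address = address
        return prediction


predictor = StridePredictor()
predictions = [predictor.observe(a) for a in (100, 104, 108, 112, 200)]
```

The predictor needs three regular accesses before it produces its first prediction, and it produces none when the access to address 200 breaks the pattern.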
As described above, performance can be improved by applying a preload scheme to the temporary memory, but the following problems arise.
In the above-described method (1), which inserts a prefetch command into a program, it is not practical to adjust and provide a separate program for each of the different performance levels of individual computer systems. Hence, a program must inevitably be optimized with reference to one given performance value. As a result, performance differences among computer systems cannot be taken into consideration, and the degree of improvement varies depending on the performance of the computer system.
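The dependence on system performance can be made concrete with the usual rule of thumb that a prefetch must be issued enough loop iterations ahead to cover the memory latency. The sketch below is illustrative only: the function name `prefetch_distance` and the cycle counts are hypothetical values chosen for the example, not figures from any actual system.

```python
import math


def prefetch_distance(memory_latency_cycles, cycles_per_iteration):
    """Iterations ahead a prefetch must be issued to hide the memory latency."""
    return math.ceil(memory_latency_cycles / cycles_per_iteration)


# The same loop (25 cycles per iteration, a hypothetical figure) on two
# hypothetical systems that differ only in memory latency.
distance_low_latency = prefetch_distance(100, 25)
distance_high_latency = prefetch_distance(300, 25)
```

A distance fixed at 4 iterations when the program is written would be too short for the second system, where the data would still be in transit when it is needed.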
In the above-described method (2), which is based on prediction of access patterns, the prediction has limitations. For example, information such as which data accesses are important for a computation process and how many times a given data access will continue cannot be predicted. In other words, a bottleneck clearly remains for accesses other than those whose patterns can be easily predicted.