1. Field of the Invention
The present invention relates to a computer, and more specifically to a cache memory contained in a computer for effectively preloading array data of a matrix from a main memory to the cache memory.
2. Discussion of the Related Art
In a computer, a cache memory is located between a processor and a main memory. The cache memory stores a portion of the data stored in main memory, along with corresponding addresses of the portion of data from the main memory. The capacity of the cache memory is typically small, compared to that of main memory, but its access speed is comparatively quite high. A processor reads data from the cache memory, processes the data, and writes the processed data into the cache memory. Access time (both read and write) of the computer can thus be reduced, in comparison with the access time required to access the main memory directly.
When the cache memory receives an access request (or access address) from the processor, the cache memory first checks whether the data corresponding to the access address are stored in the cache memory. If the data are not stored in the cache memory, the cache memory sends the access address to the main memory and transfers the data corresponding to that access address from the main memory to the cache memory. In related devices, one block of data, which includes the data corresponding to the access address, is transferred and stored in the cache memory. For example, one block of data usually consists of four sets of data corresponding to four continuous addresses. In this case, if the processor accesses continuous addresses in order, the processor will read continuous data from the cache memory in order. However, if the processor accesses addresses in a dispersed manner, the data corresponding to an access address are frequently not stored in the cache memory. The cache memory must then transfer the required data from the main memory at that time.
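The block transfer described above can be illustrated with a minimal sketch. It assumes a cache of unbounded capacity and a block of four consecutive addresses; the function name `count_misses` and the two address sequences are hypothetical and chosen only for illustration.

```python
BLOCK_SIZE = 4  # four consecutive addresses per block, as in the example above


def count_misses(addresses, block_size=BLOCK_SIZE):
    """Count accesses whose block is not yet held in the cache.

    Assumption: the cache never evicts anything (unbounded capacity).
    """
    resident = set()  # block numbers currently held in the cache
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block not in resident:
            misses += 1          # the block must be transferred from main memory
            resident.add(block)  # the whole block is stored, not a single address
    return misses


sequential = list(range(16))              # addresses 0, 1, 2, ..., 15
dispersed = [i * 64 for i in range(16)]   # addresses 0, 64, 128, ... (assumed)

print(count_misses(sequential))  # 4
print(count_misses(dispersed))   # 16
```

Under these assumptions, sequential access misses once per block, whereas every dispersed access falls in a different block and misses.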
The processor often executes a matrix calculation by sequentially reading out an array of data in a matrix format from the cache memory. The address area of this array of data is usually larger than the capacity of the cache memory. Accordingly, the offset (interval) of the access address of the array of data is very large.
For example, the processor will first read out an array of data corresponding to address 0. Next, the processor will read out an array of data corresponding to address 64. Third, the processor will read out an array of data corresponding to address 128. In this case, the offset of the access address (or address interval) is 64. However, the array of data corresponding to the address area (0~128) cannot be stored in the cache memory at one time, because, as mentioned earlier, the capacity of the cache memory is quite small. Therefore, whenever the processor accesses the array of data from the cache memory, the cache memory must first transfer the array of data from main memory. Obviously, some time is consumed during this transfer. During the transfer period, the processor is forced to wait, without being able to process data.
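The effect of limited capacity in this example can be sketched as follows. The sketch assumes a least-recently-used replacement policy, a block of four addresses, and that the processor passes over the same three addresses twice; none of these details is stated in the text, and the name `count_misses_lru` is hypothetical.

```python
from collections import OrderedDict


def count_misses_lru(addresses, capacity_blocks, block_size=4):
    """Count misses for a cache holding at most `capacity_blocks` blocks.

    Assumption: least-recently-used replacement.
    """
    cache = OrderedDict()  # block number -> None, least recently used first
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block in cache:
            cache.move_to_end(block)       # mark the block as most recently used
        else:
            misses += 1                    # transfer from main memory is required
            cache[block] = None
            if len(cache) > capacity_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


two_passes = [0, 64, 128] * 2  # the same three addresses, accessed twice

print(count_misses_lru(two_passes, capacity_blocks=2))  # 6
print(count_misses_lru(two_passes, capacity_blocks=3))  # 3
```

With room for only two blocks, each block is evicted before it is accessed again, so every access requires a transfer. With room for three blocks, the second pass is served entirely from the cache.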
To prevent the above-mentioned defect, when the cache memory does not store the data corresponding to the respective access address, a method has been considered of having the cache memory transfer a block which includes the data corresponding to the respective access address, and also the next sequential data block. For example, if the present access address requested by the processor is 0, a first data block corresponding to address (0~3), along with the next data block corresponding to address (4~7), are both transferred from main memory to cache memory. This device operates on a prediction that the next data block will be accessed following the first data block. That is, in this device, the next data block neighboring the first access address is simply transferred from main memory to cache memory. As mentioned above, if the processor accesses an array of data of a matrix sequentially, the offset of the access address of the array data will also be large (for example, 64, 128). Accordingly, the next data block neighboring the first access address may not include the array of data which the processor needs to access next.
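A minimal sketch of this next-block method, assuming a block of four addresses and unbounded cache capacity, shows why it helps sequential access but not access with a large offset. The function name `misses_with_next_block` is hypothetical.

```python
def misses_with_next_block(addresses, block_size=4):
    """Count misses when each miss also transfers the neighboring next block.

    Assumption: the cache never evicts anything (unbounded capacity).
    """
    resident = set()  # block numbers currently held in the cache
    misses = 0
    for addr in addresses:
        block = addr // block_size
        if block not in resident:
            misses += 1
            resident.add(block)      # block containing the access address
            resident.add(block + 1)  # next sequential block, e.g. addresses 4~7
    return misses


sequential = list(range(16))           # offset 1
strided = [i * 64 for i in range(16)]  # offset 64

print(misses_with_next_block(sequential))  # 2
print(misses_with_next_block(strided))     # 16
```

With an offset of 64 and a block of four addresses, consecutive accesses are 16 blocks apart, so the neighboring block that was transferred is never the one accessed next.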
Moreover, in this device, the cache memory transfers the next block of data with the first block of data, including data corresponding to the present access address, only when the cache memory does not already store the data corresponding to the present access address. In short, if the cache memory happens to already store the data corresponding to the present access address, the cache memory will not transfer the neighboring data which may be accessed in the future. Therefore, even if the cache memory unexpectedly stores data corresponding to the present access address, it often happens that the cache memory does not store the data corresponding to the next access address when the processor needs to access the next array of data.
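The consequence of transferring the next block only on a miss can be traced with a short sketch. It assumes a block of four addresses, unbounded capacity, and purely sequential access; the name `missed_addresses` is hypothetical.

```python
def missed_addresses(addresses, block_size=4):
    """Return the addresses that miss when the next block is transferred
    only on a miss.

    Assumption: the cache never evicts anything (unbounded capacity).
    """
    resident = set()  # block numbers currently held in the cache
    missed = []
    for addr in addresses:
        block = addr // block_size
        if block in resident:
            continue  # hit: nothing is transferred, not even the neighboring block
        missed.append(addr)
        resident.update((block, block + 1))  # this block and the next one
    return missed


print(missed_addresses(range(16)))  # [0, 8]
```

Address 4 hits because its block was transferred together with the block for address 0, but that hit triggers no further transfer, so address 8 misses when the processor reaches it.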
As also mentioned above, in the prior device, when the processor sequentially accesses the array of data of a matrix from the cache memory, it often happens that the cache memory does not store the array of data whose offset is large, and the cache memory must transfer the array of data from the main memory.