Due to recent increases in processor operating frequencies, the time taken for a processor to access a memory has become longer relative to the processor's cycle time. For this reason, processors are now equipped with small-capacity, high-speed memories called “cache memories” for shortening the access time from the processor to the main storage device. Here, “processor” includes a CPU (central processing unit), a DSP (digital signal processor), a GPU (graphics processing unit), and the like.
A cache memory is positioned at a higher level of the memory hierarchy than the main storage device and holds part of the data stored in the main storage device. When a processor accesses data held in the cache memory (hereinafter referred to as a “cache hit”), the processor can access the data in a shorter time, since the cache memory is built into the processor or is otherwise located closer to the processor than the main storage device. On the other hand, when a processor accesses data not held in the cache memory (hereinafter referred to as a “cache miss”), the data has to be read out from a memory positioned at a lower level than the cache memory, so the access time to the data becomes longer. For this reason, to reduce the occurrence of cache misses, the memory controller of the cache memory operates to hold data that is frequently accessed by the processor in the cache memory and to expel data that is infrequently accessed to a lower level memory.
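The difference in access time between a cache hit and a cache miss can be illustrated with a minimal sketch. The latency values, the function name `access`, and the dictionary-based memories are assumptions chosen only for illustration; they do not describe any particular processor.

```python
# Hypothetical latencies in processor cycles; real values depend on the hardware.
CACHE_LATENCY = 2
MAIN_MEMORY_LATENCY = 100


def access(cache, main_memory, address):
    """Read `address` and return (data, cycles, hit)."""
    if address in cache:
        # Cache hit: the data is served from the cache memory.
        return cache[address], CACHE_LATENCY, True
    # Cache miss: the data is read from the lower level memory
    # and a copy is kept in the cache for later accesses.
    data = main_memory[address]
    cache[address] = data
    return data, CACHE_LATENCY + MAIN_MEMORY_LATENCY, False


main_memory = {addr: addr * 10 for addr in range(8)}
cache = {}
first = access(cache, main_memory, 3)   # miss on the first access
second = access(cache, main_memory, 3)  # hit on the second access
```

In this sketch the first access to address 3 pays both latencies, while the second access pays only the cache latency.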
As an algorithm for preferentially expelling data that has gone unused for a long period to a lower level memory, the “Least Recently Used (LRU)” algorithm is known. When there is no longer any empty space in the cache memory, LRU expels, from among the data held, the data with the longest period of nonuse to a lower level memory.
LRU, for example, stores data indicating the time of last use for each entry of the cache memory and updates that data each time an entry is used. When an entry is updated, it checks the times of use of all entries and determines the least recently used entry. However, checking the times of use of all entries takes time. In particular, in a set associative cache memory, which divides the cache memory into “ways” and assigns a plurality of tag addresses to a single index, the number of cache lines to be tracked is the product of the number of indexes and the number of ways, so the check processing takes even longer.
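The time-of-use bookkeeping described above can be sketched as follows. This is a simplified model under stated assumptions, not a definitive implementation: the class name `SetAssociativeLRU`, the use of a counter as the "time of use", and the mapping of an address to an index by a modulo operation are all hypothetical. The sketch scans only the ways of the selected index when choosing a victim, while the total number of time-of-use records it maintains is the number of indexes multiplied by the number of ways.

```python
class SetAssociativeLRU:
    """Toy set associative cache that keeps a time of use per cache line."""

    def __init__(self, num_sets, num_ways):
        self.num_sets = num_sets
        self.num_ways = num_ways
        # One list of ways per index; None marks an empty way.
        self.sets = [[None] * num_ways for _ in range(num_sets)]
        self.clock = 0  # counter standing in for the time of use

    def access(self, address):
        """Access `address`; return (hit, expelled_address or None)."""
        self.clock += 1
        index = address % self.num_sets
        tag = address // self.num_sets
        ways = self.sets[index]

        # Hit: update the time of use of the matching entry.
        for entry in ways:
            if entry is not None and entry["tag"] == tag:
                entry["last_used"] = self.clock
                return True, None

        # Miss with an empty way: fill it without expelling anything.
        for w, entry in enumerate(ways):
            if entry is None:
                ways[w] = {"tag": tag, "last_used": self.clock}
                return False, None

        # Miss with no empty way: compare the times of use and
        # expel the entry with the longest period of nonuse.
        victim = min(range(self.num_ways), key=lambda w: ways[w]["last_used"])
        expelled = ways[victim]["tag"] * self.num_sets + index
        ways[victim] = {"tag": tag, "last_used": self.clock}
        return False, expelled


lru = SetAssociativeLRU(num_sets=2, num_ways=2)
# Addresses 0, 2 and 4 all map to index 0 in this configuration.
results = [lru.access(a) for a in (0, 2, 0, 4)]
```

Here address 0 is used again before address 4 arrives, so address 2 has the longest period of nonuse and is the one expelled. The comparison step grows with the number of entries examined, which is the cost the passage above points out.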
As a simpler way to identify unused data, a method has been proposed that judges the type of instruction supplied from the processor so as to determine which data is frequently accessed by the processor. When the instruction executed by the processor is a memory access instruction, the data fetched by that instruction is managed with state information indicating that it is highly likely to be referred to subsequently. Further, when the result of processing by an instruction executed by the processor is registered in a cache line, the registered data is managed with state information indicating that it is unlikely to be referred to subsequently.
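The instruction-type-based management can be pictured with the following minimal sketch. The labels `"memory_access"` and `"result_write"`, the `Reuse` enumeration, and the helper `choose_victim` are hypothetical names introduced for illustration. In particular, preferring a line marked as unlikely to be referred to when selecting data to expel is an assumption of this sketch; the passage above states only how the state information is assigned.

```python
from enum import Enum


class Reuse(Enum):
    """State information: possibility of being referred to subsequently."""
    LOW = 0
    HIGH = 1


def state_for(instruction_kind):
    """Assign state information from the type of instruction (labels are hypothetical)."""
    if instruction_kind == "memory_access":
        # Data fetched by a memory access instruction: likely to be referred to again.
        return Reuse.HIGH
    if instruction_kind == "result_write":
        # Result of processing registered in a cache line: unlikely to be referred to again.
        return Reuse.LOW
    raise ValueError(f"unknown instruction kind: {instruction_kind}")


def choose_victim(lines):
    """Return the position of the first line marked LOW, or 0 if none is (assumed policy)."""
    for position, (_address, state) in enumerate(lines):
        if state is Reuse.LOW:
            return position
    return 0


lines = [
    (0x10, state_for("memory_access")),
    (0x20, state_for("result_write")),
    (0x30, state_for("memory_access")),
]
victim = choose_victim(lines)
```

Unlike the comparison of times of use, this selection needs only one item of state information per line, which is why it is a simpler judgment.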
A technique is known for moving data that is held in a cache memory and is infrequently accessed by a processor out to a lower level memory.
[Patent Document 1] Japanese Laid-open Patent Publication No. 2004-038298
[Patent Document 2] Japanese Laid-open Patent Publication No. 2007-272681