This invention relates to a cache control for a processor, and more particularly, to a processor that has a prefetch function for pre-reading data into a cache. This invention also relates to a prefetch function of a multi-core processor including a plurality of processor cores.
In recent years, advances in semiconductor manufacturing techniques have produced finer elements, making it possible to integrate an enormous number of transistors. Along with this advancement, the clock frequency of a processor (CPU) has increased and its arithmetic processing ability has improved remarkably. Meanwhile, in a main memory that stores data and programs, the data transfer rate and the storage capacity have also improved due to finer semiconductor elements.
However, since the improvement in the data transfer rate of the main memory has not kept pace with the improvement in the processing ability of the processor, a processor provided with a cache memory (hereinafter referred to as a cache) on the processor core side is widely used. Since the cache operates at a speed equivalent to that of the processor, the cache can transfer data at a high speed compared with the main memory. The cache, however, has a small capacity compared with the main memory because of constraints on the die size and cost of the processor. The main memory, by contrast, can have a large capacity, but since its data transfer rate depends on the operation speed of a front side bus and a memory bus connected to the processor, that rate is far lower than that of the cache.
In general, when the processor core of the processor reads data, the processor core first accesses the cache. If the data is present in the cache (cache hit), the processor core can read the necessary data from the cache at a high speed. On the other hand, if the data is not present in the cache, the read from the cache fails (cache miss), and the processor core reads the necessary data from the main memory.
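For illustration only, the read path described above can be sketched as a toy model in Python. The direct-mapped organization, the names `ToyCache`, `LINE_SIZE`, and `NUM_LINES`, and the cycle counts are all assumptions made for this sketch and are not taken from the description.

```python
# Toy model of the read path: check the cache first, fall back to main memory on a miss.
# All constants below are illustrative assumptions, not values from the description.
LINE_SIZE = 64      # bytes per cache line (assumed)
NUM_LINES = 4       # lines in this tiny direct-mapped cache (assumed)
HIT_CYCLES = 1      # assumed cost of a cache hit
MISS_CYCLES = 100   # assumed cost of reading from main memory


class ToyCache:
    def __init__(self):
        # One tag per cache line; None means the line is empty.
        self.tags = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return the assumed cycle cost of reading `address`."""
        block = address // LINE_SIZE
        index = block % NUM_LINES
        tag = block // NUM_LINES
        if self.tags[index] == tag:
            self.hits += 1
            return HIT_CYCLES
        # Cache miss: read the line from main memory and keep it in the cache.
        self.misses += 1
        self.tags[index] = tag
        return MISS_CYCLES


cache = ToyCache()
costs = [cache.read(a) for a in (0, 8, 64, 0)]
print(costs, cache.hits, cache.misses)
```

In this sketch the first access to each line pays the assumed main-memory cost, while later accesses to the same line are served from the cache.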
When a cache miss occurs, it takes a long time to read the necessary data from the main memory into the processor core because the data transfer rate of the main memory is extremely low, as described above. Therefore, even in a processor core having high arithmetic processing ability, the pipeline of the processor core is stalled until the data arrives, and the arithmetic processing speed falls. In other words, when a cache miss occurs, the performance of the processor cannot be fully exerted because of the low data transfer rate of the main memory. Moreover, electric power is wasted while the pipeline is stalled.
Thus, in recent years, a processor having a prefetch function for reading necessary data into a cache in advance has become widely known. By using the prefetch function to pre-read into the cache the data needed by a command to be executed, cache misses are avoided and the processing ability of the processor can be fully exerted.
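As a rough illustration of why pre-reading helps, the following Python sketch counts cache misses for a sequence of reads with and without prefetching. The function `count_misses`, the parameter `prefetch_distance`, and the line size are hypothetical names and values chosen for this sketch; the model ignores cache capacity and assumes that every pre-read completes before the data is needed.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)


def count_misses(addresses, prefetch_distance=0):
    """Count cache misses for a sequence of reads.

    With prefetch_distance > 0, each read also pre-reads the line that lies
    that many lines ahead, so it is already cached when the program reaches it.
    """
    cached = set()  # line numbers currently held (capacity limits ignored)
    misses = 0
    for address in addresses:
        line = address // LINE_SIZE
        if line not in cached:
            misses += 1
            cached.add(line)
        if prefetch_distance:
            # Pre-read a later line; assumed to complete before it is needed.
            cached.add(line + prefetch_distance)
    return misses


reads = [i * LINE_SIZE for i in range(8)]  # one read per consecutive line
print(count_misses(reads))                       # without prefetch
print(count_misses(reads, prefetch_distance=1))  # with prefetch
```

Under these assumptions, only the very first read misses when prefetching is enabled, whereas every read misses without it.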
As a prefetch function of a processor of this type, there is known a function in which a prefetch command is embedded in a program (execution code), and when the processor executes the prefetch command, the data at the address designated by the prefetch command is pre-read into the cache. Alternatively, there is also known a processor that determines, from the state of access to the main memory by an execution code, an address from which data is to be pre-read, and performs the pre-reading using hardware (see, for example, JP 2006-18474 A). The latter processor, which executes prefetch using hardware, detects a stride, that is, a pattern in which addresses on the main memory are accessed at regular intervals, determines from the interval between the addresses an address from which data is to be pre-read, and executes pre-reading at the interval of the stride.
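The stride-based approach can be sketched, in a generic form, as follows. This is a minimal Python model of stride detection in general and is not the scheme of JP 2006-18474 A or of any particular processor; the class `StrideDetector` and the `confirmations` threshold are assumptions introduced for illustration.

```python
class StrideDetector:
    """Generic stride detection, not the scheme of any particular processor."""

    def __init__(self, confirmations=2):
        # Number of equal consecutive strides required before pre-reading (assumed).
        self.confirmations = confirmations
        self.last_address = None
        self.last_stride = None
        self.streak = 0

    def observe(self, address):
        """Record one access; return the address to pre-read, or None."""
        prediction = None
        if self.last_address is not None:
            stride = address - self.last_address
            if stride != 0 and stride == self.last_stride:
                self.streak += 1
            else:
                # A new interval starts a new streak; a repeated address resets it.
                self.streak = 1 if stride != 0 else 0
            self.last_stride = stride
            if self.streak >= self.confirmations:
                # The interval is stable: pre-read the next address in the pattern.
                prediction = address + stride
        self.last_address = address
        return prediction


detector = StrideDetector()
accesses = [0x1000, 0x1100, 0x1200, 0x1300, 0x2000]
predictions = [detector.observe(a) for a in accesses]
print([hex(p) if p is not None else None for p in predictions])
```

In this sketch, the detector starts predicting once the same interval (0x100) has been observed twice in a row, and stops as soon as an access breaks the pattern.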