Generally, in an information processing system having a CPU (Central Processing Unit) as a processor, frequently used data is stored in a cache memory separate from the main memory to improve the processing speed of the CPU.
The cache memory, though smaller in capacity than the main memory, can be accessed at a higher speed. The processing speed can therefore be improved by building the cache memory into a CPU having an arithmetic processing unit and by storing in the cache memory the data frequently used by the arithmetic processing unit, replacing that data from time to time.
Also, in order to further improve the processing speed, a plurality of cache memories are hierarchically arranged, and at the time of processing, the arithmetic processing unit of the CPU first accesses the primary cache (hereinafter referred to as “the L1 cache”) accessible at the highest speed.
In the absence of the required data in the L1 cache (hereinafter referred to as “the L1 cache miss”), a demand request (hereinafter referred to as “the DM”) is issued to the secondary cache (hereinafter referred to as “the L2 cache”) to access the data involved in the L1 cache miss.
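The access order described above may be illustrated by the following minimal sketch. It is not taken from any cited publication; the class name `Cache`, the dictionary standing in for the main memory, and the absence of any capacity limit or replacement policy are assumptions made purely for illustration.

```python
class Cache:
    """Toy cache level (hypothetical): maps an address to data, backed by a lower level."""

    def __init__(self, name, lower):
        self.name = name
        self.lower = lower   # next level: another Cache, or a dict standing in for main memory
        self.lines = {}      # simplification: unlimited capacity, no replacement
        self.misses = 0

    def read(self, addr):
        if addr in self.lines:
            return self.lines[addr]          # hit: served at this level
        self.misses += 1
        # Miss: issue a demand request (DM) to the next level down.
        if isinstance(self.lower, Cache):
            data = self.lower.read(addr)
        else:
            data = self.lower[addr]
        self.lines[addr] = data              # keep a copy for later accesses
        return data


main_memory = {addr: addr * 10 for addr in range(8)}  # illustrative contents only
l2 = Cache("L2", main_memory)
l1 = Cache("L1", l2)

first = l1.read(3)    # L1 cache miss, so a DM goes to the L2 cache
second = l1.read(3)   # now served directly from the L1 cache
```

In this sketch the first read misses in both levels, while the second read is satisfied by the L1 cache without any request reaching the L2 cache.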
In order to improve the processing performance of a CPU, a plurality of CPU processor cores (hereinafter referred to as “the CPU cores”), each serving as a processing unit, may be mounted on a single CPU serving as the arithmetic unit.
In recent years, on-chip multicore processors, that is, multicore processors having a plurality of CPU cores mounted on one chip, have come into increasingly wide use.
An on-chip multicore processor is generally so configured that an L1 cache is arranged in each CPU core and one L2 cache is shared by a plurality of CPU cores.
In this configuration with one L2 cache shared by a plurality of CPU cores, however, many accesses are liable to be concentrated on the L2 cache.
In view of this, a cache memory control device has been proposed, as disclosed in Japanese Laid-open Patent Publication No. 2006-40090, for example. In this device, the accesses to the L2 cache are divided into the DM and a prefetch request (hereinafter referred to as “PF”), by which a CPU core predicts the required data and reads it in advance, and the L2 cache is accessed through dedicated ports provided for each CPU core, one for each type of request: the MIPORT (Move In PORT) for the DM and the PFPORT (Pre Fetch Port) for the PF.
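The separation of requests into dedicated ports may be pictured with the following minimal sketch. The class `CorePorts`, the method `issue`, the entry counts, and the addresses are hypothetical and chosen only for illustration; the arbitration performed by the L2 cache among the ports is not modeled.

```python
from collections import deque


class CorePorts:
    """Hypothetical per-core pair of ports in front of a shared L2 cache."""

    def __init__(self, miport_entries, pfport_entries):
        self.miport = deque()                 # holds demand requests (DM)
        self.pfport = deque()                 # holds prefetch requests (PF)
        self.miport_entries = miport_entries  # assumed port capacities
        self.pfport_entries = pfport_entries

    def issue(self, kind, addr):
        """Place a request in the matching port; return False if that port is full."""
        if kind == "DM":
            port, limit = self.miport, self.miport_entries
        else:
            port, limit = self.pfport, self.pfport_entries
        if len(port) >= limit:
            return False
        port.append(addr)
        return True


# Two CPU cores, each with its own MIPORT and PFPORT (sizes are assumptions).
cores = [CorePorts(miport_entries=2, pfport_entries=2) for _ in range(2)]

accepted = [cores[0].issue("PF", a) for a in (0x100, 0x140, 0x180)]
dm_ok = cores[0].issue("DM", 0x200)        # the DM uses the MIPORT, not the full PFPORT
other_core_ok = cores[1].issue("PF", 0x100)
```

Under these assumptions, a full PFPORT in one CPU core rejects a further PF from that core but affects neither the DMs of the same core nor the PFs of another core.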
The conventional technique represented by the aforementioned Japanese Laid-open Patent Publication No. 2006-40090, however, is implemented such that substantially all PFs to the L2 cache are processed, and a PF that has failed to complete, for lack of resources for example, is therefore re-entered into the PFPORT.
When no vacant PFPORT entry is available, a waiting period is incurred until an entry becomes free, which lowers the throughput of the PFs as a whole.
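The effect on throughput may be illustrated by the following simplified cycle-by-cycle sketch. It is a toy model and not the cited device: the function `run_pfs`, the rule that one PF is issued and one PF is processed per cycle, and the fixed number of failures per PF are all assumptions introduced here.

```python
from collections import deque


def run_pfs(num_pfs, pfport_entries, fails_before_success):
    """Return (cycles, stall_cycles) needed to complete num_pfs prefetches.

    Toy model: each PF fails `fails_before_success` times (for example, for
    lack of resources) and is returned to the PFPORT after every failure.
    """
    pending = deque(range(num_pfs))  # PFs the CPU core still wants to issue
    pfport = deque()                 # entries of the form [pf_id, remaining_failures]
    completed = 0
    cycles = 0
    stall_cycles = 0
    while completed < num_pfs:
        cycles += 1
        # The core issues one new PF per cycle only if a PFPORT entry is vacant.
        if pending:
            if len(pfport) < pfport_entries:
                pfport.append([pending.popleft(), fails_before_success])
            else:
                stall_cycles += 1    # no vacant entry: the new PF has to wait
        # The L2 cache processes the oldest PF held in the PFPORT.
        if pfport:
            entry = pfport.popleft()
            if entry[1] > 0:
                entry[1] -= 1
                pfport.append(entry)  # not completed: goes back into the PFPORT
            else:
                completed += 1
    return cycles, stall_cycles


no_failures = run_pfs(num_pfs=4, pfport_entries=2, fails_before_success=0)
with_failures = run_pfs(num_pfs=4, pfport_entries=2, fails_before_success=2)
```

In this model, PFs that fail keep occupying PFPORT entries, so later PFs stall while waiting for a vacant entry and the total number of cycles grows compared with the case in which every PF completes on its first attempt.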