1. Field of the Invention
The present invention relates to a cache memory control unit having a primary cache and a secondary cache, a cache memory control method, a central processing unit, an information processor, and a central processing method.
2. Description of the Related Art
FIG. 6 is a block diagram showing a configuration example of a conventional multiprocessor system. The conventional multiprocessor system includes a main memory 1, an SC (System Controller) 2, and one or more CPUs (Central Processing Units) 103. The SC 2 is a main memory controller connected to the main memory 1 and all CPUs 103. Each CPU 103 internally includes an L1 cache 4, an L2 cache 105, and a calculation unit 6. The L1 cache 4 is a primary cache of the CPU 103, and the L2 cache 105 is a secondary cache thereof. The L1 cache 4 includes an I1 cache 11 as an instruction cache and a D1 cache 12 as a data (operand) cache. The calculation unit 6 performs calculation using data in the L1 cache 4.
As a method for hiding latency in CPU memory access, a prefetch operation has been widely used. With this method, an area of the main memory that is likely to be used is “moved in” beforehand (data of the main memory is registered in a cache in advance), thereby reducing the cache miss rate.
An L2 cache request (request to the L2 cache 105 issued from the L1 cache 4) includes a demand fetch request which is a normal readout operation from the main memory 1 and a prefetch request which is a speculative readout operation. The demand fetch request performs registration of data both in the L2 cache 105 and the L1 cache 4 which is a request source. On the other hand, the prefetch request performs data registration only in the L2 cache 105.
Next, a description will be given of the types of the L2 cache requests. FIG. 7 is a table showing an example of L2 cache requests. Here, a cache block of the I1 cache 11 is managed by I1 TC (Type Code) that assumes two states: V (Valid) and I (Invalid). A cache block of the D1 cache 12 is managed by D1 TC (Type Code) that assumes three states: M (Modified), C (Clean), and I (Invalid). A cache block of the L2 cache 105 is managed by L2 TC (Type Code) that assumes five states: M (Modified), O (Owned), E (Exclusive), S (Shared), and I (Invalid).
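The three sets of type codes listed above can be written down as enumerations. The following is only an illustrative sketch: the class names and the helper `holds_block` are hypothetical, and treating every state other than I as holding a usable cache block is an assumption of this sketch rather than a statement of the description.

```python
from enum import Enum


class I1TC(Enum):
    """Type code of an I1 cache block: two states."""
    V = "V"
    I = "I"


class D1TC(Enum):
    """Type code of a D1 cache block: three states."""
    M = "M"
    C = "C"
    I = "I"


class L2TC(Enum):
    """Type code of an L2 cache block: five states."""
    M = "M"
    O = "O"
    E = "E"
    S = "S"
    I = "I"


def holds_block(type_code: Enum) -> bool:
    """Assumption: any state other than I (Invalid) means the block is present."""
    return type_code.name != "I"
```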
First, the L2 cache requests for the demand fetch request, namely IF-MI-SH, OP-MI-SH, and OP-MI-CH, will be explained.
The I1 cache 11 issues IF-MI-SH (L2 cache request for requesting a shared cache block) when an instruction fetch miss has occurred.
The D1 cache 12 issues OP-MI-SH (L2 cache request for requesting a shared cache block) when an operand load miss has occurred. Further, the D1 cache 12 issues OP-MI-CH (L2 cache request for requesting an exclusive cache block) when an operand store miss has occurred. Further, the D1 cache 12 issues OP-MI-BL (L2 cache request called a block load that does not involve cache registration) for an operand load.
In order to increase L2 cache hit rate at the time of L1 cache miss, L2 cache requests for the prefetch request: IF-PF-SH, OP-PF-SH, and OP-PF-EX are prepared in correspondence with the aforementioned L2 cache requests for the demand fetch request: IF-MI-SH, OP-MI-SH, and OP-MI-CH, respectively.
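The correspondence between demand fetch and prefetch requests, together with the registration rule stated earlier (a demand fetch registers data in both the L2 cache and the requesting L1 cache, a prefetch only in the L2 cache, and a block load in neither), can be summarized as a lookup. This is a minimal sketch; the names `PREFETCH_FOR` and `registration_targets` are hypothetical and do not appear in FIG. 7.

```python
# Demand fetch request -> corresponding prefetch request, as paired in the text.
PREFETCH_FOR = {
    "IF-MI-SH": "IF-PF-SH",
    "OP-MI-SH": "OP-PF-SH",
    "OP-MI-CH": "OP-PF-EX",
}


def registration_targets(request: str) -> tuple:
    """Return the caches in which the requested block is registered."""
    if request in PREFETCH_FOR:            # demand fetch request
        return ("L2", "L1")
    if request in PREFETCH_FOR.values():   # prefetch request
        return ("L2",)
    if request == "OP-MI-BL":              # block load: no cache registration
        return ()
    raise ValueError(f"unknown L2 cache request: {request}")
```

For example, `registration_targets("OP-PF-EX")` yields only the L2 cache, whereas its demand counterpart `OP-MI-CH` yields both caches.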
The L1 cache 4 is not involved at all in a prefetch request after issuing it. The L2 cache 105 can discard a prefetch request when it is difficult to perform prefetch processing, for example, in the case where there are many requests that have not been processed.
An operation between the L2 cache 105 and the SC 2 will next be described. The L2 cache 105 sends reply data to the L1 cache 4 when the L2 cache request is a cache hit. On the other hand, the L2 cache 105 issues a move-in request P-Req to the SC 2 when the L2 cache request is a cache miss, and receives data S-Reply as a reply to the P-Req. The data S-Reply is then sent to the L1 cache 4 and is registered also in the L2 cache 105 at the same time.
When P-Req from the L2 cache 105 to the SC 2 is P-RDSA (robust shared type), S-RBS (shared type) is returned as S-Reply from the SC 2 to the L2 cache 105. When P-Req is P-RDS (shared type), S-RBS (shared type) or S-RBU (exclusive type) is returned as S-Reply. When P-Req is P-RDO (exclusive type), S-RBU is returned as S-Reply.
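The P-Req and S-Reply pairs described above amount to a small table. The sketch below encodes only the three request types named in this paragraph; `POSSIBLE_REPLIES` and `is_expected_reply` are hypothetical names chosen for illustration.

```python
# Move-in request (P-Req) -> set of replies (S-Reply) the SC may return.
POSSIBLE_REPLIES = {
    "P-RDSA": frozenset({"S-RBS"}),           # shared type reply only
    "P-RDS": frozenset({"S-RBS", "S-RBU"}),   # shared or exclusive type reply
    "P-RDO": frozenset({"S-RBU"}),            # exclusive type reply only
}


def is_expected_reply(p_req: str, s_reply: str) -> bool:
    """Check whether s_reply is one of the replies described for p_req."""
    return s_reply in POSSIBLE_REPLIES[p_req]
```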
When the L2 cache request is OP-MI-CH, registration is performed in the M (Modified) state even when the reply cache block is a clean one (write back to the memory is not necessary).
When the L2 cache request OP-MI-BL results in an L2 cache miss, P-RDD (discard type request that does not involve invalidation of other cache blocks or data-sharing) is issued from the L2 cache 105 to the SC 2.
A configuration of the L2 cache 105 will next be described. FIG. 8 is a block diagram showing a configuration example of the conventional L2 cache. The L2 cache 105 is constituted by an L2 cache controller 120 and a data RAM 29. The L2 cache controller 120 includes an MI-PORT 21, a PF-PORT 22, other ports 23, 24, a priority section 25, a tag processor 26, a processing pipeline 127, and an MIB (Move-in Buffer) 128.
An L2 cache request to the L2 cache 105 is first received by whichever of the ports 21, 22, 23, and 24 is associated with that type of request. The priority section 25 fetches the L2 cache requests remaining in the ports 21, 22, 23, and 24 and feeds the fetched requests into the processing pipeline 127. The processing pipeline 127 performs tag-search and tag-update operations in the tag processor 26, issues a reply to the L1 cache 4 in response to the L2 cache request, ensures (allocates) the MIB 128, and the like. The MIB 128 is a buffer for receiving cache miss data from the SC 2 and is ensured for each move-in processing.
The MIB 128 is ensured at the time of a cache miss, then holds tag information, and is released upon completion of the corresponding move-in processing. The tag information includes a move-in request source, a move-in request address, a write back address to be replaced, and the like. The MIB 128 has a section that notifies the L1 cache 4 of move-in data arrival from the SC 2 and of an abnormal end of the process for the move-in request address. Examples of a notification signal to the L1 cache 4 include a data valid signal, an error notification signal, and the like. The data RAM 29 stores data from the SC 2. Data read from the data RAM 29 as a result of an L2 cache hit is sent to the L1 cache 4. Data from the SC 2 obtained as a result of an L2 cache miss is sent to the L1 cache 4 and, at the same time, registered in the data RAM 29.
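The life cycle of an MIB entry (ensured on a cache miss, holding tag information, released when the move-in completes) can be sketched as a small data structure. This is an illustration only: the class names, the field types, and the default of four entries are hypothetical, and the notification signals to the L1 cache are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MIBEntry:
    source: str                              # move-in request source
    address: int                             # move-in request address
    writeback_address: Optional[int] = None  # write back address to be replaced


class MoveInBuffer:
    def __init__(self, size: int = 4):       # entry count is a hypothetical value
        self.entries: List[Optional[MIBEntry]] = [None] * size

    def ensure(self, source: str, address: int,
               writeback_address: Optional[int] = None) -> Optional[int]:
        """Take a free entry on a cache miss; return its index, or None if full."""
        for index, entry in enumerate(self.entries):
            if entry is None:
                self.entries[index] = MIBEntry(source, address, writeback_address)
                return index
        return None

    def release(self, index: int) -> None:
        """Free the entry once the corresponding move-in processing completes."""
        self.entries[index] = None
```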
A normal demand fetch operation will next be described. FIG. 9 is a time chart showing an example of an operation of the L2 cache for a normal demand fetch request. The demand fetch request issued from the L1 cache 4 is stored in the MI-PORT 21, fetched by the priority section 25, and fed to the processing pipeline 127. Since the tag search performed in the read flow results in a cache miss in the case of FIG. 9, the MIB 128 is ensured and a move-in request is issued to the SC 2. As a rule, move-in data from the SC 2 is immediately sent to the L1 cache 4 and at the same time registered in the L2 cache 105.
A normal prefetch operation will next be described. FIG. 10 is a time chart showing an example of an operation of the L2 cache for a normal prefetch request. The prefetch request issued from the L1 cache 4 is stored in the PF-PORT 22, fetched by the priority section 25, and fed to the processing pipeline 127. Since the tag search results in a cache miss in the case of FIG. 10, the MIB 128 is ensured and a move-in request is issued to the SC 2. Move-in data from the SC 2 is registered only in the L2 cache 105, and the operation ends.
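The difference between the two normal flows can be illustrated with a functional sketch in which each cache is a dictionary keyed by block address. Ports, the priority section, the pipeline, and the MIB are deliberately left out, and a prefetch that hits in the L2 cache is assumed to do nothing further; both simplifications are assumptions of this sketch, and the function name `l2_request` is hypothetical.

```python
def l2_request(kind, address, l1, l2, main_memory):
    """Process one L2 cache request and return "hit" or "miss".

    kind is "demand" or "prefetch"; l1, l2 and main_memory map a block
    address to its data.
    """
    if kind not in ("demand", "prefetch"):
        raise ValueError(f"unknown request kind: {kind}")

    if address in l2:
        result = "hit"
    else:
        # Cache miss: move-in from the SC, registered in the L2 cache.
        l2[address] = main_memory[address]
        result = "miss"

    if kind == "demand":
        # Only a demand fetch also delivers the data to the requesting L1 cache.
        l1[address] = l2[address]
    return result
```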
As a reference of the related art, Japanese Patent Application Laid-Open No. 2-133842 (pages 4 to 6, FIG. 2) is known.
In the aforementioned conventional L2 cache, a second move-in request for the same address is not issued during the time between issuance of a move-in request to the SC 2 and arrival of the reply from the SC 2. To this end, when a request fed to the processing pipeline 127 has the same memory address as an MIB that is in a move-in stand-by state, processing of the request in the processing pipeline 127 is aborted and the request is left in its port.
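The address comparison described above can be sketched as a predicate. Comparing at cache block granularity and the 64-byte block size are assumptions made for illustration, and the function names are hypothetical.

```python
BLOCK_SIZE = 64  # bytes per cache block; hypothetical value for illustration


def block_of(address: int) -> int:
    """Return the index of the cache block containing the address."""
    return address // BLOCK_SIZE


def must_remain_in_port(request_address: int, waiting_mib_addresses) -> bool:
    """True if an MIB waiting for move-in data covers the requested block,
    in which case the request is not processed and stays in its port."""
    target = block_of(request_address)
    return any(block_of(a) == target for a in waiting_mib_addresses)
```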
FIG. 11 is a time chart showing an example of an operation of the conventional L2 cache in the case where a demand fetch request is issued immediately after issuance of a prefetch request. The prefetch and demand fetch requests are made for an address on the same cache block. The preceding prefetch request is fetched from the PF-PORT 22 and fed to the processing pipeline 127. Since the result of a tag search is a cache miss in the case of FIG. 11, the MIB 128 is ensured and a move-in request is issued to the SC 2. The succeeding demand fetch request is fetched from the MI-PORT 21. In this case, the request address is in a waiting state and processing does not proceed, with the result that the demand fetch request remains in the MI-PORT 21. After that, move-in data arrives as a reply to the preceding prefetch request, the cache block is registered in the tag, and the MIB 128 is released. Only after processing of the preceding prefetch request is complete does the succeeding demand fetch request proceed, now as a cache hit.
That is, although the demand fetch request has been issued from the L1 cache 4 and the move-in data has arrived from the SC 2, the data cannot be directly passed to the L1 cache 4 since the move-in data is a reply to the prefetch request, with the result that the data is registered only in the L2 cache 105. When the demand fetch request is fed to the processing pipeline 127 after completion of the move-in for the prefetch request, the data that has been registered in the L2 cache 105 is sent to the L1 cache 4. Therefore, the demand fetch processing incurs waiting time until the prefetch processing is completed.
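The waiting time can be illustrated with a simple timing model. All cycle counts are hypothetical, and the model assumes that one pass through the processing pipeline takes a fixed time and that the move-in data is registered as soon as it arrives; neither assumption is taken from FIG. 11.

```python
def move_in_done(prefetch_issue: int, sc_latency: int, pipeline_latency: int) -> int:
    """Time at which the prefetch move-in data arrives and the MIB is released."""
    return prefetch_issue + pipeline_latency + sc_latency


def conventional_demand_completion(prefetch_issue: int, demand_issue: int,
                                   sc_latency: int, pipeline_latency: int) -> int:
    """Time at which the L1 cache receives data for a demand fetch that
    follows a prefetch to the same cache block (conventional behavior)."""
    done = move_in_done(prefetch_issue, sc_latency, pipeline_latency)
    if demand_issue >= done:
        # The prefetch has already completed: an ordinary L2 cache hit.
        return demand_issue + pipeline_latency
    # The demand fetch stays in the MI-PORT until the MIB is released,
    # is then fed to the pipeline again, and only then hits in the L2 cache.
    return done + pipeline_latency
```

With a prefetch issued at time 0, a demand fetch at time 2, an SC latency of 20, and a pipeline latency of 5, the move-in data arrives at time 25 but the L1 cache does not receive it until time 30 in this model; the difference is the waiting time described above.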