In the field of computer systems, a cache memory technique is widely used to speed up data access by a processor. The cache memory is a memory that is smaller in capacity but higher in speed than a main memory, and is provided between the processor and the main memory. Cache memories may be provided hierarchically; however, for simplicity, the following description assumes one cache memory, with a main memory as the lower layer of the hierarchical structure. The same discussion applies even if a plurality of cache memories are hierarchically provided.
The cache memory stores a part of the data stored in the main memory. More specifically, data is stored in the cache memory in a unit referred to as a “line (or block)”. That is, the cache memory has a plurality of lines, and data for the line size is stored in each of the lines. The line size is the data size per line, and is 32 bytes, for example. When data stored in the main memory is transferred to the cache memory, block data for the line size, including that data, is copied from the main memory into a line of the cache memory.
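The relationship between a byte address and the block that is copied into a line can be sketched as follows. This is only an illustration: the function names are hypothetical, and the sketch assumes that blocks are aligned to multiples of the line size, which the text does not state explicitly.

```python
LINE_SIZE = 32  # bytes per line, the example value given in the text


def block_base(address: int, line_size: int = LINE_SIZE) -> int:
    """Return the first byte address of the block containing `address`.

    Assumes blocks are aligned to multiples of the line size.
    """
    return address - (address % line_size)


def block_range(address: int, line_size: int = LINE_SIZE) -> tuple:
    """Return the (first, last) byte addresses copied into a line."""
    base = block_base(address, line_size)
    return base, base + line_size - 1
```

Under these assumptions, an access to address 0x47 causes the 32 bytes from 0x40 through 0x5F to be copied into one line.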
When the processor issues a data access instruction, the cache memory examines whether or not data as an access target is stored in any of the lines. A case where the data as the access target is stored in one of the lines is hereinafter referred to as a “cache hit”. On the other hand, a case where the data as the access target is not stored in any of the lines is hereinafter referred to as a “cache miss”.
Processing upon data read is as follows: In the case of the cache miss, data is read from the main memory, and then sent to the processor. Also, block data for the line size including the data is copied into a line of the cache memory. On the other hand, in the case of the cache hit, data is read from a corresponding line of the cache memory, and then sent to the processor. That is, no access to the main memory occurs, and the data is read from the higher speed cache memory. Accordingly, a data reading speed is improved.
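The read processing described above can be sketched as a small model. The class name, the dictionary used to hold lines, and the read counter are assumptions made for illustration; line replacement and limited capacity are deliberately left out.

```python
LINE_SIZE = 32  # bytes per line, the example value given in the text


class ReadCache:
    """Toy model of read handling; capacity and replacement are not modeled."""

    def __init__(self, main_memory: bytearray):
        self.main_memory = main_memory
        self.lines = {}              # block base address -> copy of the block
        self.main_memory_reads = 0   # number of block reads from main memory

    def read(self, address: int) -> int:
        base = address - (address % LINE_SIZE)
        line = self.lines.get(base)
        if line is None:
            # Cache miss: read from main memory and copy the whole block
            # containing the requested data into a line.
            self.main_memory_reads += 1
            line = bytearray(self.main_memory[base:base + LINE_SIZE])
            self.lines[base] = line
        # Cache hit (or freshly filled line): serve the data from the line.
        return line[address - base]
```

In this model, a second read within the same 32-byte block returns its data without increasing `main_memory_reads`, which corresponds to the improvement in reading speed.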
Regarding processing upon data write, various systems are proposed as illustrated in FIG. 1. The data write system is roughly classified into two systems, i.e., a “write-through system” and a “write-back system”. The write-back system is further classified into a “non-write allocate system” and a “write allocate system”.
Processing in the write-through system is as follows: In the case of the cache miss, write data is not written in the cache memory, but written only in the main memory. On the other hand, in the case of the cache hit, the write data is written in a corresponding line of the cache memory, and also in the main memory. Accordingly, in the case of the write-through system, the benefit of the cache memory is gained only upon data read.
Processing in the write-back system is as follows: In the case of the cache hit, write data is not written in the main memory, but written only in a corresponding line of the cache memory. Accordingly, a data writing speed is improved. It should be noted that the latest data that is stored only in the cache memory and is not yet reflected in the main memory is written back into the main memory at some later time. In the case of the cache miss, processing differs between the non-write allocate system and the write allocate system. In the non-write allocate system, similarly to the write-through system, the write data is not written in the cache memory but written only in the main memory. On the other hand, in the write allocate system, block data for the line size including the data as the access target is read from the main memory. The read block data is stored in a line of the cache memory, and then the write data is written in that line. As described, in the case of the write allocate system, the block data must be transferred from the main memory to the cache memory, and therefore processing takes longer than in the non-write allocate system. However, from the viewpoint of locality (a tendency to access successive addresses in the main memory continuously, or to access the same address repeatedly in a short period of time), the write allocate system is expected to increase the probability of a cache hit upon a subsequent request to write data.
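The write processing of the write-through system and of the two write-back variants can be compared in a small model. The class, the policy strings, and the explicit `write_back` method are hypothetical names introduced for illustration; the text only says that modified data is written back "at some time", so the sketch leaves the timing to the caller.

```python
LINE_SIZE = 32  # bytes per line, the example value given in the text
POLICIES = ("write_through", "non_write_allocate", "write_allocate")


class WriteCache:
    """Toy model of write handling; capacity and replacement are not modeled."""

    def __init__(self, main_memory: bytearray, policy: str):
        if policy not in POLICIES:
            raise ValueError("unknown policy: " + policy)
        self.main_memory = main_memory
        self.policy = policy
        self.lines = {}     # block base address -> copy of the block
        self.dirty = set()  # blocks newer in the cache than in main memory

    def write(self, address: int, value: int) -> bool:
        """Write one byte and return True on a cache hit."""
        base = address - (address % LINE_SIZE)
        line = self.lines.get(base)
        hit = line is not None
        if self.policy == "write_through":
            if hit:
                line[address - base] = value
            self.main_memory[address] = value   # always written to main memory
        elif hit:
            line[address - base] = value        # write-back hit: cache only
            self.dirty.add(base)
        elif self.policy == "non_write_allocate":
            self.main_memory[address] = value   # miss: main memory only
        else:
            # Write allocate miss: fetch the block, then write into the line.
            line = bytearray(self.main_memory[base:base + LINE_SIZE])
            self.lines[base] = line
            line[address - base] = value
            self.dirty.add(base)
        return hit

    def write_back(self) -> None:
        """Copy every modified line back into main memory."""
        for base in sorted(self.dirty):
            self.main_memory[base:base + LINE_SIZE] = self.lines[base]
        self.dirty.clear()
```

With `"write_allocate"`, a second write to the same block hits in the cache, whereas with `"non_write_allocate"` it misses again because no line was filled by the first write.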
As described, the write allocate system and the non-write allocate system each have advantages and disadvantages. Determining which one of the write allocate and non-write allocate systems is employed is therefore important from the viewpoint of system processing efficiency.
According to Japanese Patent Application Publication (JP-A-Heisei 11-312123: first conventional example), a user can assign either the write allocate system or the non-write allocate system. Specifically, a cache controller has a register in which information assigning the write allocate system or the non-write allocate system is stored. The user can rewrite the content of the register to thereby assign a system.
A cache control unit described in Japanese Patent Application Publication (JP-A-Heisei 7-152650: second conventional example) includes a cache memory of a write-back system, a register, a comparator, and a control circuit. When a cache miss occurs upon a request to write data, the block address of the block data including the data is stored in the register. Upon a subsequent request to write data, the comparator compares the block address in the register with the block address to be currently accessed. The control circuit determines, on the basis of a result of the comparison by the comparator, a processing method upon the cache miss. Specifically, in a case that the comparison result indicates a hit, the control circuit performs processing in the write allocate system. On the other hand, in a case that the comparison result indicates a miss, the control circuit performs the processing in the non-write allocate system, and also updates the register. That is, upon the first request to write data in a block in the memory, the processing is performed in the non-write allocate system. If a subsequent request to write data is a request to the same block, the corresponding block data in the memory is copied into the cache memory by the write allocate system for the first time. Requests to write data in the block are expected to continue, and therefore a cache hit is expected upon a third or subsequent request to write data.
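The decision sequence of the second conventional example can be sketched as follows. This is a behavioral illustration only, not the circuit of the cited publication: the class name, the returned strings, and the use of a set to track cached blocks are assumptions, and the data contents are not modeled.

```python
LINE_SIZE = 32  # bytes per line, the example value given in the text


class RegisterCompareController:
    """Chooses a write-miss policy by comparing block addresses."""

    def __init__(self):
        self.register = None        # block address of the last unmatched miss
        self.cached_blocks = set()  # blocks currently held in the cache

    def write(self, address: int) -> str:
        block = address // LINE_SIZE
        if block in self.cached_blocks:
            return "hit"
        if block == self.register:
            # Comparator match: the same block missed on an earlier write,
            # so the block is now copied into the cache.
            self.cached_blocks.add(block)
            return "miss: write allocate"
        # Comparator mismatch: write to main memory only, update the register.
        self.register = block
        return "miss: non-write allocate"
```

Three successive writes to one block therefore yield, in order, a non-write allocate miss, a write allocate miss, and a hit.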
Japanese Patent Application Publication (JP-A-Heisei 7-210463: third conventional example) discloses a cache memory system including a first cache memory and a second cache memory. Upon a cache miss in the first cache memory for a store (write) instruction, whether or not a block transfer is performed from the second cache memory to the first cache memory according to the write allocate system depends on the situation. To determine whether or not the block transfer (write allocate) is performed, a determining section is provided. The determining section prohibits the block transfer only for an operation of continuously rewriting all of the cache data included in a single line, and permits the block transfer in all other cases. As an example, it is assumed that one line includes four pieces of cache data. An instruction buffer register includes an instruction prefetch queue having four stages in series. The determining section receives in parallel the data retained by the respective stages of the serial four-stage instruction prefetch queue. Further, the determining section detects whether each of the instructions corresponds to “store (write)” or “load (read)”, and also detects whether or not the objects to be accessed on the basis of the respective instructions are in the same block. Then, in a case of continuous store instructions by which all of the cache data in the single line are continuously rewritten, the determining section prohibits the write allocate. For example, in a case of “four continuous store instructions” as illustrated in FIG. 10 of the third conventional example, the determining section prohibits the write allocate. On the other hand, in a case of “1-store-3-load instructions”, in which only one store is performed, the determining section permits the write allocate.
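The judgment made by the determining section can be sketched as a function over the four queued instructions. The `Instruction` tuple, its fields, and the function name are hypothetical; in particular, representing each access by a block number and a word index within the line is an assumption made so that "rewriting the whole line" can be checked.

```python
from typing import NamedTuple, Sequence

WORDS_PER_LINE = 4  # the example assumes four pieces of cache data per line


class Instruction(NamedTuple):
    op: str     # "store" or "load"
    block: int  # block (line) that the instruction accesses
    word: int   # index of the accessed data within the line, 0..3


def permits_write_allocate(queue: Sequence[Instruction]) -> bool:
    """Return False only when the queue rewrites one whole line by stores."""
    if len(queue) != WORDS_PER_LINE:
        raise ValueError("expected one instruction per queue stage")
    all_stores = all(inst.op == "store" for inst in queue)
    same_block = len({inst.block for inst in queue}) == 1
    whole_line = {inst.word for inst in queue} == set(range(WORDS_PER_LINE))
    return not (all_stores and same_block and whole_line)
```

Under these assumptions, four stores covering words 0 to 3 of one block prohibit the write allocate, while one store followed by three loads permits it.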
In a cache memory of the write-back system, it is important from the view of system processing efficiency to determine which one of the write allocate system and the non-write allocate system is used.
In the above-described conventional examples, it is necessary to detect continuous instructions to write data in the same block, or an operation of continuously rewriting all of the cache data included in a single line. However, in the case of a scalar processor, it is generally difficult to predict an address to be accessed after a data write request. Accordingly, to detect the continuous instructions to write data in the same block, or the operation of continuously rewriting all of the cache data included in a single line, a complicated configuration and complicated processing as described in the above conventional examples are required.