Conventionally, either a store-through method (also referred to as a write-through method) or a store-in method (also referred to as a write-back method) is used to control the cache memory of a processor, that is, an arithmetic processing unit installed in an information processing device such as a server. These methods will be described using an exemplary structure including a main memory unit connected to a processor and a two-level cache memory, consisting of a secondary cache memory and a primary cache memory, that is integrated in the processor.
When the store-through method is used in the processor, each time data is written in the secondary cache memory in the processor, the data is also written in the main memory unit. This causes frequent access to the main memory unit, which has a longer access time than the secondary cache memory. Therefore, in the processor that uses the store-through method, writing to the secondary cache memory, which is faster than the main memory unit, must always wait for completion of the writing to the main memory unit, and this slows down the writing to the secondary cache memory.
When the store-in method is used in the processor, data is written only in the primary or secondary cache memory when a store instruction is executed and is not written in the main memory unit. Therefore, with the store-in method, when data is to be stored in a cache line in the secondary cache memory in which other data is present, the data registered in the cache line must be evacuated. At this time, the processor writes the data held in the cache line into the main memory unit, invalidates the cache line, and then registers the new data in the invalidated cache line. In this manner, the processor that uses the store-in method can write data written in a cache line into the main memory unit. In addition, the processor can complete writing to the secondary cache memory without waiting for the completion of writing to the main memory unit.
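The difference between the two methods can be illustrated with a minimal sketch. The model below is an assumption made for illustration only: the class name `TinyCache`, the helper `run`, and the single-line cache are hypothetical, and the fetch of a target line on a store miss is deliberately ignored. It only counts how many writes reach the main memory unit under each method.

```python
class TinyCache:
    """One-line cache in front of a dict-based main memory (illustrative only)."""

    def __init__(self, memory, write_through):
        self.memory = memory              # main memory unit: address -> data
        self.write_through = write_through
        self.tag = None                   # address currently held in the line
        self.data = None
        self.dirty = False
        self.memory_writes = 0            # writes that reached the main memory

    def _write_memory(self, address, data):
        self.memory[address] = data
        self.memory_writes += 1

    def store(self, address, data):
        if self.tag is not None and self.tag != address and self.dirty:
            # Store-in: the old data is evacuated (written back) on replacement.
            self._write_memory(self.tag, self.data)
        self.tag, self.data = address, data
        if self.write_through:
            # Store-through: every store is also written to the main memory.
            self._write_memory(address, data)
            self.dirty = False
        else:
            # Store-in: the main memory is updated only when the line is evacuated.
            self.dirty = True


def run(write_through):
    cache = TinyCache({}, write_through)
    for value in ("v1", "v2", "v3"):
        cache.store(0x1000, value)        # three stores to the same line
    cache.store(0x1080, "w")              # a different address replaces the line
    return cache.memory_writes, dict(cache.memory)


store_through_result = run(write_through=True)
store_in_result = run(write_through=False)
```

In this sketch, the store-through run writes to the main memory on every store, whereas the store-in run writes only once, when the line holding address 0x1000 is replaced.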
However, a processor that uses the store-in method also performs processes that write data to a continuous area in the main memory unit, namely “when the main memory unit is initialized” or “when data in one address in the main memory unit is copied to another address therein.” An example of “copying data from one address in the main memory unit to another address therein” is illustrated in FIG. 12. FIG. 12 is a diagram illustrating the example of copying data from one address in the main memory unit to another address therein. As illustrated in FIG. 12, “to copy data from one address in the main memory unit 200 to another address therein” is, for example, to copy data A from address 0x1000 in the main memory unit 200 to other addresses 0x1080, 0x1100, and 0x1180. In other words, data in one area in the main memory unit 200 is copied to another area.
In these cases, the number of accesses to the main memory unit, which is slower than the cache memories, is lower in the store-through method than in the store-in method, and therefore higher-speed processing may be achieved with the store-through method.
In the description below, the data unit used when the main memory unit is accessed is assumed to be, for example, 64 bytes. In a processor that uses the store-through method, “when the main memory unit is initialized,” 64-byte initialization data is directly written to the main memory unit to be initialized, and therefore the main memory unit is accessed one time. “When data stored in one address in the main memory unit is copied to another address therein,” the processor accesses the main memory unit to fetch 64-byte data therefrom and accesses the main memory unit to write the 64-byte data thereto. Therefore, the processor accesses the main memory unit two times.
In a processor that uses the store-in method, store data is written only to the cache memory. Therefore, before the data is written to the cache memory, the main storage area to which the data is to be written must be registered in the cache memory in advance. Consequently, when “the main memory unit is initialized,” the processor that uses the store-in method accesses the main memory unit two times. More specifically, the processor first accesses the main memory unit to fetch 64-byte data from the main storage area to be initialized, since that area must be registered in the cache memory in advance. The processor then accesses the main memory unit again to write the 64-byte data written in the cache memory to the main memory unit.
When “data stored in one address in the main memory unit is copied to another address therein,” the processor that uses the store-in method accesses the main memory unit three times. More specifically, the processor accesses the main memory unit to fetch 64-byte source data therefrom and also accesses the main memory unit to fetch 64-byte data from the target area, which must be registered in the cache memory in advance. In addition, the processor accesses the main memory unit to write the 64-byte source data written in the cache memory to the main memory unit.
Referring to FIG. 13, a description is given of an example of “copying data stored in one address in the main memory unit to another address therein” using the store-in method. FIG. 13 is a diagram illustrating an example of copying data stored in one address in a conventional main memory unit to another address therein. In FIG. 13, the example of copying data A stored in address 0x1000 in the main memory unit 200 to address 0x1080 is described.
As illustrated in FIG. 13, the processor 300 first loads the data A from source address 0x1000 and registers the data A in address 0x1000 in a primary cache memory 310 and in address 0x1000 in a secondary cache memory 320. Next, the processor 300 executes a store instruction to write the data to target address 0x1080. More specifically, the processor 300 loads data B from address 0x1080 in the main memory unit 200 and registers data B in address 0x1080 in the primary cache memory 310 and in address 0x1080 in the secondary cache memory 320. Then the processor 300 registers data A in address 0x1080 in the primary cache memory 310 and in address 0x1080 in the secondary cache memory 320, overwriting data B. Next, the processor 300 executes a store-in operation (write-back operation) to write data A registered in address 0x1080 in the secondary cache memory 320 to address 0x1080 in the main memory unit 200. As described above, with the store-in method, the main memory unit 200 is accessed three times when data is copied within the main memory unit 200.
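The sequence of FIG. 13 can be traced with a short sketch. This is a simplified model under stated assumptions: the function name `copy_with_store_in` is hypothetical, a single dictionary stands in for the primary and secondary cache memories together, and the write-back is performed immediately rather than on a later replacement.

```python
def copy_with_store_in(memory, source, target):
    """Trace main-memory accesses for one copy under the store-in method."""
    accesses = []
    cache = {}  # stands in for the primary and secondary cache memories together

    # 1. Load the source data; the line is fetched from the main memory unit.
    cache[source] = memory[source]
    accesses.append(("read", source))

    # 2. The store misses in the cache, so the target line (data B) is fetched first.
    cache[target] = memory[target]
    accesses.append(("read", target))

    # 3. The store overwrites the cached target line with data A (no memory access).
    cache[target] = cache[source]

    # 4. Write-back: the modified line is written to the main memory unit.
    memory[target] = cache[target]
    accesses.append(("write", target))
    return accesses


memory = {0x1000: "A", 0x1080: "B"}
trace = copy_with_store_in(memory, 0x1000, 0x1080)
```

The resulting `trace` holds three entries (two reads and one write), which corresponds to the three accesses to the main memory unit 200 in the description of FIG. 13.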
As described above, in the processor that uses the store-in method, the number of accesses to the main memory unit when the main memory unit is initialized is twice that in the store-through method, and the number of accesses to the main memory unit when data is copied is 1.5 times that in the store-through method. In addition, when a predetermined amount of data is processed, the time taken for data processing increases in proportion to the number of accesses to the main memory unit. Therefore, to complete the data processing in a short time, that is, to achieve high-speed data processing, it is important to reduce the number of accesses to the main memory unit.
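The access counts stated above and their ratios can be tallied as follows. This is only an arithmetic sketch: the names `ACCESSES_PER_BLOCK`, `access_ratio`, and `total_accesses` are hypothetical, and the 64-byte data unit is the example value assumed in the description.

```python
from fractions import Fraction

BLOCK_BYTES = 64  # data unit per main-memory access, as assumed in the description

# Main-memory accesses per 64-byte block, taken from the description above.
ACCESSES_PER_BLOCK = {
    "initialize": {"store-through": 1, "store-in": 2},
    "copy": {"store-through": 2, "store-in": 3},
}


def access_ratio(operation):
    """Ratio of store-in accesses to store-through accesses for one operation."""
    counts = ACCESSES_PER_BLOCK[operation]
    return Fraction(counts["store-in"], counts["store-through"])


def total_accesses(operation, method, total_bytes):
    """Total main-memory accesses needed to process total_bytes of data."""
    blocks = -(-total_bytes // BLOCK_BYTES)  # ceiling division
    return blocks * ACCESSES_PER_BLOCK[operation][method]


init_ratio = access_ratio("initialize")   # Fraction(2, 1)
copy_ratio = access_ratio("copy")         # Fraction(3, 2)
```

For example, initializing 1 MiB covers 16,384 blocks, so `total_accesses("initialize", "store-in", 1 << 20)` is 32,768 accesses, compared with 16,384 for the store-through method.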
In recent years, a block store instruction, which is an instruction for writing a store block (for example, a 64-byte data block) directly into a main memory unit in one instruction, has been used as a technique for reducing the number of accesses to the main memory unit in a processor using the store-in method. For example, a processor that uses the store-in method uses the block store instruction when “the main memory unit is initialized” or when “data stored in one address in the main memory unit is copied to another address therein.” When the block store instruction is executed, the processor that uses the store-in method writes the data to a cache memory if the target writing area in the main memory unit has been registered in the cache memory. If the target writing area in the main memory unit is not registered in the cache memory, however, the processor writes the data directly to the main memory unit.
The use of the block store instruction can omit one access to the main memory unit (i.e., data reading from the main memory unit to register, in the cache memory, the target writing area in the main storage area) that is executed when the store-in method is used.
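The saving can be sketched by listing the main-memory accesses with and without the block store instruction. The function name `store_in_accesses` and the step labels are hypothetical, and the sketch assumes that the target writing area is not registered in the cache memory when the store is executed.

```python
def store_in_accesses(operation, use_block_store):
    """List main-memory accesses for one 64-byte block under the store-in method.

    Assumes the target writing area is not registered in the cache memory.
    """
    if operation not in ("initialize", "copy"):
        raise ValueError("operation must be 'initialize' or 'copy'")

    accesses = []
    if operation == "copy":
        accesses.append("fetch source data")
    if not use_block_store:
        # A normal store first registers the target area in the cache memory.
        accesses.append("fetch target area into cache")
        accesses.append("write back cache line to target")
    else:
        # A block store writes the store block directly to the main memory unit.
        accesses.append("write store block directly to target")
    return accesses


copy_normal = store_in_accesses("copy", use_block_store=False)
copy_block = store_in_accesses("copy", use_block_store=True)
```

Under these assumptions, a copy needs three accesses with a normal store and two with the block store, and an initialization needs two and one, respectively.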
Patent Document 1: Japanese Laid-open Patent Publication No. 2000-76205
Patent Document 2: Japanese Laid-open Patent Publication No. 10-301849
Patent Document 3: Japanese Laid-open Patent Publication No. 2003-29967
However, the conventional technique has a problem in that initialization of the main memory unit or copying of data from one address in the main memory unit to another address therein may not be performed at high speed in some cases even when the block store instruction is used. More specifically, in a processor that uses the store-in method, execution of the block store instruction allows high-speed processing when the data width of a cache line matches the data width of the accessed main memory unit. However, in the processor that uses the store-in method, a high-speed operation may not be achieved when the data width of the cache line is, for example, 128 bytes and the data width of the main memory unit is, for example, 64 bytes.
For example, assume that, in a processor including a plurality of processor cores and a plurality of primary cache memories connected to one secondary cache memory, a target cache line for the block store instruction is registered in one of the primary cache memories. In this case, when the data width of the cache line matches the data width of the store block, the processor instructs the primary cache memory to invalidate the cache line. Next, the processor overwrites the data in the secondary cache memory with the block store data, and the block store instruction is thereby completed.
However, when the data width of the target cache line for the block store instruction is greater than the data width of the store block, part of the cache line is not overwritten with the block store data. In such a case, the processor loads the data in the primary cache memory, stores this data in the secondary cache memory, and then stores the block store data in the secondary cache memory. Then the processor may, for example, store in the primary cache memory the data in the secondary cache memory in which the block store data has been stored.
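The two cases can be contrasted in a minimal sketch. The function name `block_store` and the byte-string representation of cache lines are assumptions made for illustration; the sketch models only the data movement between one primary cache memory and the secondary cache memory, not the actual control logic.

```python
def block_store(primary_line, block, offset=0):
    """Return (primary_line, secondary_line) after a block store.

    primary_line: bytes held in a primary cache memory for the target line.
    block: the store block; offset: its byte position inside the cache line.
    A returned primary_line of None means the primary copy was invalidated.
    """
    line_size = len(primary_line)
    if offset < 0 or offset + len(block) > line_size:
        raise ValueError("store block does not fit in the cache line")

    if len(block) == line_size:
        # Widths match: invalidate the primary copy, overwrite the secondary.
        return None, bytes(block)

    # The line is wider than the block: bytes outside the block must be kept.
    secondary_line = bytes(primary_line)              # move primary data to secondary
    secondary_line = (secondary_line[:offset] + bytes(block)
                      + secondary_line[offset + len(block):])  # merge the store block
    return secondary_line, secondary_line             # optionally refill the primary


# 64-byte line and 64-byte store block: simple overwrite.
p_same, s_same = block_store(b"\x11" * 64, b"\xff" * 64)

# 128-byte line and 64-byte store block written to the upper half: merge needed.
p_wide, s_wide = block_store(b"\x11" * 128, b"\xff" * 64, offset=64)
```

In the second call, the lower 64 bytes of the line keep their previous contents, which is why the processor in this model has to bring the primary cache data into the secondary cache memory before storing the block.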
As described above, the processor that executes the block store instruction may need to execute a process other than the process for the normal block store instruction when the size of the cache line (for example, 128 bytes) is greater than the size of the store block. This makes the design of, for example, the secondary cache memory difficult and may also result in performance deterioration.