A compiling device, for example, performs a compilation process for converting a source program into a machine language program that is executable by an arithmetic processing device such as a processor and the like. The arithmetic processing device, for example, includes a decoding unit, an arithmetic unit, and a cache memory. The decoding unit decodes instructions included in the machine language program generated through the compilation process. The arithmetic unit performs arithmetic operations based on the decoded instructions. The cache memory is arranged between the arithmetic unit and a main memory that is a main storage device. The arithmetic unit, for example, performs arithmetic operations by referring to data stored in the main memory or the cache memory. The cache memory, for example, stores data that is referred to by the arithmetic unit.
The arithmetic processing device, for example, may shorten the waiting time for a data reference by referring to data stored in the cache memory rather than to data stored in the main memory. In a numerical calculation process that uses large-scale data such as an array, however, the hit rate of the cache memory decreases because data locality is low. In this case, the cache memory is not used effectively, so the reduction in the waiting time for data references is small.
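The relationship between data locality and the hit rate may be illustrated with a small simulation. The following is a minimal sketch, not part of the described device: it assumes a fully associative cache with least-recently-used replacement, and the line size, the capacity, and the two access traces are hypothetical values chosen only for illustration.

```python
from collections import OrderedDict

LINE_SIZE = 64    # bytes per cache line (assumed value)
NUM_LINES = 512   # cache capacity in lines (assumed value, 32 KiB in total)


def hit_rate(addresses, num_lines=NUM_LINES, line_size=LINE_SIZE):
    """Return the hit rate of a fully associative LRU cache for a byte-address trace."""
    cache = OrderedDict()  # keys are line numbers, ordered from oldest to newest use
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_size
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark the line as most recently used
        else:
            cache[line] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / total if total else 0.0


# High locality: an 8 KiB array of 8-byte elements, traversed 10 times.
small_reused = [8 * i for i in range(1024)] * 10

# Low locality: every access touches a new line, and the 4096-line
# working set is larger than the cache, so no line survives until reuse.
large_strided = [LINE_SIZE * i for i in range(4096)] * 2

print(hit_rate(small_reused))   # 0.9875
print(hit_rate(large_strided))  # 0.0
```

In this model the small, reused array misses only on the first touch of each line, whereas the large strided trace never hits, which corresponds to the case in which the cache memory is not used effectively.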
As a measure for remedying a decrease in the hit rate of the cache memory, for example, a prefetch is used to transfer data stored in the main memory to the cache memory in advance. Examples of a method for realizing the prefetch include a software prefetch realized by software and a hardware prefetch realized by hardware.
In the software prefetch, for example, the compiling device inserts into the machine language program an instruction (hereinafter also referred to as a prefetch instruction) that transfers data stored in the main memory to the cache memory in advance. In the hardware prefetch, hardware such as a hardware prefetch mechanism is disposed inside the arithmetic processing device. For example, when the hardware prefetch mechanism determines that contiguous memory regions are being accessed, it predicts the data to be accessed next and transfers that data from the main memory to the cache memory in advance.
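The two approaches may be contrasted with a toy model. The sketch below is an illustration under stated assumptions and does not describe any particular compiler or processor: `insert_prefetches`, `hardware_prefetch_lines`, the `distance` parameter, and the 64-byte line size are all hypothetical. The software side rewrites a list of loads so that a prefetch precedes the first load to each new line, and the hardware side observes the loads at run time and predicts the next line when two consecutive lines are accessed.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value)


def insert_prefetches(load_addrs, distance=2):
    """Software prefetch: emit ("prefetch", addr) before the first load to each new line.

    `distance` is the number of lines to run ahead (a hypothetical tuning knob).
    """
    program = []
    prev_line = None
    for addr in load_addrs:
        line = addr // LINE_SIZE
        if line != prev_line:
            program.append(("prefetch", (line + distance) * LINE_SIZE))
            prev_line = line
        program.append(("load", addr))
    return program


def hardware_prefetch_lines(load_addrs):
    """Hardware prefetch: when a load moves to the line directly after the
    previously accessed line, predict that line + 1 is needed next."""
    predicted = []
    prev_line = None
    for addr in load_addrs:
        line = addr // LINE_SIZE
        if prev_line is not None and line == prev_line + 1:
            predicted.append(line + 1)
        prev_line = line
    return predicted


# A sequential traversal of 32 eight-byte elements (lines 0 to 3).
loads = [8 * i for i in range(32)]

program = insert_prefetches(loads)
print([op for op in program if op[0] == "prefetch"])
# [('prefetch', 128), ('prefetch', 192), ('prefetch', 256), ('prefetch', 320)]
print(hardware_prefetch_lines(loads))  # [2, 3, 4]
```

The software variant costs extra instructions in the program, while the hardware variant costs none but only reacts to access patterns it can recognize; a trace that jumps between unrelated lines produces no hardware predictions in this model.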
In an arithmetic processing device that includes the hardware prefetch mechanism, for example, performance may be degraded even when the software prefetch is applied. For example, both a prefetch by the hardware prefetch mechanism and a prefetch by a prefetch instruction may be performed for data at the same address. That is to say, an unnecessary prefetch instruction may be inserted into the machine language program. In this case, performance may be degraded by, for example, an increase in the number of instructions, a decrease in scheduling efficiency, or a decrease in the transfer speed due to increased bus usage.
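The overlap described above may be made concrete by counting the prefetch instructions whose target line is also fetched by the hardware. This is a minimal sketch under assumed inputs: `redundant_prefetches` is a hypothetical helper, and the address and line lists are illustrative values rather than measurements.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value)


def redundant_prefetches(sw_prefetch_addrs, hw_prefetched_lines):
    """Return the software prefetch addresses whose cache line is also
    prefetched by the hardware prefetch mechanism."""
    hw_lines = set(hw_prefetched_lines)
    return [addr for addr in sw_prefetch_addrs if addr // LINE_SIZE in hw_lines]


# Illustrative inputs: prefetch instructions inserted by a compiler for
# lines 2 to 5, and lines 2 to 4 predicted by the hardware at run time.
sw_prefetch_addrs = [128, 192, 256, 320]
hw_prefetched_lines = [2, 3, 4]

redundant = redundant_prefetches(sw_prefetch_addrs, hw_prefetched_lines)
print(redundant)                                      # [128, 192, 256]
print(len(redundant), "of", len(sw_prefetch_addrs))   # 3 of 4
```

In this example three of the four inserted prefetch instructions duplicate work that the hardware performs anyway, so they add instructions and bus traffic without bringing any additional data into the cache memory.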
For this reason, a technology has been proposed that efficiently balances the hardware prefetch and the software prefetch to improve the performance of the arithmetic processing device. For example, this type of arithmetic processing device uses a memory access instruction to which indicative information is added, the indicative information indicating whether the memory access instruction is a target of the hardware prefetch. In the compilation process, for example, a memory access instruction with the indicative information added is generated when memory access instructions that access contiguous memory regions are detected.
To support such instructions, the arithmetic processing device includes, for example, a decoding unit capable of decoding a memory access instruction with the indicative information added and an arithmetic unit capable of executing that instruction. Furthermore, the arithmetic processing device includes a hardware prefetch mechanism that complies with the memory access instruction with the indicative information added. For example, the hardware prefetch is suppressed when the indicative information indicates that the memory access instruction is not a target of the hardware prefetch.
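The effect of the indicative information may be sketched as a flag carried by each memory access. The model below is an assumption made for illustration only: the `MemAccess` type, its `hw_prefetch_target` field, and the sequential prediction rule are hypothetical and do not reflect the encoding used by any actual instruction set.

```python
from dataclasses import dataclass

LINE_SIZE = 64  # bytes per cache line (assumed value)


@dataclass(frozen=True)
class MemAccess:
    addr: int
    hw_prefetch_target: bool  # stands in for the indicative information


def hardware_prefetch_lines(accesses):
    """Predict line + 1 on a move to the next consecutive line, unless the
    access is marked as not being a target of the hardware prefetch."""
    predicted = []
    prev_line = None
    for acc in accesses:
        line = acc.addr // LINE_SIZE
        if acc.hw_prefetch_target and prev_line is not None and line == prev_line + 1:
            predicted.append(line + 1)
        prev_line = line
    return predicted


# The same sequential accesses (lines 0 to 3), first marked as targets of
# the hardware prefetch and then marked as non-targets.
enabled = [MemAccess(LINE_SIZE * i, True) for i in range(4)]
suppressed = [MemAccess(LINE_SIZE * i, False) for i in range(4)]

print(hardware_prefetch_lines(enabled))     # [2, 3, 4]
print(hardware_prefetch_lines(suppressed))  # []
```

With the flag cleared, the modeled mechanism issues no prefetch, which leaves those addresses to be covered by prefetch instructions alone. The sketch also shows why the approach depends on the hardware: a mechanism that ignores the flag would behave as in the first case regardless of how the instructions were generated.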
Related techniques are disclosed in, for example, Japanese Laid-open Patent Publication No. 2009-230374, Japanese National Publication of International Patent Application No. 2011-504274, Japanese Laid-open Patent Publication No. 2010-244204, Japanese Laid-open Patent Publication No. 2006-330813, Japanese Laid-open Patent Publication No. 2011-81836, Japanese Laid-open Patent Publication No. 2002-297379, and Japanese Laid-open Patent Publication No. 2001-166989.
When a software prefetch is applied to an arithmetic processing device that includes a hardware prefetch mechanism, for example, an unnecessary prefetch instruction may be inserted into the machine language program. This may degrade the performance of the arithmetic processing device. The compiling method that generates a memory access instruction with indicative information added, the indicative information indicating whether the memory access instruction is a target of the hardware prefetch, is less versatile. For example, with this compiling method, the performance of the arithmetic processing device may be degraded when the decoding unit, the arithmetic unit, and the hardware prefetch mechanism do not comply with the memory access instruction with the indicative information added.