An arithmetic processing unit or device (a central processing unit (CPU) chip or a microprocessor, hereinafter simply referred to as a processor) includes caches at a plurality of levels in order to improve memory access performance. The processor includes a plurality of processing units or circuits (CPU cores, hereinafter referred to as cores), and each core uses a cache at a first level (L1 cache) exclusively as a private cache. The processor further includes an upper level cache shared by the plurality of cores.
In addition, the processor includes one or more caches at the level closest to a main memory (last level cache, hereinafter referred to as an LLC). In some cases, the processor further includes, outside the LLC, coherency control units or circuits that maintain coherency between the caches.
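As an illustration of the hierarchy described above, the following is a minimal sketch and not an implementation taken from any actual processor. The names `Core`, `Processor`, and `lookup` are hypothetical, caches are modeled as plain sets of addresses, and the fill policy (every level is filled on a miss) is an assumption. Eviction and coherency control are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    # Private L1 cache, modeled as the set of addresses it currently holds.
    l1: set = field(default_factory=set)


@dataclass
class Processor:
    cores: list
    # Shared last level cache (LLC), also modeled as a set of addresses.
    llc: set = field(default_factory=set)

    def lookup(self, core_id: int, address: int) -> str:
        """Return which level serves the access: 'L1', 'LLC', or 'memory'."""
        core = self.cores[core_id]
        if address in core.l1:
            return "L1"
        if address in self.llc:
            served = "LLC"
        else:
            served = "memory"
            self.llc.add(address)  # assumed policy: fill the LLC on a miss
        core.l1.add(address)       # assumed policy: fill the private L1 as well
        return served
```

In this sketch, a first access by one core is served from memory, a repeated access by the same core is served from its L1 cache, and an access to the same address by a different core is served from the shared LLC.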
Each of the LLC and the coherency control unit includes a control pipeline circuit into which a request is entered, and a request process unit or circuit that executes a process corresponding to the request. The request process unit includes a miss access control unit that handles access to a memory when the entered request results in a cache miss.
The control pipeline circuit, on the other hand, performs the following determinations for an entered request: a tag determination on whether or not the address of the request results in a cache hit; a determination on whether or not the address of the request conflicts with the address of a request that is being processed; a determination regarding the content of the process for the request; and a resource determination on whether or not circuit resources of the request process unit can be acquired. Based on these determinations, the control pipeline circuit requests the miss access control unit to perform a memory request process for an appropriate request. Consequently, the memory request process is requested from the miss access control unit only after the determination processes in the control pipeline circuit have completed, and hence there are cases where the start of the memory request process is delayed and latency increases.
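The determinations made by the control pipeline circuit can be sketched in software as follows. This is only an illustrative model under stated assumptions: the names `Request`, `ControlPipeline`, and `decide`, the result strings, the order in which the checks are applied, and the rule that only "read" requests need a memory request are all hypothetical. A real circuit would typically evaluate these conditions in pipeline stages rather than as sequential code.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    address: int
    kind: str = "read"  # hypothetical request type


@dataclass
class ControlPipeline:
    tags: set = field(default_factory=set)       # addresses held in the cache
    in_flight: set = field(default_factory=set)  # addresses of requests being processed
    free_entries: int = 2                        # free miss access control resources (assumed count)

    def decide(self, req: Request) -> str:
        """Return 'hit', 'abort', 'no_memory_request', or 'memory_request'."""
        # Tag determination: does the address hit in the cache?
        if req.address in self.tags:
            return "hit"
        # Address conflict determination against requests being processed.
        if req.address in self.in_flight:
            return "abort"
        # Determination of the process content (assumed: only reads fetch from memory).
        if req.kind != "read":
            return "no_memory_request"
        # Resource determination: can a miss access control resource be acquired?
        if self.free_entries == 0:
            return "abort"
        self.free_entries -= 1
        self.in_flight.add(req.address)
        # Only at this point is the miss access control unit asked to access memory.
        return "memory_request"
```

The point of the sketch is that `"memory_request"` is returned only after every preceding check has passed, which corresponds to the delay in starting the memory request process described above.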
To cope with this, it is conceivable to reduce the latency by performing a speculative memory request when a request is entered into the control pipeline circuit, before the tag determination is performed. Such a speculative memory request is disclosed in Japanese Laid-open Patent Publication No. 2006-53857.
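The latency benefit of a speculative memory request can be illustrated with a simple cycle-count model. This is a rough sketch and not the method of the cited publication. The function name `miss_latency` and the cycle counts in the usage note are hypothetical, and the model assumes that the speculative request is issued in the same cycle the request enters the control pipeline circuit and overlaps fully with the determination processes.

```python
def miss_latency(pipeline_cycles: int, memory_cycles: int, speculative: bool) -> int:
    """Cycles from request entry until the missed data is available.

    Assumed model: without speculation, the memory access starts only after
    the control pipeline determinations finish; with speculation, it starts
    at request entry and runs in parallel with those determinations.
    """
    if speculative:
        # The data is usable once both the pipeline and the memory access are done.
        return max(pipeline_cycles, memory_cycles)
    # Memory access is serialized behind the pipeline determinations.
    return pipeline_cycles + memory_cycles
```

For example, with an assumed 4-cycle pipeline and a 100-cycle memory access, `miss_latency(4, 100, speculative=False)` evaluates to 104 and `miss_latency(4, 100, speculative=True)` evaluates to 100, so under this model the saving equals the pipeline depth.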