An information processing apparatus, such as a server, that uses NUMA (Non-Uniform Memory Access) technology has conventionally been known. In an information processing apparatus using NUMA technology, for example, a main memory serving as main storage is connected to each of a plurality of central processing units (hereinafter abbreviated to CPU (Central Processing Unit)), and each main memory is shared by the plurality of CPUs.
In an information processing apparatus using this NUMA technology, each CPU controls cache coherence so as to maintain consistency between the cache memories built into the respective CPUs, for example, in a directory-based system. In the directory-based system, each CPU manages, for each data block on its local memory, i.e., the main memory connected to itself, a directory state indicating the location of that data block.
For example, the directory states indicating the states of a directory include INV (Invalid), SH (Shared), and EX (Exclusive). Here, INV indicates that the data block is not held in the cache memory of any other CPU; SH indicates that it is held, in a clean state, in the cache memories of other CPUs; and EX indicates that it is held in the cache memory of a single CPU and may be dirty.
If a cache miss occurs in a certain CPU, that CPU requests the data from the CPU connected to the main memory that owns the data on which the cache miss has occurred. In the following description, the CPU that requests the data is referred to as the L (Local)-CPU, the CPU connected to the main memory owning the data requested due to the occurrence of the cache miss is referred to as the H (Home)-CPU, and a main memory is simply referred to as “memory.”
If the data of the address requested by the L-CPU is not held in the cache memory of any other processor, the H-CPU reads the data from the memory connected to itself and transfers it to the requester. Concurrently with this process, the H-CPU stores, in the memory connected to itself, a directory state indicating that the cache line of this address is held in the L-CPU.
In a cache protocol that manages only the above directory states, if an invalidation process occurs for data taken out in the shared state, the H-CPU broadcasts an invalidation request to the cache memories of all the CPUs in the system. In such a case, unnecessary data communication occurs, and accordingly the amount of communication increases.
For this reason, the H-CPU may manage, in addition to the directory state, presence bits indicating which other CPUs hold, in their cache memories, data taken out in the shared state, in other words, data shared with those CPUs. If an invalidation process occurs for a certain data block, the H-CPU uses the presence bits to issue an invalidation request only to the CPUs holding the shared data in their cache memories.
Patent Document 1: Japanese Laid-open Patent Publication No. 2002-304328
However, the above known technology has a problem in that the amount of directory information increases with an increase in the number of central processing units.
If the number of CPU nodes is increased to, for example, 128 to meet recent performance requirements for central processing units, one presence bit is used per CPU node; accordingly, the presence bits amount to 128 bits per directory entry, and the amount of directory information increases.