The present invention relates to an information processing apparatus and in particular, to an information processing apparatus having a plurality of processors and a multi-hierarchical cache memory having cache coherency.
U.S. Pat. No. 5,943,684 discloses a data processing system having a first level cache and a second level cache.
Recently, information processing apparatuses have come to include many processors for higher performance. An information processing apparatus having a plurality of processors includes a level 1 cache in each of the processors and a level 2 cache arranged in a system controller (SC) and shared by the processors; that is, it has a cache memory with a more complicated hierarchical configuration.
However, in an information processing apparatus having a plurality of processors, when one of the processors has updated the content of an address, no copy of that address that has not been updated may remain in the caches of the other processors. If any such cached copy exists, it must be invalidated or updated. To assure such data coherence, the means detailed below have mainly been used.
Conventionally, data coherence between caches is assured as follows. When a processor issues a store instruction, store information is issued to the caches of the processors, either directly from the processor or from the system controller (SC), so as to invalidate or update the cache entry of the address not yet updated in each of the processors. According to this method, each time a store instruction is issued, store information must be issued to all the processors. Accordingly, as the number of processors increases, the large amount of store information issued by the processors raises the use ratio of the paths between the processors and the system controller.
To solve the aforementioned problem, there is also a method in which the controller SC is provided with a copy of the directory information of the level 1 cache in each of the processors. According to this method, since the controller SC has directories holding the address information of the caches of all the processors, when a store instruction is issued, the controller SC can issue an invalidation request only to the level 1 caches holding the data of that address, and thus the bus can be used effectively. However, in this method, the controller SC must have directories corresponding to all the processors. Accordingly, as the number of processors increases, the amount of logic increases significantly.
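The difference in path traffic between the two conventional methods above can be illustrated with a minimal sketch. The function names, the dictionary standing in for the directory copies, and the eight-processor state are all hypothetical and chosen only for illustration; they are not taken from any actual system controller.

```python
from typing import Dict, List, Set


def broadcast_targets(storing_cpu: int, num_cpus: int) -> List[int]:
    """Conventional method: store information goes to every other processor."""
    return [cpu for cpu in range(num_cpus) if cpu != storing_cpu]


def directory_targets(storing_cpu: int, address: int,
                      l1_directory_copies: Dict[int, Set[int]]) -> List[int]:
    """Directory-copy method: only processors whose level 1 directory copy
    in the SC contains the address receive an invalidation request."""
    return sorted(cpu for cpu, addresses in l1_directory_copies.items()
                  if cpu != storing_cpu and address in addresses)


# Hypothetical state: 8 processors; processors 0, 3 and 5 cache address 0x100.
copies: Dict[int, Set[int]] = {cpu: set() for cpu in range(8)}
for holder in (0, 3, 5):
    copies[holder].add(0x100)

broadcast = broadcast_targets(0, 8)              # seven messages per store
filtered = directory_targets(0, 0x100, copies)   # two messages per store
print(len(broadcast), len(filtered))
```

The sketch also makes the cost visible: `copies` holds one directory per processor, which corresponds to the amount of logic that grows with the number of processors.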
There is also a method in which, prior to issuing a store instruction, the cache entries of the corresponding address in all the other processors are invalidated and an exclusive fetch instruction is used so that the cache connected to the processor issuing the store instruction holds the data exclusively. In this method, the number of at least one processor having the data of each line (the data unit transferred from a cache to a memory) is recorded for each line in the level 2 cache directory in the SC. Prior to issuing a store instruction, an exclusive fetch instruction is issued, and an invalidation request is issued to the processors recorded in the directory in the SC, so that the data of that address is updated while it is not present in the caches of the other processors. In this method, an invalidation request need not be issued for each store, and accordingly the use ratio of the bus between the controller SC and the processors can be lowered. However, the system controller SC must hold the information on the cache contents of the respective processors. Accordingly, there is a restriction that the cache addresses of an upper hierarchy nearer to the processor must be completely included in the cache of a farther hierarchy. This restriction requires that, when a cache entry of a farther hierarchy is erased, all the corresponding cache entries in the nearer hierarchy also be erased. This causes a problem that the number of cache invalidations increases, which in turn increases the number of cache mismatches, that is, cases in which necessary data is absent from the cache.
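The inclusion restriction described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `InclusiveL2`, the least-recently-used replacement policy, and the two-line capacity are hypothetical and are not described in the text.

```python
from collections import OrderedDict
from typing import List, Set, Tuple


class InclusiveL2:
    """Level 2 directory that records, per line, the processors holding it."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # line address -> processor numbers recorded as holding the line;
        # insertion order doubles as LRU order (assumed replacement policy)
        self.lines: "OrderedDict[int, Set[int]]" = OrderedDict()

    def fetch(self, cpu: int, line: int) -> List[Tuple[int, int]]:
        """Record that `cpu` holds `line`. Return the (processor, line)
        invalidations that an eviction forces on the level 1 caches."""
        forced: List[Tuple[int, int]] = []
        if line not in self.lines:
            if len(self.lines) >= self.capacity:
                victim, holders = self.lines.popitem(last=False)
                # Inclusion: the victim line must also leave every level 1 cache.
                forced = [(holder, victim) for holder in sorted(holders)]
            self.lines[line] = set()
        self.lines.move_to_end(line)
        self.lines[line].add(cpu)
        return forced


l2 = InclusiveL2(capacity=2)
l2.fetch(0, 0xA)
l2.fetch(1, 0xA)
l2.fetch(0, 0xB)
forced = l2.fetch(1, 0xC)  # evicts line 0xA, still held by processors 0 and 1
print(forced)
```

Here processors 0 and 1 lose line 0xA from their level 1 caches only because the level 2 cache replaced it, which is the extra invalidation the passage refers to.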
As described above, an information processing apparatus having a plurality of processors and a memory hierarchy has various problems in maintaining cache coherence.
It is therefore an object of the present invention to provide an information processing apparatus in which a system controller operating a multi-hierarchical cache memory connected to a plurality of processors achieves a reduced path busy ratio with a small amount of logic in the controller SC and without increasing cache mismatches.
According to the present invention, the aforementioned object can be achieved by an information processing apparatus comprising:
a plurality of clusters each including one or a plurality of processors, each said processor having a level 1 cache;
a memory accessed by one of said plurality of clusters for data stored therein; and
a system controller connected to said memory and having a level 2 cache, said level 2 cache forming a hierarchical structure together with said level 1 cache, said system controller including:
a circuit for generating an invalidation request signal indicating whether or not a line of data accessed by one of said clusters is invalidated in said level 1 cache in response to an update to the line of data in said level 2 cache accessed from said one of said clusters and depending on the type of the access and a miss or a hit resulting from the access in said level 2 cache.
In one aspect of the present invention, there is provided an information processing apparatus including a plurality of processors each having a store-through type level 1 cache, a system controller having a store-in type level 2 cache shared by the plurality of processors, and a main storage device connected to the system controller, wherein the level 1 cache, the level 2 cache, and the main storage device constitute a storage device having a hierarchical configuration. The system controller has a data ownership table, which is a list of the processor numbers of the processors having the line data, and invalidation request issuing means which, when a store instruction is issued from a processor, issues a request for invalidating the level 1 cache entry of the address in each processor other than the processor that issued the store instruction, according to the contents of the data ownership table.
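One way to picture the data ownership table and the invalidation request issuing means is the following minimal sketch. It is not the claimed circuit. The class and method names are hypothetical, and two behaviours are assumptions made only for illustration: on a level 2 miss the sketch conservatively targets all other processors, and a store does not allocate the line in the store-through level 1 cache of the storing processor.

```python
from typing import Dict, List, Set


class SystemController:
    def __init__(self, num_cpus: int) -> None:
        self.num_cpus = num_cpus
        # data ownership table: line address -> processor numbers holding the line
        self.ownership: Dict[int, Set[int]] = {}

    def fetch(self, cpu: int, line: int) -> None:
        """A fetch brings the line into the level 1 cache of `cpu`."""
        self.ownership.setdefault(line, set()).add(cpu)

    def store(self, cpu: int, line: int) -> List[int]:
        """Return the processors whose level 1 copy must be invalidated."""
        if line in self.ownership:
            # Level 2 hit: the table names the holders, so only they are targeted.
            holders = self.ownership[line]
            targets = sorted(holders - {cpu})
            # The storing processor keeps its copy only if it already held one.
            self.ownership[line] = holders & {cpu}
        else:
            # Level 2 miss: ASSUMPTION - no ownership information is available,
            # so every other processor is targeted.
            targets = [c for c in range(self.num_cpus) if c != cpu]
            self.ownership[line] = set()
        return targets


sc = SystemController(num_cpus=4)
for reader in (0, 2, 3):
    sc.fetch(reader, 0x40)
print(sc.store(0, 0x40))  # hit: only the recorded holders other than processor 0
print(sc.store(1, 0x80))  # miss: all other processors, under the assumption above
```

The hit and miss branches correspond to the dependence of the invalidation request on the result of the level 2 access; the actual decision rule for each access type is defined by the apparatus, not by this sketch.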
Moreover, the aforementioned object can be achieved as follows. When the system controller is divided into a plurality of systems (interleaves) which can operate independently of one another and each system issues invalidation requests, the issued invalidation requests are put into a queue, so that invalidation requests having different destination processor numbers are merged and issued simultaneously.
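The merging of queued invalidation requests can be sketched as below. The request format (destination processor number, line address) and the rule of taking the oldest request per destination in each issue cycle are assumptions for illustration; the text states only that requests with different destination processor numbers are merged and issued simultaneously.

```python
from collections import deque
from typing import Deque, List, Set, Tuple

Request = Tuple[int, int]  # (destination processor number, line address)


def issue_cycle(queue: Deque[Request]) -> List[Request]:
    """Remove and return the oldest queued request for each distinct
    destination; requests to an already chosen destination stay queued."""
    issued: List[Request] = []
    chosen: Set[int] = set()
    remaining: Deque[Request] = deque()
    while queue:
        request = queue.popleft()
        if request[0] in chosen:
            remaining.append(request)
        else:
            chosen.add(request[0])
            issued.append(request)
    queue.extend(remaining)
    return issued


# Requests arriving from several interleaves, in arrival order.
pending: Deque[Request] = deque([(1, 0x10), (2, 0x20), (1, 0x30), (3, 0x40)])
first = issue_cycle(pending)   # one merged issue covering processors 1, 2 and 3
second = issue_cycle(pending)  # the second request for processor 1
print(first, second)
```

Four queued requests are issued in two cycles instead of four, since the three requests bound for different processors share one issue.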