1. Field of the Invention
The present invention relates to a shared-memory type multiprocessor system wherein a plurality of processors commonly use a single main memory device, and more particularly to a multiprocessor system wherein information representing a location of copied data within a main memory device is stored in a directory memory and data management is performed on the basis of this information.
2. Description of the Related Art
Recently, multiprocessors have been developed wherein a plurality of processors are operated in parallel to improve operation speed, reliability and expandability. In this type of system, a plurality of processors commonly use a main memory device, or the processors are coupled by a high-speed channel, etc. For example, in a tightly coupled multiprocessor, a single main memory device (hereinafter referred to as "shared memory") is commonly used by two or more processors, and the entire system is managed by a single operating system. On the other hand, in a loosely coupled multiprocessor, there is no commonly used memory, and the processors are respectively provided with exclusive memories (local memories).
When a large-scale shared-memory type multiprocessor system has several tens of processors, the speed of access to the shared memory and the bandwidth (bus width × bus speed) are important factors which determine the performance of the multiprocessor. For example, in the case of a bus-coupled multiprocessor having a single access path to a shared memory, a plurality of processors require use of the same bus. Thus, when the frequency of access to the shared memory is high, competition for the use of the bus occurs frequently and the time each processor waits to use the bus increases, resulting in lower performance. In general, the time for access to a shared memory is much longer than the processing time of a processor. Consequently, the performance of a high-speed processor cannot be fully exhibited.
A method for solving the above problem has been proposed, wherein a copy of part of the memory data stored in the shared memory is kept at a location allowing high-speed access by the processor (i.e. a location near the processor). For example, according to a generally adopted method, a copy of part of the memory data is stored in a local memory or a cache memory. In this method, a processor is provided with a local memory which it can access without using a global access path, and a copy of part of the data stored in the shared memory is held in the local memory. When a local memory is used, access is generally faster than access to a large-scale shared memory. In addition, since a global access path is not used, the problem of bandwidth is partially solved.
However, in a multiprocessor adopting the above method, since copies of data stored in the shared memory are present at a plurality of locations, coherency of the data in each cache or local memory must be maintained. Various methods for maintaining such coherency have been proposed. In one of these methods, a directory memory is provided along with the shared memory, and the directory memory stores information representing which of the processing elements have a copy of data stored in the shared memory.
FIG. 1 shows an example of the structure of a multiprocessing system adopting the above method. As is shown in FIG. 1, the multiprocessing system comprises eight processing elements 1 to 8, a shared memory 9 and a coupling network 10 for coupling the elements 1 to 8 and the shared memory 9.
The processing element 1 includes a CPU 11 for controlling the overall operation of the element 1, and a cache 21 which stores a copy of part of the data stored in the shared memory 9 and can be accessed at high speed by the CPU 11. The other processing elements 2 to 8 have the same structure, and include CPUs 12 to 18 and caches 22 to 28, respectively. The coupling network 10 may be of a bus type, a cross-bar switch type, or other general network types.
The shared memory 9 comprises a data memory 19, a directory memory 29, and a directory information controller 39. The data memory 19 stores various data items. The data memory 19 comprises a plurality of divided blocks, and a copy of data is stored in the caches of the processing elements in units of a block.
The directory memory 29 stores information representing which of the processing elements have a copy of the data of each block (i.e. "data block") stored in the data memory 19. Specifically, the directory memory 29 has the same number of entries as there are blocks in the data memory 19. FIG. 2 shows an example of an entry in the directory memory 29. In this example of a multiprocessor system, since eight processing elements 1 to 8 are used, one entry in the directory memory 29 comprises 8 bits. Each bit of the entry corresponds to one processing element. If one or more bits of the entry have value "1", this indicates that a copy of the data block in the data memory 19 corresponding to this entry is stored in the cache(s) of the processing element(s) associated with the bit(s) having value "1".
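The entry layout described above can be sketched as a bit vector. The following is a minimal illustration in Python, not the implementation of the system of FIG. 1. The names DirectoryMemory, record_copy, drop_copy and holders, and the convention that the lowest-order bit corresponds to processing element 1, are assumptions made only for this example.

```python
NUM_ELEMENTS = 8  # processing elements 1 to 8, as in FIG. 1


class DirectoryMemory:
    """One 8-bit entry per data block (hypothetical sketch)."""

    def __init__(self, num_blocks):
        # One integer per block; all bits 0 means no cache holds a copy.
        self.entries = [0] * num_blocks

    def _mask(self, element):
        # Assumed mapping: element 1 -> bit 0, ..., element 8 -> bit 7.
        if not 1 <= element <= NUM_ELEMENTS:
            raise ValueError("unknown processing element")
        return 1 << (element - 1)

    def record_copy(self, block, element):
        # Set the bit: the element's cache now holds a copy of the block.
        self.entries[block] |= self._mask(element)

    def drop_copy(self, block, element):
        # Clear the bit: the element's cache no longer holds a copy.
        self.entries[block] &= ~self._mask(element)

    def holders(self, block):
        # List the elements whose bit in the entry has value 1.
        entry = self.entries[block]
        return [e for e in range(1, NUM_ELEMENTS + 1)
                if entry & (1 << (e - 1))]
```

For instance, after record_copy(2, 1) and record_copy(2, 5), the entry for block 2 holds the bit pattern 00010001 and holders(2) lists elements 1 and 5.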
The directory memory 29 may include an attribute bit such as a modified bit indicating the fact that the entry has been modified.
In the multiprocessor system having the above structure, when an invalidating process for indicating that a copy of a certain data block is invalid is executed, the directory information controller 39 reads out the entry of the data block concerned from the directory memory 29. Thereby, the processing element in which a copy of the data block is present can be identified. A desired process can be performed for the identified processing element by sending a predetermined message.
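The lookup performed during an invalidating process can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the behavior of the directory information controller 39 as specified: the function name invalidate_block, the send_message callback, the bit-to-element mapping (lowest-order bit for element 1), and the clearing of the entry after the messages are sent are all hypothetical.

```python
NUM_ELEMENTS = 8  # processing elements 1 to 8, as in FIG. 1


def invalidate_block(directory, block, send_message):
    """Read the entry for `block`, notify each holder, clear the entry.

    `directory` is a list with one integer bit vector per data block.
    `send_message(element, block)` is a hypothetical callback standing in
    for the predetermined message sent over the coupling network.
    """
    entry = directory[block]  # read out the entry of the data block
    notified = []
    for element in range(1, NUM_ELEMENTS + 1):
        # A bit of value 1 identifies an element holding a copy.
        if entry & (1 << (element - 1)):
            send_message(element, block)
            notified.append(element)
    # Assumption: once invalidated, no element holds a valid copy.
    directory[block] = 0
    return notified
```

Only the elements identified by the entry receive a message, so no broadcast to all processing elements is needed.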
In the above system, however, a directory memory having a capacity proportional to the number of processing elements is required. Accordingly, in the case where the number of processing elements increases, the number of bits in each entry of the directory memory increases accordingly. As a result, a quantitative overhead (i.e. a memory capacity occupied by an operating system and a capacity of a file employed or a ratio thereof) increases.
For example, when 256 processing elements are provided, the capacity of the entry in the directory memory for one data block must be 256 bits = 32 bytes. In the case of a system wherein a data block in the data memory comprises 32 bytes, the capacity of the directory memory is therefore equal to that of the data memory.
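The arithmetic above can be checked with a short calculation. The Python sketch below assumes one directory bit per processing element and no attribute bits; the function name directory_overhead is hypothetical.

```python
def directory_overhead(num_elements, block_bytes):
    """Ratio of directory capacity to data capacity, per data block.

    Assumes one bit per processing element in each entry and no
    additional attribute bits.
    """
    entry_bytes = num_elements / 8  # 8 bits per byte
    return entry_bytes / block_bytes


# 256 elements, 32-byte blocks: the directory is as large as the data.
print(directory_overhead(256, 32))  # 1.0
# 8 elements, 32-byte blocks: one directory byte per 32 data bytes.
print(directory_overhead(8, 32))    # 0.03125
```

The ratio grows linearly with the number of processing elements for a fixed block size.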
In the conventional directory-type multiprocessor as described above, the directory information (bits) corresponding to the number of processing elements is required. Consequently, in a system having a great number of processing elements, the capacity of the directory memory and the quantitative overhead increase.