The present invention relates to a parallel computer system of shared memory type which is used for information processors, especially personal computers (PCs), workstations (WSs), server machines, etc., and more particularly to a control method for a main memory.
In recent years, the architecture of a multiprocessor of the shared memory type (SMP) has spread to host models of PCs and WSs, server machines, etc. This architecture, in which a large number of processors, for example 20 to 30, share a main memory, has become an important means of enhancing performance.
A shared bus scheme is extensively used as a method of constructing a shared memory multiprocessor. With the bus scheme, however, the throughput of the bus becomes a bottleneck, and hence the number of connectable processors is limited to about 8 at most. Accordingly, the bus scheme is not suitable as a method of connecting a large number of processors.
Conventional methods of constructing shared memory multiprocessors each having a large number of processors connected therein are broadly classified into two schemes.
One of them is the crossbar switch architecture, which is disclosed in, for example, "Evolved System Architecture" (Sun World, January 1996, pp. 29-32). With this scheme, boards, each of which has a processor and a main memory, are connected by a high speed crossbar switch so as to maintain the cache coherency among the processors. This scheme has the merit that the cache coherency can be rapidly maintained.
The scheme, however, has the demerit that, since a transaction for maintaining the cache coherency is broadcast to all of the processors, the traffic on the crossbar switch is very high and causes a bottleneck in performance. Another demerit is that, since the high speed switch is required, a high cost is incurred. Further, since the transaction for maintaining the cache coherency must be broadcast, it is difficult to realize a system having a very large number of processors, and the number of processors is limited to between ten and twenty.
In the ensuing description, this scheme shall be called the switch type SMP (Symmetrical MultiProcessor).
The other scheme provides a multiprocessor employing a directory based protocol, and it is disclosed in, for example, "The Stanford FLASH Multiprocessor" (The 21st Annual International Symposium on COMPUTER ARCHITECTURE, Apr. 18-21, 1994, Chicago, Ill., pp. 302-313). With this scheme, a directory, which is a bitmap indicating the processors whose caches hold the data line, is provided for every data line of the main memory, whereby a transaction for maintaining the cache coherency among the processors is sent only to the pertinent processors. Thus, the traffic on the switch can be noticeably reduced, and the hardware cost of the switch can be curtailed.
Since, however, the contents of the directory placed in the main memory must inevitably be checked when submitting the transaction for maintaining cache coherency, the scheme has the demerit that the access latency is lengthened. Further, the scheme has the demerit that the memory for holding the directory adds to the cost.
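The directory lookup and its memory cost can be illustrated with a minimal sketch. It is not the implementation of the cited FLASH system. The class name `Directory`, the helper `directory_bytes`, the 64-byte data line, the 16 processors, and the 4 GB main memory are all assumptions chosen for illustration. The sketch keeps one presence bitmap per data line, returns only the processors that actually cache a line when another processor writes it, and computes how much memory such bitmaps would occupy.

```python
# Illustrative sketch only: every name and size below is an assumption.
LINE_SIZE = 64  # bytes per data line (assumed)


class Directory:
    """One presence bitmap per data line; bit p set means processor p caches the line."""

    def __init__(self, num_processors):
        self.num_processors = num_processors
        self.sharers = {}  # line index -> presence bitmap

    def record_read(self, proc, addr):
        """Note that processor `proc` now holds the line containing `addr`."""
        line = addr // LINE_SIZE
        self.sharers[line] = self.sharers.get(line, 0) | (1 << proc)

    def invalidation_targets(self, writer, addr):
        """Return only the processors to invalidate when `writer` writes the line."""
        line = addr // LINE_SIZE
        bitmap = self.sharers.get(line, 0)
        targets = [p for p in range(self.num_processors)
                   if (bitmap >> p) & 1 and p != writer]
        self.sharers[line] = 1 << writer  # the writer becomes the sole holder
        return targets


def directory_bytes(memory_bytes, line_bytes, num_processors):
    """Memory needed for one bit per processor per data line."""
    return (memory_bytes // line_bytes) * num_processors // 8


directory = Directory(num_processors=16)
directory.record_read(1, 0x100)
directory.record_read(5, 0x120)   # falls in the same 64-byte line as 0x100
directory.record_read(3, 0x1000)  # a different line
targets = directory.invalidation_targets(writer=1, addr=0x100)
overhead = directory_bytes(4 * 2**30, LINE_SIZE, 16)
print(targets, overhead // 2**20, "MB")
```

In this example only processor 5 receives the invalidation, whereas a broadcast scheme would send it to all 15 other processors. Under the assumed sizes the bitmaps occupy 128 MB, which illustrates the additional memory cost mentioned above.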
As stated above, the switch type SMP and the directory based protocol each have merits and demerits. In general, with the switch type SMP, the hardware scale becomes larger and the scalability to an increased number of processors is inferior, but a higher performance can be achieved. Accordingly, a system such as a PC or a server machine in which the number of processors is not very large (up to about 30) is more advisably realized by using the switch type SMP.
Another problem involved in constructing a shared memory multiprocessor is the problem of reliability. Each of the shared memory multiprocessors in the prior art has a single OS (Operating System) as the whole system. This method can manage all the processors in the system with the single OS, and therefore has the advantage that a flexible system operation (such as load balancing) can be achieved. In the case of connecting a large number of processors by the shared-memory multiprocessor architecture, however, this method has the disadvantage that the reliability of the system degrades.
In a cluster system server wherein a plurality of processors are connected by a network, or in MPPs (Massively Parallel Processors), individual nodes have different OSs, so that even when a system crash occurs on one node because of, for example, an OS bug, the system goes down only at the corresponding node. In contrast, in the case of controlling the whole shared-memory multiprocessor system by the single OS, when a system crash occurs on a certain processor because of a system bug or the like, the OS itself goes down, and hence all the other processors are affected.
A method wherein a plurality of OSs are run in the shared-memory multiprocessor for the purpose of avoiding the above problem is disclosed in "Hive: Fault Containment for Shared-Memory Multiprocessors" (15th ACM Symposium on Operating Systems Principles, Dec. 3-6, 1995, Copper Mountain Resort, Colo., pp. 12-25).
With this method, the shared memory multiprocessor conforming to the directory based protocol is endowed with the following two facilities:
(1) The whole system is divided into a plurality of cells (partitions), and independent OSs are run in the respective partitions. The system has a single address space, and the respective OSs take charge of different address ranges.
(2) A bitmap which expresses write accessible processors is provided for every page of the main memory, and write access is allowed only for the processors each having a value of "1" in the bitmap.
4 GB/64 B × 16 bits = 128 MB
4 GB/4 KB × 16 = 16 MB
More specifically, in a case where data is to be written into the main memory of each processor (in a case where the data is to be cached in compliance with a "Fetch & Invalidate" request, or in a case where a "Write Back" request has arrived), the contents of the bitmap are checked, and only the access from the processor having the value of "1" in the bitmap is allowed.
Owing to the above facility (1), even when the OS of any partition has crashed, the other partitions can be kept from going down. Further, owing to the provision of the facility (2), a processor of a partition that has crashed due to a bug can be prevented from destroying data which the other partitions use.
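The write check of the facility (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the cited Hive system. The class `WriteProtection`, its methods, the request names, and the 4 KB page size are hypothetical. Letting requests other than "Fetch & Invalidate" and "Write Back" pass without a check is also an assumption, since the description above specifies only the check applied to write access.

```python
# Illustrative sketch only: every name and size below is an assumption.
PAGE_SIZE = 4096  # bytes per page (assumed)

# Requests that write into main memory, per the description above.
WRITE_REQUESTS = {"fetch_invalidate", "write_back"}


class WriteProtection:
    """One bitmap per page; bit p set to 1 means processor p may write the page."""

    def __init__(self):
        self.writers = {}  # page index -> write-permission bitmap

    def grant(self, page, procs):
        """Set the bitmap of `page` so that exactly `procs` may write it."""
        bitmap = 0
        for p in procs:
            bitmap |= 1 << p
        self.writers[page] = bitmap

    def is_allowed(self, proc, addr, request):
        """Check the bitmap for write requests; other requests pass (assumption)."""
        if request not in WRITE_REQUESTS:
            return True
        page = addr // PAGE_SIZE
        return bool((self.writers.get(page, 0) >> proc) & 1)


# Two hypothetical partitions: processors 0-3 own page 0, processors 4-7 own page 1.
protection = WriteProtection()
protection.grant(page=0, procs=range(0, 4))
protection.grant(page=1, procs=range(4, 8))

own_write = protection.is_allowed(2, 0x0800, "write_back")            # inside page 0
foreign_write = protection.is_allowed(5, 0x0800, "fetch_invalidate")  # inside page 0
print(own_write, foreign_write)
```

Here processor 5 belongs to the second partition, so its attempt to write a page of the first partition is rejected. This is how a processor of a crashed partition is kept from destroying data of the other partitions.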
As thus far explained, the reliability of the system can be sharply enhanced by dividing the interior of the shared memory multiprocessor into the plurality of partitions.