1. Field of the Invention
The present invention relates to a multiprocessor system of a distributed shared memory structure having a plurality of nodes each comprising at least one processor and a main memory.
2. Description of the Related Art
To meet availability requirements, some information processing apparatus have a hot plug function that allows a faulty component to be replaced with a normal component without shutting down the apparatus when a failure occurs.
Some conventional devices, including processors and I/O units, can be repaired on-line by separating them from the active system during system operation, provided that backup devices can keep the system running and that the devices can be separated or connected relatively easily under the control of the operating system. Depending on the operating system, however, a fixed memory space is used in the main memory while the system is in operation. Because that fixed memory space cannot be separated, the main memory cannot be connected and disconnected as a hot plug main memory.
In order to increase main memory availability, there have been employed an apparatus having a duplex main memory configuration, which allows a backup main memory to be used in the event of a failure of an active main memory, and an apparatus with a multiplex system structure. However, these apparatus are expensive because of the redundancy that the duplex or multiplex scheme requires.
There has recently been proposed a multiprocessor system with a distributed shared memory configuration for increased performance and extendability. Such a multiprocessor system often has a processor and a main memory installed on one printed-circuit board. The physical mounting structure of the multiprocessor system limits the use of a hot plug FRU (Field Replaceable Unit), and the printed-circuit board serves as one replaceable unit. Therefore, when the processor on the printed-circuit board suffers a fault and is to be replaced with a normal processor by a hot plug function, the main memory on the same printed-circuit board also needs to be replaced. Accordingly, the main memory is required to have a hot plug function.
It is therefore an object of the present invention to provide a multiprocessor system of a distributed shared memory structure having a hot plug function for main memories.
To separate a node from a multiprocessor system of a distributed shared memory structure during operation of the multiprocessor system, the memory space of the main memory managed by the node to be separated is dynamically switched to the memory space of the main memory of a backup node without being recognized by software.
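The switching of a memory space from one node to another can be pictured with a minimal sketch. It assumes a table that maps each memory region to the node that currently manages it; the `AddressMap` class, the region numbers, and the node names are hypothetical and do not come from the specification, which describes a hardware mechanism.

```python
class AddressMap:
    """Hypothetical table mapping a memory region to the node that manages it."""

    def __init__(self, owner_by_region):
        self.owner_by_region = dict(owner_by_region)

    def node_for(self, region):
        # Software keeps using the same region number; only the table
        # decides which node actually serves it.
        return self.owner_by_region[region]

    def switch(self, old_node, new_node):
        # Reassign every region managed by old_node to new_node.
        for region in list(self.owner_by_region):
            if self.owner_by_region[region] == old_node:
                self.owner_by_region[region] = new_node
```

In this sketch, a caller that looks up region 1 before and after `switch("node1", "backup")` uses the same region number both times, which is the sense in which the switch is not recognized by software.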
If a memory write access request is issued from a processor, an IO unit, or a mover to the main memory of a master node, then the memory write access request is transferred to both the master node and a slave node when in a multicasting mode (a mode for transferring the memory write access request to a plurality of nodes), and is transferred only to the master node when not in the multicasting mode. Therefore, when in the multicasting mode, a memory write process carried out on the main memory of the master node is also effected on the main memory of the slave node.
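The routing rule described above can be sketched as follows. This is an illustrative software model only: the `Node` and `Router` classes and the `multicast` flag are hypothetical names, and serving reads from the master node alone is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node whose main memory is modeled as an address -> value map."""
    name: str
    memory: dict = field(default_factory=dict)


@dataclass
class Router:
    """Hypothetical request router; the multicasting mode is a plain flag."""
    master: Node
    slave: Node
    multicast: bool = False

    def write_targets(self):
        # In the multicasting mode a write goes to both nodes,
        # otherwise only to the master node.
        if self.multicast:
            return [self.master, self.slave]
        return [self.master]

    def write(self, address, value):
        for node in self.write_targets():
            node.memory[address] = value

    def read(self, address):
        # Assumption of this sketch: reads are served by the master node only.
        return self.master.memory.get(address)
```

With `multicast` set to `False`, a write changes only the master's memory; once it is set to `True`, the same call updates the master and the slave together.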
By copying data from the memory space of a node to be separated to the memory space of a backup node without a system shutdown, the node can be separated without stopping the operation of the system. Consequently, it is possible to repair a faulty area of the system without a system shutdown. As a result, the multiprocessor system can provide an information processing system of high availability.
A hardware-implemented mover function copies data from the memory space of the node to be separated to the memory space of the backup node, and any conflict that arises during the copying with a memory access request issued from the processor or the IO unit is resolved by hardware. When data is copied from the memory space of the node to be separated to the memory space of the backup node without a system shutdown, the memory space from which the data is copied can still be accessed, and the operating system does not recognize that the memory data is being copied. Therefore, the multiprocessor system can provide an information processing system of high availability regardless of the operating system.
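The following sketch illustrates why a copy that runs alongside ordinary writes can still leave the backup memory identical to the source memory. It is a deterministic software simulation, not the hardware mover: the function name `copy_with_multicast` and the `processor_writes` schedule are hypothetical, and applying each processor write to both memories stands in for the multicasting mode.

```python
def copy_with_multicast(master, slave, processor_writes):
    """Copy master -> slave one address at a time, interleaving writes.

    processor_writes maps a copy step index to an (address, value) write
    that arrives just before that step. Each such write is applied to both
    memories, which models the multicasting mode.
    """
    for step, address in enumerate(sorted(master)):
        if step in processor_writes:
            w_addr, w_val = processor_writes[step]
            master[w_addr] = w_val
            slave[w_addr] = w_val
        # Mover step: read from the master, write to the slave.
        slave[address] = master[address]
    return slave
```

A write to an address that has already been copied reaches the slave through the multicast path, and a write to an address not yet copied is picked up when the mover reaches it, so the two memories agree when the loop ends.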
Since all memory data is copied from the node to be separated to the memory space of the backup node upon separation of the node, the backup node is required only when copying the memory data, and a redundant node is not required to be available at all times. The cost of the multiprocessor system is relatively low as such a redundant node is not necessary during normal operation thereof.
The resources that are locked to ensure inseparable operation of a memory read access request and a memory write access request during copying of the memory data are not all memory areas, but only the memory addresses that need inseparable operation. Since memory addresses can be locked individually by a lock address buffer, conflicts with memory access requests from processors and IO units occur less frequently. In this manner, the memory data can be copied without reducing the system capabilities during operation of the system.
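Per-address locking of this kind can be sketched as below. The `LockAddressBuffer` class and the `processor_write` helper are hypothetical models written for illustration; the specification describes a hardware buffer, and the retry behavior of a blocked request is an assumption here.

```python
class LockAddressBuffer:
    """Hypothetical model of a buffer holding the currently locked addresses."""

    def __init__(self):
        self._locked = set()

    def try_lock(self, address):
        # Lock a single address; fail if it is already locked.
        if address in self._locked:
            return False
        self._locked.add(address)
        return True

    def unlock(self, address):
        self._locked.discard(address)

    def is_locked(self, address):
        return address in self._locked


def processor_write(buffer, memory, address, value):
    """Apply a write unless its address is locked; return True if applied."""
    if buffer.is_locked(address):
        return False  # assumption: the requester retries later
    memory[address] = value
    return True
```

Because only the address being copied is held in the buffer, a write to any other address proceeds immediately, which is the reason conflicts are rare compared with locking the whole memory area.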