1. Field of the Invention
The present invention relates to various types of computers such as a personal computer (PC), a workstation (WS), a server machine, an office computer, a minicomputer, and a mainframe, and more particularly to an information processing apparatus that transfers data via a crossbar switch in a multiprocessor configuration.
2. Description of the Related Art
A tightly coupled multiprocessor configuration that shares a main storage is prevalent among server machines and high-end PCs and WSs. Improving the performance and function of the data transfer system that connects a plurality of CPUs, a main storage, and a plurality of I/O devices is becoming an important issue. A configuration using crossbar switch connection is one such data transfer system configuration. In a tightly coupled multiprocessor system, if even one of the CPUs becomes defective, the whole system goes down. In order to improve the reliability of the whole system, the whole system is multiplexed by using a hot standby configuration or the like. Multiplexing of the whole system generally uses a method by which a plurality of systems are prepared and used as active and standby partitions. For a configuration using crossbar switch connection, a method is known by which the connection of the crossbar switch is logically divided into a plurality of groups, each group running as an independent system, to provide both the active and standby partitions in a single system. In either of the above methods, information necessary for exchange between the active and standby partitions is stored in a non-volatile external storage device such as a hard disk.
The method by which a plurality of systems are prepared and used as active and standby partitions is described, for example, in JP-A-7-60399. The method by which the connection of a crossbar switch is logically divided into a plurality of groups, each group running as an independent system, to provide both the active and standby partitions in a single system is described, for example, in "Technical White Paper: The Ultra Enterprise 10000 Server"; Sun Microsystems, Inc.; 1997 (appearing on the home page of Sun Microsystems, Inc. in the USA: http://www.sun.com/). With the above-described conventional method by which the connection of a crossbar switch is logically divided into a plurality of groups, each group running as an independent system, it is necessary to reboot the whole system in order for an individual system to change settings of the division configuration.
For a so-called massively parallel type multiprocessor system, a method of improving the system reliability against a CPU failure is provided by which a defective CPU is logically disconnected from the CPUs of a processor array and the system is dynamically reconfigured. Techniques regarding this are disclosed, for example, in U.S. Pat. No. 5,129,077.
With the conventional techniques for the above-described massively parallel type multiprocessor system, a defective CPU is logically disconnected and the system is dynamically reconfigured. This method presupposes that each CPU constituting the massively parallel type multiprocessor system is provided with an input/output interface compatible with the function described above. There arises therefore a problem that these techniques cannot be applied to server machines and high-end PCs and WSs, which use commercially available CPUs not compatible with such a function.
Conventional techniques used for server machines and high-end PCs and WSs multiplex the whole system in order to improve the system reliability. For example, if the system is duplicated by the method in which a plurality of systems are prepared and used as active and standby partitions, the cost is at least doubled. Also with the method by which the connection of a crossbar switch is logically divided into a plurality of groups and a single system is provided with both the active and standby partitions, it is necessary to reboot the whole system in order for an individual system to change settings of the group division configuration. Therefore, in order to avoid a reboot during ordinary operation of the system, the system is required to exchange the active partition with a standby partition without changing the group division configuration. It is therefore necessary for the standby partition to be provided in advance with all the system resources it needs, separately from the system resources of the active partition. Namely, the standby partition is required to have additional important system resources, such as CPUs and a main storage, of the same scale as those of the active partition. There arises therefore a problem that although the frame, power source and the like can be shared, the cost of the important system resources such as CPUs and a main storage is doubled, so that the cost of the whole system becomes very high.
With the above-described conventional techniques by which the connection of a crossbar switch is logically divided into a plurality of groups, the system cannot change the group division configuration during the ordinary operation of the system. Therefore, if the system is to be provided with auxiliary system resources, each group is required to independently have the auxiliary system resources. There arises therefore a problem that the cost of the auxiliary system resources becomes high.
It is an object of the present invention to suppress an increase in the cost of an information processing apparatus having a crossbar switch configuration, such as a server or a high-end PC or WS, by allowing each system to change the division configuration of groups without rebooting the whole system, so that in a hot standby system the system resources used by an active partition are included in a standby partition when the active partition is exchanged with the standby partition; and to improve system reliability to a level equal to that of multiplexing, i.e., to a level at which an active partition in which a fault occurs can be exchanged with a standby partition of a scale equal to that of the active partition, while suppressing an increase in cost.
It is another object of the present invention to shorten the exchange time required for each system to change from an active partition to a standby partition in a hot standby system of an information processing apparatus having a crossbar switch configuration.
It is another object of the present invention to provide a system having a plurality of groups with standby system resources capable of being included in an arbitrary group.
It is another object of the present invention to provide a multiprocessor system with a crossbar switch connection capable of changing the group division configuration during an operation of the system without rebooting the whole system.
In order to achieve the above objects of the invention, in an information processing apparatus with a crossbar switch connection, when the connection of the crossbar switch is logically divided into a plurality of groups, the apparatus changes the group division configuration without affecting the logical operation of the apparatus other than the operation of the crossbar switch. Namely, the logical division is set in registers in the LSI constituting the crossbar switch, and the apparatus interrupts all transfers through busy control or the like while the whole crossbar switch is in a sync state, thereby making the operation other than that of the crossbar switch stand by, and changes the setting in the registers of the LSI during the interruption.
More specifically, according to the present invention, the apparatus has two sets of registers for setting the group configuration of logical division; the setting values of one of the two sets are always used, and the values of the other set are ignored. The apparatus also has a change instruction register for instructing a change in the group configuration of logical division, and the apparatus changes the group configuration of logical division by selecting the setting values in one or the other of the two sets of registers, in the manner described below.
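The two register sets described above can be pictured as a double-buffered configuration. The following is a minimal software sketch, not the LSI implementation; the class name `GroupConfigRegisters`, its methods, and the choice of one group ID per port are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GroupConfigRegisters:
    """Two banks of per-port group IDs; only the selected bank is in effect.

    Hypothetical model: the source does not specify the register layout.
    """
    num_ports: int
    banks: List[List[int]] = field(default_factory=list)
    active: int = 0  # index of the bank whose values are currently used

    def __post_init__(self) -> None:
        if not self.banks:
            # Both banks start with every port in group 0.
            self.banks = [[0] * self.num_ports for _ in range(2)]

    def write_inactive(self, port: int, group: int) -> None:
        """Stage a new group ID in the bank that is currently ignored."""
        self.banks[1 - self.active][port] = group

    def group_of(self, port: int) -> int:
        """Group ID in effect for a port (read from the active bank only)."""
        return self.banks[self.active][port]

    def may_transfer(self, src: int, dst: int) -> bool:
        """Transfers are allowed only between ports of the same group."""
        return self.group_of(src) == self.group_of(dst)

    def select_other_bank(self) -> None:
        """Effect of the change instruction: swap which bank is used."""
        self.active = 1 - self.active
```

Because new values are written only to the ignored set, staging a new division has no effect on transfers until the selection is switched.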
First, when data is written in the change instruction register, the crossbar switch notifies each port of a busy state, thereby instructing each port to interrupt its transfer and stand by. Each port therefore judges that the crossbar switch is busy and interrupts the transfer. The crossbar switch completes all the transfers in progress and synchronizes the whole crossbar switch. After this sync operation is completed, the crossbar switch switches the selection from one of the two sets of registers to the other. In this manner, the division configuration of groups is changed. After the selection is switched, the crossbar switch releases the busy state, thereby allowing each port to resume its transfer. Each port therefore judges that the crossbar switch is not busy, and resumes its transfer.
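The sequence above (busy notification, completion of transfers in progress, selection of the other register set, release of the busy state) can be sketched as a toy software model. This is an illustration under stated assumptions only: the class `Crossbar`, its fields, and the per-port group ID representation are hypothetical, and the sequence runs synchronously here, whereas the actual crossbar switch performs it in hardware.

```python
from typing import List, Tuple


class Crossbar:
    """Toy model of the change sequence: busy -> drain -> select -> release."""

    def __init__(self, num_ports: int) -> None:
        self.busy = False
        self.in_flight: List[Tuple[int, int]] = []  # accepted, not yet finished
        self.banks = [[0] * num_ports, [0] * num_ports]  # two register sets
        self.active = 0                                  # set currently in use
        self.log: List[str] = []

    def request_transfer(self, src: int, dst: int) -> bool:
        """A port may start a transfer only when the crossbar is not busy."""
        if self.busy:
            return False  # the port stands by and would retry later
        if self.banks[self.active][src] != self.banks[self.active][dst]:
            return False  # the ports belong to different groups
        self.in_flight.append((src, dst))
        return True

    def write_change_instruction(self) -> None:
        """Model a write to the change instruction register."""
        self.busy = True                 # 1. notify every port of busy
        self.log.append("busy")
        while self.in_flight:            # 2. finish transfers in progress
            self.in_flight.pop(0)
        self.log.append("synced")
        self.active = 1 - self.active    # 3. select the other register set
        self.log.append("switched")
        self.busy = False                # 4. release busy; ports may resume
        self.log.append("released")
```

In this model a transfer between ports 0 and 3 that was legal before the change is refused afterwards if the newly selected set places them in different groups, while no transfer is ever cut off midway.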
The above procedure of the invention is applied to a hot standby system having active and standby partitions. When an active partition is exchanged with a standby partition because of a fault in the active partition, the system includes system resources such as CPUs and memories in the standby partition. More specifically, when the exchange takes place, the system resets the defective partition and, while remaining in ordinary operation, changes the division configuration of groups so as to combine the active and standby partitions into one group and include the reset system resources, such as the CPUs and memories used by the active partition, in the standby partition. The scale of the standby partition can therefore be expanded to the scale necessary for its operation.
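The regrouping performed at the time of exchange can be sketched as a pure function over a table of partitions. This is a minimal illustration, not the disclosed implementation: the function name `fail_over`, the dictionary representation, and the resource names are all hypothetical, and the resources of the failed partition are assumed to have been reset before they are merged.

```python
from typing import Dict, List


def fail_over(groups: Dict[str, List[str]], failed: str,
              standby: str) -> Dict[str, List[str]]:
    """Return a new grouping in which the standby partition absorbs the
    (already reset) resources of the failed partition.

    The input mapping is not modified, mirroring the idea that the new
    division is prepared separately and then selected.
    """
    if failed == standby:
        raise ValueError("failed and standby partitions must differ")
    # Copy every partition except the failed one.
    new_groups = {name: list(res) for name, res in groups.items()
                  if name != failed}
    # Merge the failed partition's resources into the standby partition.
    new_groups[standby] = new_groups[standby] + list(groups[failed])
    return new_groups
```

For example, with `{"active": ["cpu0", "cpu1", "mem0"], "standby": ["cpu2", "mem1"]}`, the returned grouping has a single partition that holds all five resources, so the standby partition needs only a small scale until the exchange occurs.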
Further, according to the invention, a hot standby system having active and standby partitions has a main storage shared by the active and standby partitions, and information necessary for exchange between the partitions is stored in the main storage.
Still further, according to the invention, since the system can change the division configuration of groups while the system is in ordinary operation, the system can be provided with standby system resources not belonging to any group, and can include the standby system resources in an arbitrary group by changing the division configuration of groups when necessary.
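Including a standby resource in an arbitrary group can be sketched as preparing a new division in which a previously unassigned port receives a group ID. This is only an illustration under assumptions: the function `include_spare`, the per-port list, and the use of `None` to mark a resource that belongs to no group are hypothetical and do not come from the disclosure.

```python
from typing import List, Optional


def include_spare(port_groups: List[Optional[int]], port: int,
                  group: int) -> List[Optional[int]]:
    """Return a new division in which a spare port joins the given group.

    A value of None marks a standby resource that belongs to no group.
    The current division is left untouched; the returned list stands for
    the values that would be staged before the division is changed.
    """
    if port_groups[port] is not None:
        raise ValueError(f"port {port} already belongs to a group")
    staged = list(port_groups)
    staged[port] = group
    return staged
```

With a current division of `[0, 0, 1, None]`, calling `include_spare(current, port=3, group=1)` yields `[0, 0, 1, 1]`, so a single spare can serve whichever group needs it instead of each group holding its own.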