The present invention relates to a control method and apparatus for switching between central processing units in an information processing system, and more particularly to a control method and apparatus capable of switching a session from one central processing unit to another without influencing the operation of terminal devices in an information processing system constituted by a network control system.
Recently, information processing systems constituted by a network control system have been proposed in which a communication environment can be set without stopping the whole information processing system, as disclosed, for example, in Japanese Patent Unexamined Publication No. 61-139859.
When additions to or changes in the communication environment occur, the technique disclosed in that publication stops only a communication managing program in the information processing system, so that the additions and changes can be handled without stopping the whole information processing system. Namely, the technique performs system formation partially, but the publication does not disclose switching from one central processing unit to another by a communication control processor which is connected to those central processing units in a network control system configuration.
As information processing system configurations have become larger and more complicated, it has become common for a group of central processing units to cope with newly arising demands for calculation and processing. An information processing system of this type, which has a network system configuration including a plurality of central processing units, a plurality of communication control processors, and a plurality of terminal devices, as mentioned above, must be operated efficiently while greatly reducing its power consumption (power saving operation). More specifically, during late-night hours and/or on holidays, when the load on the information processing system decreases, it is desirable to stop several of the plurality of central processing units and to restart the stopped central processing units when the load increases, in order to cope with the increased demand for processing operations. However, the service provided by the information processing system to the users who use it by means of terminal devices in a time sharing system (TSS) and/or on an on-line basis must not be degraded. More specifically, when users are using a first central processing unit in TSS, they must not be required to issue a command to log off temporarily from the first central processing unit (LOGOFF command) and a command to log on to a second central processing unit (LOGON command). No problem arises if the terminal users can receive the services provided by the second central processing unit in place of the first central processing unit without having to perform those operations. However, no such control system has been proposed in the past.
Recently, to improve the reliability of the system, information processing systems are constructed to have redundancies such as spare devices, spare central processing units and a control system which performs switching from main devices to these spare devices without causing erroneous operations, as is disclosed, for example, in Japanese Patent Unexamined Publication No. 62-17258.
The publication discloses a technique in which spare devices are prepared at all times in the information processing system. When a failure occurs in one device, a switching device switches information processing from the defective device to a spare one, and a testing device is allowed to diagnose the defective device. The defective device can be repaired while it is connected to the testing device. However, various problems arise in extending the devices to be switched to include central processing units. JP-A 62-17258 does not refer to a control system for switching a group of programs which had been running on the central processing unit in which a failure occurred to another central processing unit and re-executing them there. Nor does the publication disclose a control unit which detects the occurrence of failures.
When a failure occurs in a central processing unit in an on-line processing system, the generally well-known methods of restoring the central processing unit are as follows:
(a) Sharing a group of files between a spare central processing unit and the main central processing unit; and
(b) Collecting, in the defective central processing unit, histories of messages from the respective terminals belonging to the defective central processing unit, and sequentially processing the message histories after the defective central processing unit has been restored.
In method (a), the operator of the information processing system detects a failure in a central processing unit, and switches from the defective central processing unit to a spare one in accordance with the instructions of the operation manager. However, while the main central processing unit is operating normally, the spare one merely remains on standby, which is wasteful. In method (b), since the state before the failure occurred is restored on the basis of the message histories collected in the central processing unit, the contents of the files must also be returned to their initial state, and the users at the terminal devices must wait during that restoring operation.
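For illustration only, the restoring operation of method (b) can be sketched as follows. This is a minimal sketch under stated assumptions, not code from either cited publication: the class name `JournaledProcessor`, its methods, and the representation of files as a dictionary of numeric values are all hypothetical. It shows why the terminal users must wait: the files are first returned to their initial state, and every recorded message is then reprocessed in order.

```python
from copy import deepcopy


class JournaledProcessor:
    """Hypothetical stand-in for a central processing unit using method (b)."""

    def __init__(self, initial_files):
        # Keep a snapshot of the initial file contents for later rollback.
        self._initial_files = deepcopy(initial_files)
        self.files = deepcopy(initial_files)
        # Message history: (terminal_id, key, amount) in arrival order.
        self.history = []

    def handle(self, terminal_id, key, amount):
        # Record the message first so that it survives a later failure.
        self.history.append((terminal_id, key, amount))
        self._apply(key, amount)

    def _apply(self, key, amount):
        self.files[key] = self.files.get(key, 0) + amount

    def fail(self):
        # Simulate a failure that leaves the file contents unusable.
        self.files = None

    def restore(self):
        # Return the files to their initial state, then reprocess the
        # message history sequentially. Returns the number of messages
        # replayed, which is the work the terminal users must wait for.
        self.files = deepcopy(self._initial_files)
        for _terminal_id, key, amount in self.history:
            self._apply(key, amount)
        return len(self.history)
```

In this sketch the cost of `restore` grows with the length of the message history, which corresponds to the waiting time described above.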
The service provided by the information processing system to the users who use it through their terminal devices in a time sharing system or on an on-line basis must not be degraded. More specifically, when a failure occurs in any one of a plurality of central processing units, it is desirable to automatically switch the terminal users belonging to the defective central processing unit to a normal one without requiring the terminal users to perform any of the operations otherwise required to use the normal central processing unit. However, conventional control systems have not realized such a switching system.
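The desired behavior stated above can be illustrated with a small sketch. The text so far states only the goal, not a mechanism, so everything here is an assumption made for illustration: the class `CommunicationControlProcessor`, its session table, and the rule of moving each terminal to the least-loaded normal central processing unit are hypothetical and are not presented as the method of the invention. The point shown is that the terminal identifiers never change and no terminal issues a LOGOFF or LOGON command; only the mapping held on the network side is rewritten.

```python
class CommunicationControlProcessor:
    """Hypothetical session table mapping terminals to central processing units."""

    def __init__(self, cpu_ids):
        self.healthy = set(cpu_ids)
        self.sessions = {}  # terminal_id -> cpu_id

    def logon(self, terminal_id, cpu_id):
        if cpu_id not in self.healthy:
            raise ValueError(f"{cpu_id} is not available")
        self.sessions[terminal_id] = cpu_id

    def route(self, terminal_id):
        # Central processing unit currently serving this terminal.
        return self.sessions[terminal_id]

    def _load(self, cpu_id):
        # Number of sessions currently assigned to cpu_id.
        return sum(1 for c in self.sessions.values() if c == cpu_id)

    def switch_from(self, failed_cpu):
        # Move every session of failed_cpu to a normal unit without any
        # action by the terminals. Assumed policy: least-loaded unit,
        # ties broken by identifier order.
        self.healthy.discard(failed_cpu)
        if not self.healthy:
            raise RuntimeError("no normal central processing unit available")
        affected = sorted(t for t, c in self.sessions.items() if c == failed_cpu)
        for terminal_id in affected:
            target = min(sorted(self.healthy), key=self._load)
            self.sessions[terminal_id] = target
        return affected
```

For example, with terminals T1 and T2 on CPU1 and T3 on CPU2, calling `switch_from("CPU1")` in a three-unit configuration reassigns T1 and T2 while T3 is left untouched. The same call could equally model a planned stop of a central processing unit, although that use is likewise only an assumption of this sketch.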