1. Field of the Invention
This invention relates to data processing systems, and more particularly to real and virtual service machines that are operated automatically in response to operations of a real or virtual machine operating system.
2. Description of Related Art
U.S. Pat. No. 4,453,210 of Suzuki et al for a "Multiprocessor Information Processing System Having Fault Detection Function Based on Periodic Supervision of Updated Fault Supervising Codes" states as follows:
"A counter is provided for each of a plurality of processors for holding an associated fault supervising code. The code stored in the associated counter is periodically updated by the associated processor while the update status of the code is supervised on a cycle longer than the cycle of the updating period. If a fault occurs in one of the processors, the fault supervising code corresponding to that processor will not be updated. Thus, the faulty processor can be detected by periodically supervising the update status of the fault supervising code. The supervising operation can be carried out by software or hardware."
The Suzuki et al system also requires that a number of processors be connected to use a common bus, share a common load, and share a common memory.
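The supervision scheme quoted above can be illustrated with a minimal sketch. It is not taken from the Suzuki et al patent: the names `SupervisedCounter`, `update` and `supervise` are hypothetical, and the sketch assumes only what the quotation states, namely that each processor periodically updates its own code and that a supervisor, running on a longer cycle, flags any code that has not changed since its previous pass.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SupervisedCounter:
    """Fault supervising code held for one processor (hypothetical name)."""
    value: int = 0
    last_seen: int = -1  # value observed at the previous supervision pass

    def update(self) -> None:
        # Called periodically by a healthy processor.
        self.value += 1


def supervise(counters: Dict[str, SupervisedCounter]) -> List[str]:
    """Return the processors whose code did not change since the last pass."""
    faulty = []
    for name, counter in counters.items():
        if counter.value == counter.last_seen:
            faulty.append(name)
        counter.last_seen = counter.value
    return faulty


counters = {"P0": SupervisedCounter(), "P1": SupervisedCounter()}
first_pass = supervise(counters)    # baseline pass, nothing to compare against yet

# P0 updates its code twice within one supervision cycle; P1 has stopped.
counters["P0"].update()
counters["P0"].update()
second_pass = supervise(counters)   # only P1 is reported
```

Because the supervising cycle is longer than the updating cycle, a healthy processor always changes its code at least once between passes, so an unchanged code indicates a fault.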
A DISCONNECTED-but-running service machine (SERVER) in a CP (Central Processor) loaded with CMS (Conversational Monitor System, a VM-based software system of IBM) is an account on the CP system, identified by a USERID (the user identification of an account on the CP system), in a Virtual Machine (VM) Operating System (OS) environment. It normally operates unattended and disconnected from any terminal, in the background of the VM system, and responds to transactions, files or messages (interrupts) that have been sent to it. It will also ACTIVATE, i.e. "wake up", at certain specified dates and/or times and perform a set of predefined functions from a table of functions. A DISCONNECTED-but-running service machine does not need a terminal to operate while it is DISCONNECTED.
Service machines that are normally DISCONNECTED-but-running have been a recurring problem in two cases. In the first case, the machine has been "LOGGED OFF", that is to say "signed off" from the system, and is in an inactive, DISCONNECTED condition that requires a process of connecting back to the CP mode and then to the CMS mode, which leads into a generic system such as DOS or OS/2 or a competitive operating system. In the second case, the machine has stopped running but is still DISCONNECTED. The key problem is this second, DISCONNECTED but not running, condition. These service machines (SERVERS) share information in a network and prepare the information to be presented to users. If a service machine (SERVER) in a network goes down (not in the sense that the VM system ceases to function, but in the sense that the information network formed by the DSMs ceases some of its functions), the chain of information is broken and information will not get to users.
These interruptions are a critical flaw in any on-line system established exclusively for real-time data exchange, such as the Intersite Line Comparison (ILC) system described in U.S. patent application (FI9-91-037) Ser. No. 07/755,036, filed on Sep. 4, 1991 of Dauerer et al for "Database System for Intersite Line Comparison".
Restarting service machines (SERVERS) that have been improperly or accidentally stopped often depends on prompt notification that a machine is down. A SERVER that is down cannot give notice that it is down, so the administrator, which can be a machine (or alternatively, in another aspect of the invention, a human being), must depend on other means of finding out that such a condition exists. A SERVER that has stopped running but is DISCONNECTED presents greater difficulties than a SERVER that has merely been LOGGED OFF, since this condition limits the available options for determining the status of the SERVER.
Presently the ability exists to send queries to service machines (SERVERS) to determine their status. If the "message query" is sent from a central USERID, the query will report whether the SERVER is LOGGED OFF from the VM system, LOGGED ONTO the VM system, or DISCONNECTED from the VM system. There are several problems with this method, as follows:
1) The messages are sent in two directions and there may be problems with sending messages in the first direction, i.e. from the central USERID.
2) The query will not detect a condition in which a SERVER is DISCONNECTED and not running.
3) There is no central USERID that will automatically sort out the messages and report on only the down conditions.
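The second problem can be made concrete with a small model. This is an illustrative sketch only and not an actual CP or CMS interface: the function `query_status` and its parameters are hypothetical, and the mapping from session state to reply is an assumption based on the three replies named above. It shows that a reply limited to those three states is identical for a SERVER that is DISCONNECTED and running and for one that is DISCONNECTED and not running.

```python
from enum import Enum


class QueryStatus(Enum):
    LOGGED_OFF = "LOGGED OFF"
    LOGGED_ON = "LOGGED ON"
    DISCONNECTED = "DISCONNECTED"


def query_status(logged_on: bool, terminal_attached: bool, running: bool) -> QueryStatus:
    """Model of the reply to a message query (hypothetical function)."""
    # 'running' is deliberately ignored: the reply as described reports only
    # the session state, never whether the SERVER is still processing work.
    if not logged_on:
        return QueryStatus.LOGGED_OFF
    if terminal_attached:
        return QueryStatus.LOGGED_ON
    return QueryStatus.DISCONNECTED


healthy = query_status(logged_on=True, terminal_attached=False, running=True)
stalled = query_status(logged_on=True, terminal_attached=False, running=False)
```

Both `healthy` and `stalled` evaluate to `QueryStatus.DISCONNECTED`, so the central USERID cannot tell the two conditions apart from the reply alone.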
Another currently available method is to send a specific transaction, depending on the type of SERVER that is running. This solution has all the disadvantages of the first method and additional problems of its own. For example, transactions are slower than message queries, transactions must wait in the queue of the receiving SERVER, and a different transaction must be formulated for each type of SERVER.
An object of this invention is to detect not only the "LOGGED OFF" condition but also the "DISCONNECTED but not running" condition, and automatically to notify the appropriate users, whether they are operator stations for humans or machines, to take the appropriate action to restart the SERVERs.
Another object of this invention is to provide DSMs with independence.