The present invention relates to methods and apparatuses for controlling the operation of a digital processing system. More specifically, the present invention relates to methods for monitoring the status of the operation of software on a digital processing system.
Conventional digital processing systems, such as modern computer systems, are often capable of executing software programs without the aid of, or monitoring by, a human user. Often, several software programs may be run concurrently on the same computer system. For example, a computer system which is acting as a web server, or some other type of server in a computer network, may operate without interaction with or guidance from a local human user and may execute several software programs at the same time. Even though such systems generally work well without local user intervention, computer systems sometimes fail in one way or another. For example, a particular software program may have an internal failure, causing that program to stop functioning correctly on the computer system while other programs continue to operate properly. Alternatively, a more fundamental problem may develop in which the entire computer system "crashes" to the point that the computer is hung in a state in which it does not respond to any user interaction.
The prior approaches to solving these problems usually involve relaunching, through user interaction, the program which is at fault, or restarting, through user interaction, the entire computer system, usually by turning the power off and then on again. Both of these techniques require a human user to provide instructions or otherwise operate the computer in such a manner as to either relaunch the application or restart the entire computer system. Another approach in the prior art has allowed the user to remotely access a stalled computer system and restart it upon discovering remotely, for example, that the computer system has crashed or is otherwise not responding. An example of this particular approach is described in U.S. Pat. No. 5,347,167 by Amar Singh of Sophisticated Circuits, Inc., of Bothell, Wash. A remotely located user may determine that a computer system has crashed and may, through the use of a telephone interface, cause the remotely located computer system to restart. This particular approach, however, requires user monitoring of the status of the remotely located computer and also requires user interaction once it is discovered that the computer system has failed. Further, this system does not provide monitoring on a program-by-program basis and does not allow the user to specify different actions in response to the failure of one program versus another program.
In view of the foregoing, it is desirable to provide an improved method and apparatus for controlling the operation of a digital processing system such as a computer system.
The present invention discloses methods and apparatuses for controlling the operation of a digital processing system. A method in one example of the invention receives a first status indicator for a first software program which is executing on the digital processing system. It is then determined whether the first software program is in a first state; typically, this is done by examining the information provided by the first status indicator. In response to determining that the first software program is not in the first state, a first predetermined function is performed.
In one particular embodiment of the present invention, several additional status indicators may be received, one for each of several software programs which are executing on the system. For each additional status indicator, it is determined whether the corresponding software program is in the first state, and if it is not in the first state, then a corresponding, predetermined function is performed, such as (for example) relaunching the corresponding software.
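The per-program check described above can be illustrated with a minimal Python sketch. It is only one possible reading of the summary, not the claimed implementation: the state label `FIRST_STATE`, the function `check_programs`, the program names, and the `relaunch` and `notify` actions are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical label for the "first state"; the text does not name it.
FIRST_STATE = "ok"


def check_programs(
    status_indicators: Dict[str, str],
    actions: Dict[str, Callable[[str], None]],
) -> List[str]:
    """Perform the predetermined action for each program not in FIRST_STATE.

    Returns the names of the programs for which an action was performed.
    """
    handled = []
    for program, status in status_indicators.items():
        if status != FIRST_STATE:
            # Each program has its own corresponding, predetermined function.
            actions[program](program)
            handled.append(program)
    return handled


# Illustrative actions; a real system would relaunch a process or alert a user.
log: List[str] = []


def relaunch(program: str) -> None:
    log.append(f"relaunch {program}")


def notify(program: str) -> None:
    log.append(f"notify operator about {program}")


# One status indicator per program executing on the system (assumed values).
indicators = {"web_server": "ok", "mail_server": "hung", "database": "stopped"}
actions = {"web_server": relaunch, "mail_server": relaunch, "database": notify}
handled = check_programs(indicators, actions)
```

Keeping the actions in a per-program table is one way to let the response to a failure of one program differ from the response to a failure of another.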
In one particular embodiment, the first status indicator, which indicates that the first software program is not in a fault state, is provided by the first software program to another software program which is also executing on the system. In this particular embodiment, the first status indicator resets a counter which corresponds to the first software program and which is controlled by that other software program. The other software program determines whether the first software program is in a fault state by examining the value of the counter.
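A counter-based arrangement of this kind might look like the following Python sketch. The summary says only that the status indicator resets the counter and that the monitoring program examines its value. The periodic `tick` that increments each counter, the `threshold` at which a fault is declared, and the class and method names are assumptions made for illustration.

```python
from typing import Dict, List


class Watchdog:
    """Hypothetical monitoring program keeping one counter per program."""

    def __init__(self, programs: List[str], threshold: int) -> None:
        # Assumption: a program is treated as faulted once its counter
        # reaches `threshold` ticks without a status indicator arriving.
        self.threshold = threshold
        self.counters: Dict[str, int] = {name: 0 for name in programs}

    def receive_status(self, program: str) -> None:
        """Called by a monitored program to report it is not in a fault state."""
        self.counters[program] = 0

    def tick(self) -> List[str]:
        """Advance every counter by one and return the programs deemed faulted."""
        faulted = []
        for program in self.counters:
            self.counters[program] += 1
            if self.counters[program] >= self.threshold:
                faulted.append(program)
        return faulted


# web_server keeps reporting; mail_server never does.
dog = Watchdog(["web_server", "mail_server"], threshold=3)
results = []
for _ in range(3):
    dog.receive_status("web_server")
    results.append(dog.tick())
```

In this sketch a program that stops sending status indicators is detected without any user monitoring, because its counter is no longer reset and eventually reaches the threshold.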
Computer systems which practice the methods of the invention are also described. Further, computer readable media having software which allows the computer systems to perform the methods of the present invention are described.