1. Field of the Invention
Embodiments of the present invention generally relate to techniques for monitoring processes running on a computer system and, more particularly, to a method and apparatus for monitoring at least one process during an abnormal exit of the process.
2. Description of the Related Art
Monitoring computer system processes includes monitoring process activities, such as the starts and exits of a process. Monitoring a process also helps detect process exit events and determine whether the process exited gracefully or abnormally, so that an appropriate corrective action may be taken. Most conventional operating systems provide a mechanism for monitoring processes having ancestral relationships; for example, a parent process may monitor its various child processes. Further, in POSIX compliant operating systems, system calls such as “wait(2)”, “waitpid(2)”, “wait4(2)” and the like provide a mechanism to monitor child processes. However, such operating systems do not provide a mechanism for monitoring processes that have no ancestral relationship to the monitoring process.
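By way of illustration only, the parent-child monitoring described above may be sketched as follows in Python, whose os module wraps the POSIX wait family. The helper names describe_exit and run_and_wait are hypothetical, and treating an exit code of zero as a graceful exit is an assumption made for this sketch rather than a definition taken from the present disclosure.

```python
import os
import signal


def describe_exit(status):
    """Classify a raw wait status as graceful or abnormal.

    Assumption for this sketch: exit code 0 means graceful; a non-zero
    exit code or termination by a signal means abnormal.
    """
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
        kind = "graceful" if code == 0 else "abnormal"
        return (kind, f"exit code {code}")
    if os.WIFSIGNALED(status):
        return ("abnormal", f"killed by signal {os.WTERMSIG(status)}")
    return ("unknown", f"raw status {status}")


def run_and_wait(child_body):
    """Fork a child, run child_body in it, and block until the child exits."""
    pid = os.fork()
    if pid == 0:
        # Child: never return into the parent's code path.
        code = 1
        try:
            child_body()
            code = 0
        finally:
            os._exit(code)
    # Parent: waitpid works here only because pid is our own child.
    _, status = os.waitpid(pid, 0)
    return describe_exit(status)


if __name__ == "__main__":
    print(run_and_wait(lambda: None))
    print(run_and_wait(lambda: os.kill(os.getpid(), signal.SIGKILL)))
```

If os.waitpid is called with the identifier of a process that is not a child of the caller, the call fails with ECHILD (raised in Python as ChildProcessError), which is the limitation noted above.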
Various efforts have been made to provide a mechanism to monitor processes without an ancestral relationship between the processes. One approach is to have a monitoring process poll the processes to be monitored at regular intervals to check the status of each process. This approach has many limitations. For example, a process exit cannot be reported instantaneously, because it is noticed only at the next polling interval, and repeated polling by the monitoring process wastes valuable central processing unit (CPU) resources. More importantly, this approach does not provide information on whether the process exited gracefully or abnormally.
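The polling approach may be sketched, for illustration only, as shown below. The sketch assumes a POSIX system and probes the monitored process with signal 0, which checks for existence without delivering a signal. The function names is_alive and poll_until_exit and the default interval are hypothetical.

```python
import os
import time


def is_alive(pid):
    """Probe pid with signal 0: existence is checked, nothing is delivered."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but is owned by another user
    return True


def poll_until_exit(pid, interval=0.5, timeout=None):
    """Poll pid every `interval` seconds until it disappears.

    Returns the seconds waited, or None if `timeout` elapsed first.
    Only the fact of disappearance is observed; no exit status is
    available, so graceful and abnormal exits look identical.
    """
    start = time.monotonic()
    while is_alive(pid):
        if timeout is not None and time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)  # the exit may go unnoticed for up to `interval`
    return time.monotonic() - start
```

The sketch exhibits each limitation noted above: detection lags by up to one interval, every probe costs a system call, and the return value carries no exit status. A further caveat is that the operating system may reuse process identifiers, so a probe may succeed against an unrelated process.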
Another approach is to re-architect the processes to be monitored to make them co-operative with the monitoring process. However, this may require modifying the existing processes and applications, and therefore such a solution is not considered a viable approach because the costs required to implement such modifications generally outweigh the benefits.
Further, in various computing environments, when an abnormal exit of a process (a failed process) is detected, a special processing technique, such as a failover process, is invoked. When the failed process is associated with a Transmission Control Protocol (TCP) connection, the failover process may be needed for the TCP connection. In general, a TCP connection includes a local end and a remote end. The local end of the TCP connection is associated with the process and the remote end of the TCP connection is connected to a user computer. When a TCP connection is failed over, the failed process associated with the connection is re-initiated or another process capable of handling the failed TCP connection is initiated. One of the major unmet challenges in failing over the TCP connection is that the remote end of the TCP connection should not be disturbed.
Accordingly, there exists a need to provide a method and apparatus for monitoring processes to detect abnormal process exit.