The present invention relates generally to distributed systems, and more particularly to remote administration in distributed systems.
Processing devices or systems are widely used for a variety of applications, from controlling complex machinery to providing a communications medium between users. In many instances it is critical that the processes performed by a processing device or system are maintained within acceptable tolerances. For example, due to various circumstances, a process can consume a disproportionate amount of the system resources of the processing device, thereby preventing other processes from operating in a desirable manner. Likewise, processes often can become inoperable, whether by termination by a user or as a result of a malfunction of the process or processing device. The failure of critical systems often results from the excessive consumption of system resources or the termination of a process. For example, processing devices often are used to control the operation of various manufacturing machines in a manufacturing line. If the control program (a process) used to control one of the machines unexpectedly terminates, the machine could malfunction, which is typically costly in repairs and/or lost manufacturing time.
The management of the processes performed by one or more processing devices is further complicated when the processing devices are dispersed over relatively large distances as clients to a central management system. In this case, it often is difficult to effectively and efficiently manage the operation of the clients due to the time and effort necessary to travel between the locations of the processing devices. Accordingly, mechanisms have been developed to allow administrators to remotely view the status of a client, and in limited instances, modify the operation of the client remotely. However, these mechanisms have a number of limitations. For one, the clients typically require feedback from the central management system before their configuration and/or operation can be altered. Should the communications link between the central management system and a client fail, or should the client become inoperable (e.g., freeze up), there usually is no way for the administrator to remotely direct the client to reset or attempt to reconnect. Likewise, the delay introduced while the client is waiting for feedback or direction from the central management system often can hinder the efficient operation of the client in the event that a process has been determined to exceed its threshold system resource consumption.
Mechanisms therefore have been developed to address the management and control of processes performed by a client by monitoring their system resource consumption. These mechanisms typically compare one or more system resource consumptions by a process to one or more corresponding thresholds. When one or more of these thresholds are exceeded, the monitoring mechanism often performs an action to address the excessive system resource consumption, such as by displaying a query box informing the user of the status of the process or terminating the process. However, although thresholds can be used to detect problematic processes, their usefulness is somewhat limited due to the discrete nature of a threshold. For example, assume that a process is allotted a maximum (i.e., a threshold) of five megabytes (MB) of virtual memory. In this case, any degradation in the performance of the process generally would only be detected once the process consumes more than five MB of virtual memory. If the process consumed only 4.9 MB of virtual memory, it could be assumed that the performance of the process is, or shortly will be, degraded to some extent. However, since the consumption of 4.9 MB of virtual memory does not exceed the threshold, no indication typically is made that the performance of the process is degraded or that degraded performance is imminent.
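The limitation of a discrete threshold can be illustrated with a minimal sketch. The names used here (ProcessSample, exceeds_threshold, proc_a, proc_b) are hypothetical and are chosen only for illustration; they do not describe any particular monitoring mechanism. The sketch assumes a single virtual memory threshold of five MB and shows that a process consuming 4.9 MB is not flagged, even though degraded performance may be imminent.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # bytes per megabyte (assumed convention for this sketch)


@dataclass
class ProcessSample:
    """Hypothetical snapshot of one process's virtual memory use."""
    name: str
    virtual_memory_bytes: int


def exceeds_threshold(sample: ProcessSample, threshold_bytes: int) -> bool:
    """Discrete check: report a problem only when consumption is strictly
    greater than the threshold."""
    return sample.virtual_memory_bytes > threshold_bytes


# Assumed threshold from the example in the text: five MB of virtual memory.
THRESHOLD_BYTES = 5 * MB

samples = [
    ProcessSample("proc_a", int(4.9 * MB)),  # just under the threshold
    ProcessSample("proc_b", int(5.1 * MB)),  # just over the threshold
]

# Map each process name to whether the discrete check flags it.
flags = {s.name: exceeds_threshold(s, THRESHOLD_BYTES) for s in samples}

for name, flagged in flags.items():
    print(f"{name}: {'threshold exceeded' if flagged else 'no indication'}")
```

In this sketch the two processes differ by only 0.2 MB of consumption, yet the check yields an indication for one and none for the other, which reflects the all-or-nothing behavior described above.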
Accordingly, an improved system and method for remotely managing clients performing critical processes would be advantageous. Specifically, a method and system are needed for managing processes at a client using a more accurate determination of the status of the processes.