1. Field of the Invention
The present invention generally relates to a system for monitoring an operating system. More specifically, a monitor mechanism based on a pseudo filesystem, using standard filesystem interfaces and a file tree representation, takes event requests from event consumers and forwards them to event producers, receives event occurrence information from the event producers, and notifies the event consumers of this information, eliminating the need for periodic polling and specialized monitoring APIs.
2. Description of the Related Art
System administrators in large data centers typically monitor the health of an operating system (OS) running on a computer server with centralized monitoring applications. Such a centralized monitoring application typically has an agent that runs in each OS instance and typically does periodic polling to collect OS-health related data. This data is then analyzed either by the central application or by the agent in the monitored OS itself, which typically relays the information to the central application. Whenever bad health in a monitored OS is detected, the central monitoring application generates an event for the system administrator. In the simple case, this event can cause a notification (e.g., email) to be sent to the administrator; in a more sophisticated case, a corrective action can be taken (e.g., the execution of a specified program).
There are two major problems with the periodic polling approach of existing OS-health monitoring applications:
1. Since OS data changes dynamically and rapidly, the freshness or accuracy of the data collected, and the ability to take action based on this time-sensitive data, are dependent on the length of the polling period. If the period (time between two consecutive polls) is too long, then the OS may have gone into an “unhealthy” state after one poll and before the next, so that the problem cannot be detected, and hence corrective actions cannot be taken, in time.
2. The polling activity adds overhead to the OS. The more polling users there are in the OS instance, the greater the overhead.
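By way of illustration only, the trade-off between the two problems above can be sketched with a small simulation. The sketch is not taken from any existing monitoring application; the function names, the 12 to 18 second fault window, and the 10 second and 5 second polling periods are all hypothetical values chosen for the example.

```python
# Hypothetical simulation: every name and value here is an illustrative
# assumption, not part of any real monitoring agent.

def unhealthy_at(t, fault_start, fault_end):
    """Return True if the simulated OS is unhealthy at time t (seconds)."""
    return fault_start <= t < fault_end


def poll_detects(period, horizon, fault_start, fault_end):
    """Poll at t = 0, period, 2*period, ... up to horizon (inclusive).

    Returns the time of the first poll that observes the unhealthy state,
    or None if no poll lands inside the fault window.
    """
    t = 0
    while t <= horizon:
        if unhealthy_at(t, fault_start, fault_end):
            return t
        t += period
    return None


# A transient unhealthy state lasting from t=12 s to t=18 s.
FAULT_START, FAULT_END = 12, 18

# A 10 s period samples at 0, 10, 20, ... and never sees the fault.
missed = poll_detects(10, 60, FAULT_START, FAULT_END)

# A 5 s period samples at 0, 5, 10, 15, ... and sees the fault at t=15,
# three seconds after it began, at the cost of twice as many polls.
caught = poll_detects(5, 60, FAULT_START, FAULT_END)

print(missed)  # None
print(caught)  # 15
```

In this sketch, shortening the period reduces the chance of missing the unhealthy state but doubles the number of polls, which is the overhead described in the second problem.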
To address the above problems, several non-polling event notification methods have been proposed in the past, each requiring the application to use a specialized monitoring API (Application Programming Interface) to register its interest in being notified of an event and to get more details on an event after the event has occurred.
The problem with specialized monitoring APIs is that language bindings have to be developed and maintained for most of the predominant languages used for system management applications, and there are many of these languages, e.g., Perl, C, C++, Java, Python, etc. The complexity of having to maintain a specialized API over multiple versions of the OS, over multiple languages in an OS version, and over multiple versions of each language that are supported within one OS version often deters the use of these specialized APIs by existing systems management tools.
Thus, in view of these problems with polling and specialized APIs, a need exists for a more efficient method of monitoring the health of operating systems, preferably using a method that does not require polling and does not use specialized APIs.