1. Field of the Invention
The invention relates generally to resource management, and in particular, to monitoring resources in a computing environment.
2. Background Information
A business typically uses a number of hardware and software assets to support the operation of the business and to provide services to customers. These assets typically include traditional computing hardware such as workstations, servers and printers, network hardware such as routers, switches and firewalls, and software assets such as billing systems, customer databases and network management applications.
To effectively manage these assets, businesses typically make use of a number of specialist software applications, each focused on addressing a specific need. For example, a network management application focuses on managing the computing and network assets used by the business. Typically, such applications employ various techniques to manage the assets, such as: discovery technology to find assets and the relationships between them; a model, typically based on a standard such as the Distributed Management Task Force Common Information Model (DMTF CIM), and typically implemented using a relational database management system (RDBMS); a user interface that allows the user to interact with the managed resources; a configuration mechanism that allows the behavior of the management application to be tailored to suit the business needs; an event/alarm database and associated functions such that the resources may be actively polled or that events/alarms originating from the managed resources may be categorized, managed, and archived; and a system, commonly referred to as a Root Cause Analysis (RCA) engine, that attempts to identify the impact of events/alarms in the context of the topology held in the application's model.
Certain characteristics of how resources are typically monitored are summarized below. The monitoring of interest is that which typically results in an event or alarm being created, updated or deleted in an event management system (such as IBM Tivoli Netcool Omnibus).
A resource may be actively monitored by a management application in a number of ways including, but not limited to, Internet Control Message Protocol (ICMP) echo polling (pings) and Simple Network Management Protocol (SNMP) data retrieval. A management application may passively listen for alarms originating from managed resources; typically these include SNMP traps or informs. A management application may parse log files generated by resources for specific information. The management application may also use resource-specific probes to obtain information in a resource-specific way, such as via a published Application Programming Interface (API).
Events or alarms resulting from these monitoring techniques typically have a notion of severity that indicates whether the event represents a problem or a resolution. Varying degrees of severity are typically provided for, such as clear (a resolution event), warning (be aware of a certain condition) and critical (a managed resource has a problem that requires attention).
A management application may apply some additional processing to events or alarms relating to resources. This processing includes thresholding of data to upgrade or downgrade severity if an aspect of the event/alarm data exceeds or drops below an arbitrary threshold, or considering an event or alarm with respect to a topology or resource model. Although such techniques may satisfy typical use-cases, the monitoring capabilities of management applications remain limited.
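By way of illustration only, the conventional severity levels and threshold-based upgrading or downgrading of severity described above might be sketched as follows. This is a minimal sketch and not the implementation of any particular management application: the names `Severity`, `Event` and `apply_thresholds`, the metric name, and the threshold values are all hypothetical and chosen solely for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale mirroring the levels named in the text."""
    CLEAR = 0      # a resolution event
    WARNING = 1    # be aware of a certain condition
    CRITICAL = 2   # a managed resource has a problem that requires attention


@dataclass
class Event:
    """Hypothetical event record; real event systems carry many more fields."""
    resource: str
    metric: str
    value: float
    severity: Severity = Severity.CLEAR


def apply_thresholds(event: Event, warning_at: float, critical_at: float) -> Event:
    """Return a copy of the event whose severity reflects the metric value.

    Severity is upgraded when the value exceeds a threshold and downgraded
    when it no longer does. The thresholds are arbitrary inputs (assumption).
    """
    if event.value > critical_at:
        severity = Severity.CRITICAL
    elif event.value > warning_at:
        severity = Severity.WARNING
    else:
        severity = Severity.CLEAR
    return Event(event.resource, event.metric, event.value, severity)


# Example: a polled CPU utilisation reading on a hypothetical router.
reading = Event(resource="router-1", metric="cpu_percent", value=95.0)
upgraded = apply_thresholds(reading, warning_at=70.0, critical_at=90.0)
```

In this sketch a reading of 95.0 against thresholds of 70.0 and 90.0 is upgraded to critical, while a later reading at or below 70.0 would be downgraded to clear. Each event is evaluated in isolation against fixed thresholds, without reference to other resources.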