The present invention relates to computer networks, and more particularly, to management of components in computer networks.
Information Technology (IT) systems, methods and computer program products, including, for example, computer networks, have grown increasingly complex with the use of distributed client/server applications, heterogeneous platforms and/or multiple protocols all on a single physical backbone. This increase in system complexity may make solution management more difficult. A solution may include a collection of software and hardware components that addresses specific customer business requirements. In a solution, problem determination (PD) may include problem detection, isolation, and resolution using components participating in the solution across a multiplicity of platforms.
In conventional automatic computing system management, also known as autonomic computing, components, such as applications, middleware, hardware devices and the like, generate data that indicates the status of the component. An adapter may be used to convert this component status data into a common format. For example, International Business Machines Corporation's Generic Log Adapter (GLA) may be used in autonomic computing systems to collect data from different data sources with many different formats. The GLA is a rule-based engine that can translate data from different native log formats into a standard format, known as the Common Base Event (CBE) format. This component status data is typically consumed by a management function that monitors the system and/or performs problem analysis and resolution. The management function may, for example, be a management program that consumes the data for analysis and/or display.
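By way of illustration only, the rule-based translation of native log formats into a common format may be sketched as follows. This is a minimal sketch under stated assumptions and is not the GLA implementation or the CBE schema: the two native log formats, the rule names, the function `to_common_event`, and the output field names are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical rules: each pairs a source name with a pattern for one
# native log format. Real adapters use far richer rule sets.
RULES = [
    # Bracketed style: [2004-05-01 12:00:00] [error] disk full
    ("web_server",
     re.compile(r"^\[(?P<time>[^\]]+)\] \[(?P<severity>\w+)\] (?P<msg>.*)$")),
    # Prefixed style: 2004-05-01T12:00:00 ERROR XYZ123: slow response
    ("app_server",
     re.compile(r"^(?P<time>\S+) (?P<severity>[A-Z]+) (?P<msg_id>\w+): (?P<msg>.*)$")),
]


def to_common_event(line: str) -> Optional[dict]:
    """Translate one native log line into a common (hypothetical) event dict.

    Returns None when no rule matches the line.
    """
    for source, pattern in RULES:
        match = pattern.match(line)
        if match:
            fields = match.groupdict()
            return {
                "source": source,
                "time": fields["time"],
                "severity": fields["severity"].upper(),
                "msg_id": fields.get("msg_id"),  # absent in some formats
                "msg": fields["msg"],
            }
    return None
```

In such a sketch, a management function would consume only the common event dictionaries and would not need to know the native format of any individual component.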
Knowledge bases have conventionally been used to map component status data, such as error log messages, to symptoms and eventually to fixes for problems. For example, there are symptom databases utilized by IBM, Armonk, N.Y., that map WebSphere error log messages to symptoms and fixes. These databases typically work on the assumption that if a specified error message (e.g., message “123”) or sequence of error messages is received from a specified component (e.g., component “XYZ”), then a particular problem is occurring (e.g., the performance is slow) and a predefined corrective action (e.g., increase the parameter “buffsize” to 10) will likely fix the problem. However, in some instances, an operational error and/or other problem can manifest in a variety of sources as a symptom of a larger root problem elsewhere.
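For purposes of illustration, the conventional mapping from an error message to a symptom and a fix may be sketched as a simple lookup. This is a hypothetical sketch and not the structure of any actual IBM symptom database: the names `Symptom`, `SYMPTOM_DB`, and `diagnose` are assumptions, and the single entry merely mirrors the example given above.

```python
from typing import NamedTuple, Optional


class Symptom(NamedTuple):
    problem: str  # the problem assumed to be occurring
    fix: str      # the predefined corrective action


# Hypothetical knowledge base keyed by (component, message id).
SYMPTOM_DB = {
    ("XYZ", "123"): Symptom(
        problem="the performance is slow",
        fix='increase the parameter "buffsize" to 10',
    ),
}


def diagnose(component: str, message_id: str) -> Optional[Symptom]:
    """Return the symptom and fix recorded for this message, if any."""
    return SYMPTOM_DB.get((component, message_id))
```

Because the lookup in this sketch is keyed only on the component that reported the message, it cannot by itself attribute the message to a root problem located in a different component.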