1. Field of the Invention
The present invention relates to a method and a system for managing programs in a data-processing system.
2. Description of the Related Art
In recent years, because many operations of companies and other organizations are performed using data-processing systems, these systems are required to offer high reliability and availability. When a failure or abnormality occurs in a data-processing system, its administrators must investigate the cause and implement countermeasures quickly and accurately, in order to minimize economic losses due to suspension of business and loss of credibility with customers.
Therefore, a variety of technologies have been developed for diagnosing the failure status when a data-processing system fails. For example, such a technology is disclosed in Japanese Patent Application Laid-Open Publication No. 08-305600.
However, in current data-processing systems with large and complicated configurations, it is often difficult even to estimate the point causing a failure. For example, in some cases the components, such as an application server, a storage apparatus, or network equipment, are each installed in geographically remote areas. Also, if each component constituting the data-processing system comes from a different manufacturer, cooperation may not be obtainable from every manufacturer.
Under these conditions, an administrator may suddenly find that each component of the data-processing system is sending a large number of messages reporting abnormalities in detail. In this case, the administrator must spend considerable time and effort to identify the point causing the failure.
Therefore, a technology is required for quickly narrowing down the components causing a failure when one occurs in the data-processing system. Quickly narrowing down the failure occurrence point is especially important in a technology that performs autonomous control of the data-processing system and enables autonomous recovery from a failure.