With the spread of the Internet, web applications, and lower-cost, higher-performance computer hardware in recent years, more and more systems have been developed in a distributed network environment. That is, instead of centralizing all data and programs in a single, large, and expensive mainframe computer as was previously the case, many inexpensive computers are connected through a network to form a system. Although no single one of these inexpensive computers can compete with a mainframe computer in terms of throughput and reliability, it is possible to keep data safe and available by sharing the same data among a plurality of computers. This is because even if a failure occurs in one computer, the shared data can be provided by another computer. Moreover, by having a plurality of computers carry out processing in a parallel and distributed manner, the distributed system as a whole can achieve throughput comparable to that of a large mainframe computer.
However, in a distributed processing environment where processing is performed in parallel by a plurality of computers that are geographically and logically distributed, it is difficult to implement a system change associated with a system failure, extension, or the like. Since each of the computers constituting the distributed processing environment is typically less reliable than a mainframe computer, it is more likely that one of the computers will fail at some point. As described above, a failure of one of the plurality of computers does not immediately affect the operation of the entire system. Nevertheless, the failed computer needs to be replaced or repaired as soon as possible.
In a distributed processing environment, however, because the system is geographically and logically distributed, it is not always easy to identify the physical location of such a failed computer or to determine how the failure of the computer affects logical dependencies between processes performed by software programs.
Besides failures, changes of system configuration also occur frequently. For example, assume that a company has launched a website. The company has estimated the load on the website on the basis of the predicted volume of traffic and has built a web server with the desired performance. However, it often happens that the server goes down because the website gains unexpected popularity and is accessed by far more visitors than expected. To cope with such a situation, the server may simply be replaced with one with greater capacity. Other possible solutions include failover clustering, in which processing is passed to another server with an identical configuration upon failure of one server, and load distribution clustering, in which a mechanism such as round robin or a load balancer is used. In any case, however, the geographical and logical relationships between components of the distributed processing environment may change greatly, and it may therefore take considerable effort to reconfigure the existing system as an integrated system. Moreover, the system resulting from the reconfiguration may not operate properly. In fact, some statistics show that 85 percent of system failures are caused by system changes.
Thus, operational costs in a distributed processing environment have been increasing. Since a distributed network system having a size exceeding a certain level is not manageable by human intervention alone, it is necessary to use an appropriate management system. This involves system management costs and operational costs (including personnel costs), which are said to be as much as 70 percent of the total IT costs.
One concept behind operation management tools used for such purposes is the Configuration Management Database (CMDB) compiled by the Information Technology Infrastructure Library (ITIL) (British government's trademark). A CMDB is a system which collects information about logical dependencies or interactions between components of a distributed network, such as information about the configuration of each of the computers connected to each other, information about applications running on the computers, configuration information about a network-attached storage (NAS) connected to the computers, and configuration information about a storage area network (SAN) directly connected to the network. The collected data may be passed to a graphical user interface (GUI) display tool, in which connections between a web server (e.g., Apache), an application server (e.g., WebSphere (IBM's trademark)), and a database system (e.g., DB2 (IBM's trademark)) are represented by blocks and links therebetween.
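As an illustration only, the logical dependencies that a CMDB collects can be thought of as a directed graph of components. The following Python sketch is not taken from any CMDB product; the component names, the DEPENDS_ON table, and the affected_by function are hypothetical and serve only to show how link information of this kind could be used to find the components affected by a failure.

```python
from collections import deque

# Hypothetical dependency data: each component maps to the components
# it depends on (the "links" between "blocks" in a GUI display tool).
DEPENDS_ON = {
    "web server": ["application server"],
    "application server": ["database"],
    "database": ["SAN volume"],
    "SAN volume": [],
}


def affected_by(failed):
    """Return the set of components that directly or indirectly depend on `failed`."""
    # Reverse the edges so that each component maps to its dependents.
    dependents = {}
    for component, dependencies in DEPENDS_ON.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, []).append(component)

    # Breadth-first traversal over the reversed edges.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen
```

For example, under these assumed data, a failure of "database" affects "application server" and "web server". Note that this graph carries no information about where the underlying hardware is physically located.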
A product called Change and Configuration Management Database (CCMDB) provided by International Business Machines Corporation (IBM) implements the CMDB and is, at the same time, capable of managing configuration changes. The CCMDB uses a secure shell (SSH) to automatically and remotely execute a necessary command and collect data. These functions are described in PCT publications Nos. WO2004/010246, WO2004/010292, WO2004/010293, and WO2004/010298.
Japanese Unexamined Patent Application Publication No. 2000-13372 discloses a technique for managing facility information and location information of a device together, using a unique number of a network node as a key. With this technique, a physical connection configuration of network nodes to be managed is stored in a physical database, logical operation information resulting from monitoring of the network nodes is stored in a logical database, and current operation information and physical operation information retrieved from the physical and logical databases with respect to a specific network node are displayed on a display unit.
Japanese Unexamined Patent Application Publication No. 2005-292906 relates to a system for managing asset information and discloses a technique in which a physical identifier for identifying an asset and a logical identifier (e.g., Internet protocol (IP) address) corresponding to the physical identifier are stored, a physical identifier corresponding to an entered logical identifier is retrieved, and asset information corresponding to the retrieved physical identifier is output.
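The two-step retrieval described in this publication can be sketched as follows. This is a minimal Python illustration under assumed data, not the implementation disclosed in the publication; the Asset record, the sample tables, the identifiers, and the lookup_asset function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    physical_id: str   # identifier attached to the physical asset
    description: str   # asset information to be output


# Hypothetical stored correspondence: logical identifier (IP address)
# to physical identifier.
LOGICAL_TO_PHYSICAL = {
    "192.0.2.10": "ASSET-0001",
    "192.0.2.11": "ASSET-0002",
}

# Hypothetical asset information keyed by physical identifier.
ASSETS = {
    "ASSET-0001": Asset("ASSET-0001", "rack-mounted server"),
    "ASSET-0002": Asset("ASSET-0002", "desktop computer"),
}


def lookup_asset(logical_id: str) -> Optional[Asset]:
    """Resolve a logical identifier to asset information, or None if unknown."""
    # Step 1: retrieve the physical identifier for the entered logical identifier.
    physical_id = LOGICAL_TO_PHYSICAL.get(logical_id)
    if physical_id is None:
        return None
    # Step 2: output the asset information for that physical identifier.
    return ASSETS.get(physical_id)
```

With these assumed tables, lookup_asset("192.0.2.10") returns the record for ASSET-0001, and an unregistered address returns None.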
Japanese Unexamined Patent Application Publication No. 2006-79350 discloses a technique in which a media access control (MAC) address of a network card attached to a computer is associated with the computer's main body and stored in a database, and the location of the computer's main body is displayed on a layout screen such that the computer can be tracked even when it is moved.
With the conventional techniques described above, it is possible to provide information about dependencies between software programs running in a distributed processing environment. It is also possible to provide a method for managing, using unique physical information such as a MAC address, physical location information of a computer for running software programs.
With the scheme of the conventional techniques described above, it is possible to detect dependencies between logical objects, such as software programs, and create link information, and it is possible to identify physical location information on the basis of a MAC address. However, with this scheme, it is not possible to properly associate a software program with the hardware on which the software program is running. In fact, the CMDB framework is designed such that information about the physical location of hardware is abstracted as much as possible, and such that the location of a computer and the software programs running on the computer are not meant to be detected together.
In practice, however, if an air conditioner in Room B on the first floor of Building A fails and the room temperature becomes too high for computers to operate, or if a power failure occurs in an area where computers are located, it is necessary to locate software programs running on such computers and thus affected by such a problem.
However, it is difficult for the conventional scheme to automatically detect software programs running on a computer in a particular area, since the physical location of hardware is abstracted. It may be possible to detect such software programs by obtaining, using a function of a network, a MAC address of a network card attached to a computer on which the software programs are running. In this case, it is necessary to manually refer to a hardware master data library using the obtained MAC address as a clue. This requires a visual check involving a heavy human workload.
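The manual cross-reference described above amounts to joining two separate data sets on the MAC address. The following Python sketch illustrates that join under assumed data; the HARDWARE_MASTER and RUNNING_SOFTWARE tables, the location tuples, the MAC addresses (taken from the range reserved for documentation), and the software_in_location function are hypothetical and are not part of any conventional technique described here.

```python
# Hypothetical hardware master data library:
# MAC address -> physical location (building, floor, room).
HARDWARE_MASTER = {
    "00:00:5e:00:53:01": ("Building A", "1F", "Room B"),
    "00:00:5e:00:53:02": ("Building A", "1F", "Room B"),
    "00:00:5e:00:53:03": ("Building A", "2F", "Room C"),
}

# Hypothetical logical information obtained over the network:
# MAC address -> software programs running on that computer.
RUNNING_SOFTWARE = {
    "00:00:5e:00:53:01": ["web server"],
    "00:00:5e:00:53:02": ["application server", "database"],
    "00:00:5e:00:53:03": ["batch scheduler"],
}


def software_in_location(location):
    """Return {MAC address: [software programs]} for computers at `location`."""
    affected = {}
    for mac, hardware_location in HARDWARE_MASTER.items():
        if hardware_location == location:
            # Join on the MAC address; a computer with no known
            # software programs yields an empty list.
            affected[mac] = list(RUNNING_SOFTWARE.get(mac, []))
    return affected
```

Calling software_in_location(("Building A", "1F", "Room B")) with these assumed tables returns the software programs on the two computers in that room. In the conventional scheme no such combined table exists, which is why the equivalent check has to be carried out by hand.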