Distributed computer systems are well known today. They may comprise multiple servers interconnected via networks to service client workstations. For example, a client workstation requests a web service, and the request is forwarded from a proxy server to a web server in the distributed computer system. The web server itself may not possess all the applications or data needed to fully respond to the client request. In such a case, the web server may forward part or all of the client request to another server, or generate another request for the other server, to obtain the requisite service or data. For example, the client request can be to make travel reservations involving airplane tickets, hotel reservations and a rental car, and a (front end) web server acts as the interface to the client. Upon receiving the request from the client for airplane tickets, the front end web server may forward the request, via a network, to another server on which an airplane reservation application runs. Likewise, upon receiving the request from the client for hotel reservations, the front end web server may forward the request, via another network, to another server on which a hotel reservation application runs. Likewise, upon receiving the request from the client for a rental car, the front end web server may forward the request, via another network, to another server on which a car rental application runs. In this example, none of these other servers itself manages the corresponding database, so each of them requests the corresponding data (e.g. availability and pricing) from a respective database via a respective network. Thus, in this example, multiple servers may be required to respond to the client request for a compound travel reservation. Likewise, other types of server applications, such as a messaging server, an authentication server, a batch server or a reporting server, may be required to assist an application server in handling a client request.
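By way of illustration only, the travel reservation example can be sketched as a small dependency map, together with a routine that lists every server a single client request may touch. The server names and the `servers_involved` function are hypothetical and are not taken from any particular system; the sketch merely assumes that each server forwards requests to the servers listed for it.

```python
# Hypothetical topology for the travel reservation example.
# Each key is a server; each value lists the servers it forwards requests to.
DEPENDS_ON = {
    "proxy": ["front_end_web"],
    "front_end_web": ["airline_app", "hotel_app", "car_app"],
    "airline_app": ["airline_db"],
    "hotel_app": ["hotel_db"],
    "car_app": ["car_db"],
    "airline_db": [],
    "hotel_db": [],
    "car_db": [],
}


def servers_involved(start, depends_on):
    """Return every server reachable from `start`, in breadth-first order."""
    seen = [start]
    queue = [start]
    while queue:
        node = queue.pop(0)
        for nxt in depends_on.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # Lists the proxy, the front end web server, three application
    # servers and three databases: eight servers for one client request.
    print(servers_involved("proxy", DEPENDS_ON))
```

Even in this simplified form, a single compound request depends on eight servers and seven network connections between them.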
To successfully respond to the client request for compound travel reservations (or to a client request for another service requiring assistance from other types of server applications), all of the requisite servers must be operating, and all of the network connections between them must be active. (If any of the servers is in a cluster, then at least one server in the cluster must be operating.) Occasionally, one or more of the servers (or server clusters) or the network connections between the servers (or server clusters) fail. The point of failure, or even the nature of the failure, may not be apparent to a systems administrator responsible for troubleshooting the problem. The troubleshooting task is compounded by the fact that there may be hundreds of active ports and network connections between the servers at any one time. Also, changes in the configuration of the distributed computer system may have been made but not reflected in the troubleshooting documentation. Consequently, it may be difficult to determine which servers to troubleshoot.
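The availability condition stated above can be sketched as follows, purely as an illustration. The sketch assumes that the up/down status of each server and each network connection is already known and supplied as dictionaries; the function names `cluster_up` and `find_failures`, and all server and cluster names, are hypothetical. A stand-alone server is modeled as a cluster with one member.

```python
def cluster_up(members, server_up):
    """A cluster is operating if at least one member server is operating."""
    return any(server_up.get(m, False) for m in members)


def find_failures(clusters, links, server_up, link_up):
    """Return the failed clusters and links; an empty list means the
    request can be served.

    clusters:  dict mapping cluster name -> list of member server names
    links:     list of (cluster_a, cluster_b) connections that must be active
    server_up: dict mapping server name -> bool (assumed known)
    link_up:   dict mapping (cluster_a, cluster_b) -> bool (assumed known)
    """
    failed = []
    for name, members in clusters.items():
        if not cluster_up(members, server_up):
            failed.append(("server", name))
    for link in links:
        if not link_up.get(link, False):
            failed.append(("link", link))
    return failed


if __name__ == "__main__":
    clusters = {
        "web": ["web1"],
        "hotel_app": ["h1", "h2"],  # two-server cluster
        "hotel_db": ["db1"],
    }
    links = [("web", "hotel_app"), ("hotel_app", "hotel_db")]
    server_up = {"web1": True, "h1": False, "h2": True, "db1": True}
    link_up = {("web", "hotel_app"): True, ("hotel_app", "hotel_db"): False}
    print(find_failures(clusters, links, server_up, link_up))
```

In this hypothetical run, the `hotel_app` cluster still counts as operating because `h2` is up, while the connection between `hotel_app` and `hotel_db` is reported as failed. In practice the status dictionaries are exactly what the administrator lacks, which is the difficulty described above.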
Accordingly, an object of the present invention is to facilitate the identification of a network connection or associated server which has failed.