1. Field of the Invention
The present invention is related to communications in distributed computing systems, and more specifically to endpoint-to-endpoint monitoring of communication status in node-based computing schemes.
2. Description of Related Art
In large-scale distributed computer systems, such as those using distributed software models to perform tasks, multiple nodes provide independent execution of sub-tasks. To keep such a system operational, and to provide for proper operation of distributed applications that use the multiple nodes to perform various tasks, the ability of each node to communicate with other nodes is tracked. In particular, the operational status of the hardware interfaces that connect the nodes is monitored and used to determine whether other nodes in the system can be communicated with. Further, node status is monitored to ensure that nodes that are to be used to perform tasks are operational.
Communication and status monitoring is typically centralized, with a monitoring application providing information about node and interface status. The monitoring application may use distributed agents to perform the monitoring on each node.
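For illustration only, the centralized arrangement described above may be sketched as follows. The sketch is not drawn from any particular system: the names AgentReport, CentralMonitor, submit and is_reachable are hypothetical, as are the rules that a node with no report is treated as unreachable and that a single operational interface suffices for communication.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AgentReport:
    """Status sent by a hypothetical per-node monitoring agent."""
    node: str
    node_up: bool
    interfaces: dict[str, bool]  # interface name -> operational?


@dataclass
class CentralMonitor:
    """Hypothetical centralized monitoring application."""
    reports: dict[str, AgentReport] = field(default_factory=dict)

    def submit(self, report: AgentReport) -> None:
        # Keep only the most recent report for each node.
        self.reports[report.node] = report

    def is_reachable(self, node: str) -> bool:
        # Assumption: a node that has never reported is unreachable.
        report = self.reports.get(node)
        if report is None or not report.node_up:
            return False
        # Assumption: one operational interface is enough to communicate.
        return any(report.interfaces.values())


monitor = CentralMonitor()
monitor.submit(AgentReport("node-a", True, {"eth0": True, "eth1": False}))
monitor.submit(AgentReport("node-b", True, {"eth0": False}))
monitor.submit(AgentReport("node-c", False, {"eth0": True}))
```

In this sketch every reachability query is answered by the single monitor instance, so the answer reflects only what the agents last reported to that central point.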