The present invention relates generally to network monitoring systems. More specifically, the present invention relates to monitoring the performance of servers in a networked environment.
Various methods and tools have been used in the prior art to monitor the performance of networked computer devices such as servers. Such monitoring tools include ping, port monitoring, and agents. None of these methods and tools has been totally reliable; they often produce false positives or miss actual failures. A brief description of the aforementioned tools and their associated shortcomings is provided below.
The Ping utility is essentially a system administrator's tool that is used to see if a computer is operating and also to see if network connections are intact. Ping uses the Internet Control Message Protocol (ICMP) Echo function, which is described in RFC 792. A small packet is sent through the network to a particular Internet Protocol (IP) address. This packet contains 64 bytes: 56 data bytes and 8 bytes of protocol header information. The computer that sent the packet then waits (or ‘listens’) for a return packet. If the connections are good and the target computer is up, a good return packet will be received. One solution to monitoring server performance is to ping the servers to be monitored and provide an alert when a ping fails. This solution has proven to be ineffective because a server can be hung while its network interface card is still responding to pings.
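By way of illustration only, the packet layout described above can be sketched in Python as follows. The sketch builds an ICMP Echo Request with an 8-byte header and 56 data bytes and computes the standard Internet checksum over it. The function names, the identifier and sequence values, and the payload contents are assumptions chosen for the example. Actually transmitting the packet would additionally require a raw socket and elevated privileges, which is omitted here.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for Echo (RFC 792); code is 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload_len: int = 56) -> bytes:
    """Return an Echo Request: 8-byte header followed by payload_len data bytes."""
    payload = bytes(i & 0xFF for i in range(payload_len))  # arbitrary filler data
    # The checksum field is zero while the checksum is being computed.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


# Illustrative values only.
packet = build_echo_request(identifier=0x1234, sequence=1)
```

A valid reply to such a packet shows only that something at the target address answered the echo. It says nothing about whether the services on that server are functioning, which is the shortcoming noted above.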
Another common technique is to provide an agent on each server to report back status to a monitoring device or server. This approach can cause false positives, which result in volumes of unnecessary support calls, or the reverse, i.e., no indication of an actual failure.
Another common technique is port monitoring, in which a port on a server is proactively monitored and its status is reported back to a monitoring device or server. A variation of this approach is to simply attempt to connect to the server. This can cause false positives, which result in volumes of unnecessary support calls, or the reverse, i.e., no indication of an actual failure.
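A connection-attempt check of the kind described above might look like the following minimal Python sketch. The function name, the default timeout, and the host and port in the usage example are assumptions chosen for illustration, not part of any particular monitoring product.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) if it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical target; replace with a real host and port to try it.
    print(port_is_open("127.0.0.1", 80))
```

Such a check can report success even when the application behind the port is not functioning, because the operating system completes the TCP handshake for a listening socket before the application ever accepts the connection. The check therefore shares the weakness described above: it may provide no indication of an actual failure.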
A typical server system can provide numerous services to associated client devices. The operating status of an individual service can be determined by sending a query to the monitored server. The server sends a reply that includes the operating status of the service, indicating whether the service is running. The queries and replies usually include at least one query and reply per service status requested. In order to determine the status of multiple services at a single server, the monitoring server must send multiple queries and receive multiple replies.
Another approach is to generate a single query for all of the services provided by a monitored server. The monitoring server sends the query representing a request for the status of multiple services on the monitored server. The monitored server generates a compilation of information regarding the services that it offers, and transmits this information to the monitoring server. Although this approach reduces the number of queries directed to a monitored server, it also generates a significant amount of data on all services running on the monitored server.
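The trade-off between the two query approaches described above can be sketched with a small in-memory simulation in Python. The MonitoredServer class, its method names, and the service names are hypothetical stand-ins for a real monitored server and its query protocol; no network traffic is involved.

```python
from dataclasses import dataclass


@dataclass
class MonitoredServer:
    """In-memory stand-in for a monitored server (hypothetical)."""
    services: dict[str, bool]  # service name -> whether it is running
    queries_received: int = 0

    def query_service(self, name: str) -> dict[str, bool]:
        """Answer one query with the status of one service."""
        self.queries_received += 1
        return {name: self.services[name]}

    def query_all(self) -> dict[str, bool]:
        """Answer one query with the status of every service."""
        self.queries_received += 1
        return dict(self.services)


def poll_per_service(server: MonitoredServer, wanted: list[str]) -> dict[str, bool]:
    """One query and one reply per requested service."""
    status: dict[str, bool] = {}
    for name in wanted:
        status.update(server.query_service(name))
    return status


def poll_aggregate(server: MonitoredServer, wanted: list[str]) -> tuple[dict[str, bool], int]:
    """One query for everything; return wanted statuses and the reply size."""
    report = server.query_all()
    return {name: report[name] for name in wanted}, len(report)


# Illustrative data only.
services = {"http": True, "smtp": True, "dns": False, "ftp": True, "ldap": True}
wanted = ["http", "dns"]

server_a = MonitoredServer(dict(services))
per_service_status = poll_per_service(server_a, wanted)

server_b = MonitoredServer(dict(services))
aggregate_status, entries_returned = poll_aggregate(server_b, wanted)
```

In this sketch the per-service approach costs two queries to learn two statuses, while the aggregate approach costs one query but returns entries for all five services, three of which the monitoring server did not ask for.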
Thus, there is a need for a monitoring system that does not rely on pings, agents, server connections, or port monitoring and thus does not have their associated vulnerabilities, but instead monitors the functionality of the actual device.