The present invention relates to methods for monitoring and testing the performance of a web site or other server system as experienced from multiple user locations on a computer network.
The performance of a web site or other Internet server system, as experienced by end users of the system, can vary significantly depending on the geographic locations of the users. For example, users in London may experience much greater response times than users in San Francisco. Such variations in end user experience may occur, for example, as the result of Internet traffic conditions, malfunctioning Internet routers, or malfunctioning DNS (Domain Name System) servers.
The ability to detect such location-dependent problems can be valuable to web site operators. For example, if users in a particular geographic region are known to frequently experience long response times, the web site operator can set up a mirror site within that region to service such users. The web site operator can also benefit from knowing whether a given problem is limited to specific geographic regions. For example, if it is known that a particular problem is seen by users in many different geographic locations, the web site operator can more easily identify the source of the problem as being local to the web site.
Some companies have addressed such needs of web site operators by setting up automated services for monitoring web sites from multiple geographic locations. These services are implemented using automated agents that run on computers at selected Internet connection points, or "points of presence." The points of presence are typically selected to correspond to major population centers, such as major cities throughout the world. The agents operate by periodically accessing the target web site from their respective locations as simulated users, and by monitoring response times and other performance parameters during such accesses. The agents report the resulting performance data over the Internet to a centralized location, where the data is typically aggregated within a database of the monitoring service provider and made available to the web site operator for viewing. The collected data may also be used to automatically alert the web site operator when significant performance problems occur.
A significant problem with the above approach is that the cost of setting up and maintaining agent computers in many different geographic regions is very high. For example, the monitoring service provider typically must pay for regional personnel who have been trained to set up and service the agent software and computers. Another problem is that users of the service can only monitor performance as seen for the fixed agent locations selected by the service provider.
The present invention provides a monitoring system and service in which community members monitor their respective Web sites, or other server systems, as seen from the computing devices of other community members. In a preferred embodiment, the system includes an agent component that runs on the computing devices of community members to provide functionality for accessing and monitoring end-user performance of a server system. By running the agent component on a computing device, a user effectively makes that computing device available to other community members for use as a remote monitoring agent. The agents are remotely programmable over a network, and may be programmed, for example, to execute a particular Web transaction while monitoring specified performance parameters (server response times, network hop delays, etc.). In one embodiment, the agent component monitors performance only when the host computing device is in an otherwise idle or low CPU-utilization state, and thus does not interfere with the ordinary operation of the host device.
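By way of illustration only, the idle-gated behavior of the agent component described above might be sketched as follows. The function names, the `sample_cpu` callable, and the 10% utilization threshold are hypothetical and are not taken from the specification; the transaction is represented by an arbitrary callable rather than a real Web request.

```python
import time
from typing import Callable, Optional


def timed_transaction(transaction: Callable[[], None]) -> float:
    """Run one transaction and return its elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    transaction()
    return time.perf_counter() - start


def monitor_if_idle(
    sample_cpu: Callable[[], float],
    transaction: Callable[[], None],
    max_cpu: float = 0.10,  # assumed threshold: 10% CPU utilization
) -> Optional[float]:
    """Measure the transaction only when the host is idle or lightly loaded.

    `sample_cpu` is a hypothetical callable returning current CPU
    utilization as a fraction between 0.0 and 1.0. Returns None when the
    host is busy, so the agent does not interfere with ordinary use.
    """
    if sample_cpu() > max_cpu:
        return None
    return timed_transaction(transaction)
```

In such a sketch, the agent would call `monitor_if_idle` each time a monitoring task falls due and simply skip the measurement when `None` is returned.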
The system also includes a controller that communicates with the agent computing devices over the Internet or other network. The controller keeps track of the agent devices that are currently in an active state (e.g., connected to the Internet with the agent component running) by monitoring messages transmitted by the agent devices. The system also includes a user interface (preferably part of the controller) that provides functionality for users to set up sessions to monitor their servers from the active agents, which are typically geographically distributed. The interface may allow the user to select the agent devices from a real-time directory of active agents, and/or may allow the user to specify criteria for the automated selection of the agents.
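One possible way for a controller to track active agents from the messages they transmit is sketched below. It assumes, for illustration, that each agent periodically sends a heartbeat message and that an agent is treated as inactive once no message has been seen for a timeout period; the class names, field names, and the 90-second default are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AgentRecord:
    agent_id: str
    location: str
    last_seen: float  # time of the most recent message, in seconds


class ActiveAgentDirectory:
    """Directory of agents considered active, based on recent messages."""

    def __init__(self, timeout_seconds: float = 90.0) -> None:
        # Assumed policy: an agent is inactive after this much silence.
        self.timeout_seconds = timeout_seconds
        self._agents: Dict[str, AgentRecord] = {}

    def record_heartbeat(self, agent_id: str, location: str, now: float) -> None:
        """Note that a message was received from the agent at time `now`."""
        self._agents[agent_id] = AgentRecord(agent_id, location, now)

    def active_agents(
        self, now: float, location: Optional[str] = None
    ) -> List[AgentRecord]:
        """Return active agents, optionally filtered by location."""
        return [
            record
            for record in self._agents.values()
            if now - record.last_seen <= self.timeout_seconds
            and (location is None or record.location == location)
        ]
```

The optional `location` filter stands in for the selection criteria a user might specify; a real directory could filter on other attributes as well.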
Once a monitoring session has been set up, work requests are dispatched (preferably by the controller, or alternatively by an originating agent) to the selected agent devices. These work requests preferably specify the server system, transactions, and performance parameters to be monitored by the agents. Performance data collected by the agents during the course of the monitoring session is stored in a database and is made available for online viewing. The performance data may be used both to generate server-specific reports and to generate general Internet "weather maps."
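As a non-limiting sketch, a work request of the kind described above might carry the target server, the transaction to execute, and the performance parameters to measure. The field names and the representation of a transaction as a sequence of step strings are assumptions made for illustration; the specification does not prescribe a message format.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class WorkRequest:
    agent_id: str                       # agent device that should do the work
    server_url: str                     # server system to be monitored
    transaction_steps: Tuple[str, ...]  # hypothetical encoding of the transaction
    parameters: Tuple[str, ...]         # performance parameters to measure


def build_work_requests(
    agent_ids: Sequence[str],
    server_url: str,
    transaction_steps: Sequence[str],
    parameters: Sequence[str],
) -> List[WorkRequest]:
    """Create one work request per selected agent for a monitoring session."""
    return [
        WorkRequest(
            agent_id=agent_id,
            server_url=server_url,
            transaction_steps=tuple(transaction_steps),
            parameters=tuple(parameters),
        )
        for agent_id in agent_ids
    ]
```

Each resulting request could then be sent to its agent by the controller, or by an originating agent, over whatever transport the system uses.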
An important benefit of using shared community resources to host the agent software is that it reduces or eliminates the need for the service provider to set up and administer agent computers at various points of presence. Another benefit is that users are not limited to the fixed agent locations selected by the service provider, but rather can monitor their systems from the user locations of any other community members.
In accordance with one aspect of the invention, the controller preferably monitors the extent to which each user contributes processing resources to the community, and compensates each user accordingly. Preferably, the compensation is in the form of credit toward future use of the service. For example, once Company A has allowed other community members to execute one hundred Web transactions from Company A's computers, Company A may be permitted to execute one hundred transactions from the computing devices of other members. The use of such a reciprocal usage policy desirably encourages users to make their computing devices available to other community members when the devices are not in use.
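The reciprocal usage policy in the Company A example can be illustrated with a simple credit ledger. This is a minimal sketch assuming one credit per hosted transaction and one credit per transaction executed on another member's device; the class and method names are hypothetical, and an actual policy could weight transactions differently.

```python
from collections import defaultdict
from typing import DefaultDict


class CreditLedger:
    """Tracks credits earned by hosting and spent by remote monitoring."""

    def __init__(self) -> None:
        self._balance: DefaultDict[str, int] = defaultdict(int)

    def record_hosted(self, member: str, transactions: int = 1) -> None:
        """Credit a member for transactions run on that member's devices."""
        self._balance[member] += transactions

    def try_spend(self, member: str, transactions: int = 1) -> bool:
        """Debit a member for transactions run on other members' devices.

        Returns False, leaving the balance unchanged, if credit is short.
        """
        if self._balance[member] < transactions:
            return False
        self._balance[member] -= transactions
        return True

    def balance(self, member: str) -> int:
        return self._balance[member]
```

Under this sketch, a member that has hosted one hundred transactions holds one hundred credits and may execute up to one hundred transactions from other members' devices.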
In accordance with another aspect of the invention, the agents may implement an algorithm for measuring hop delay along the route between the agent device and a monitored server. When one agent detects a slow hop, the controller (or the agent) may automatically invoke other agents (preferably agents that frequently use the subject hop) to further test the hop. In this manner, the hop may be tested concurrently from multiple agent devices and locations to more accurately determine whether a router problem exists.
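The slow-hop handling described above might be sketched as follows. The specification does not state the hop delay algorithm at this point, so the sketch assumes a traceroute-style measurement in which each router along the route has a cumulative round-trip time, and takes the delay of a hop as the difference between successive cumulative times. The threshold, the usage-count table, and all names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def hop_delays(cumulative_rtts_ms: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Convert cumulative round-trip times into per-hop delays (assumed method)."""
    delays = []
    previous = 0.0
    for router, rtt in cumulative_rtts_ms:
        # Clamp at zero: a later router can answer faster than an earlier one.
        delays.append((router, max(rtt - previous, 0.0)))
        previous = rtt
    return delays


def slow_hops(cumulative_rtts_ms: List[Tuple[str, float]], threshold_ms: float) -> List[str]:
    """Return routers whose per-hop delay exceeds the threshold."""
    return [r for r, d in hop_delays(cumulative_rtts_ms) if d > threshold_ms]


def agents_for_hop(
    usage: Dict[str, Dict[str, int]],
    hop: str,
    exclude: Optional[str] = None,
    limit: int = 3,
) -> List[str]:
    """Pick other agents that most frequently use the hop.

    `usage` maps agent id to a table of router to observed use count.
    """
    candidates = [
        (counts[hop], agent_id)
        for agent_id, counts in usage.items()
        if agent_id != exclude and counts.get(hop, 0) > 0
    ]
    # Most frequent users first; ties broken by agent id for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [agent_id for _, agent_id in candidates[:limit]]
```

When one agent reports a slow hop, the controller could pass the router identifier to `agents_for_hop` and dispatch test requests to the returned agents, so that the hop is measured from several locations at about the same time.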