The present disclosure relates to computer networks. Specifically, the present disclosure relates to systems and methods for monitoring or testing computer networks.
Many users of computer-generated information or data store the information or data locally and also replicate the data at remote facilities. These remote facilities can be on multiple sites, perhaps even around the world, to ensure the data will be available in case one or more of the facilities fail. For example, a bank may store information about a person's savings account on a local computer storage device and may replicate the data on remote storage devices around the country or around the world. Thus, information regarding the savings account and access to the funds in the savings account remain available even if one or more of these storage devices were to fail for whatever reason.
In general, computer data is generated at a production site and can also be stored at the production site. The production site is one form of storage area network. The production site is linked over a wide area network, such as the Internet or a dedicated link, to one or more remote alternate sites. Replicated data is stored at the alternate sites. Each alternate site is another form of storage area network. Often, a storage area network is a hybrid that both generates and stores local data and replicates data from another storage area network. Many storage area networks can be linked over the wide area network. In the example above, one storage area network could be at a bank office. That storage area network is connected over a wide area network to remote locations that replicate the data. These locations can include other bank offices or a dedicated storage facility located hundreds of miles away.
The computer network is considered to be operating smoothly if certain service level criteria are met. The described computer networks include hundreds of components, both hardware and software, that may be scattered throughout the world. If one or more components fail and at least some of the service level criteria are not met, data stored on the network may be unavailable, performance may be affected, and other adverse symptoms can occur. Research has demonstrated that a user of the computer network, such as the bank, will take fifty-four minutes to report a critical failure to a network administrator. During this time, the computer network has not been operating properly and the benefits of storing information at multiple locations have been reduced or lost.
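As a purely illustrative sketch of what checking service level criteria might look like, the following Python fragment compares a set of measurements against thresholds and reports which criteria are not met. The disclosure does not specify any particular criteria, so the criterion names (`round_trip_ms`, `replication_lag_s`, `availability_pct`), the threshold values, and the helper `unmet_criteria` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One hypothetical service level criterion with a threshold."""
    name: str
    limit: float
    higher_is_better: bool = False  # e.g. availability; default is "lower is better"

    def is_met(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.limit
        return measured <= self.limit


def unmet_criteria(criteria, measurements):
    """Return the names of criteria that are not met.

    A criterion with no measurement is treated as unmet (an assumption),
    since a missing reading may itself indicate a failed component.
    """
    unmet = []
    for criterion in criteria:
        value = measurements.get(criterion.name)
        if value is None or not criterion.is_met(value):
            unmet.append(criterion.name)
    return unmet


# Assumed example thresholds and readings; none of these come from the disclosure.
criteria = [
    Criterion("round_trip_ms", 200.0),
    Criterion("replication_lag_s", 30.0),
    Criterion("availability_pct", 99.9, higher_is_better=True),
]
measurements = {
    "round_trip_ms": 120.0,
    "replication_lag_s": 45.0,
    "availability_pct": 99.95,
}

print(unmet_criteria(criteria, measurements))  # prints ['replication_lag_s']
```

In this sketch the network would be regarded as operating smoothly only when the returned list is empty; an automated check of this kind is one way the reporting delay described above could be avoided.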
A number of solutions are available to prevent certain types of local problems before they arise. Other solutions require a technician to visit the user's site to test the computer network. Because of the associated costs and logistics, such tests are performed only at lengthy intervals. These solutions suffer from the disadvantage of testing the network either incompletely or too infrequently. Further, if problems do arise in the network, the user is generally required to alert the network administrator. In this scenario, valuable network time is lost before the network administrator is even ready to respond to the user, let alone address the problem.
Additionally, solutions are available that test the components of the network. These solutions test selected components in a point-by-point method. Such a method does not provide a complete and accurate picture of the network. For example, the point-by-point method tests only part of the network at a time, and this part may be shared in whole or in part by other traffic. Additionally, this test taxes the processing power of the network component, which is not necessarily related to the component's ability to handle data. Traffic is choked while the network component is tested because its processor is running a test rather than being stressed with data.