Modern businesses cannot function efficiently without state-of-the-art technology. In particular, computers and software are an almost essential part of most businesses in developed economies. Computers are becoming increasingly interconnected through networking technologies. The Internet is currently the most visible manifestation of this trend; smaller-scale networks, however, are also becoming more widespread. With the proliferation and widespread use of computer applications, intranets, the Internet, and corporate networks, network performance has become an important factor in determining the overall performance of the computing systems that use these networks, and it thus affects the businesses that rely on those computing systems.
Networking is one of the most complicated areas to configure, manage, and diagnose, which is often reflected in poor end-user satisfaction numbers, slow network performance, and the like. Networks generally have a stack architecture in which components function at different layers. Each component relies on the components at lower layers and, in turn, provides functionality to the components at higher layers. Failure of a component at a given layer affects all the components above it in the stack. Conversely, overuse of a component can affect the components below it in the stack. While many operating system components in the networking stack perform their own diagnosis, each deals only with errors in its own component. Users of a network are therefore left with the task of understanding the interrelationships among all the networking components affecting the performance of a given layer, which often proves difficult and frustrating.
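The upward propagation of failures described above can be illustrated with a minimal sketch. The layer names, the `LAYERS` list, and both function names are hypothetical and chosen only for illustration; they are not taken from any particular operating system or diagnostic tool.

```python
# Hypothetical layer names, ordered from the bottom of the stack to the top.
LAYERS = ["physical", "link", "network", "transport", "application"]


def affected_by_failure(failed_layer, layers=LAYERS):
    """Return the failed layer together with every layer stacked above it."""
    index = layers.index(failed_layer)
    return layers[index:]


def first_failing_layer(health, layers=LAYERS):
    """Walk the stack bottom-up and return the lowest unhealthy layer.

    `health` maps a layer name to True (healthy) or False (failed);
    layers missing from the mapping are assumed healthy.
    Returns None when no layer reports a failure.
    """
    for layer in layers:
        if not health.get(layer, True):
            return layer
    return None
```

Under these assumptions, a failure reported at both the link layer and the application layer would be attributed first to the link layer, since the application layer depends on everything beneath it. A per-component diagnosis that looks at only one layer would miss this relationship.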
Work to date in network management and troubleshooting has generally concentrated on effectively managing a single network. It has also typically been assumed that the network management software and the managed devices are all owned by the same administration. In today's complex networks, however, there is a need for coordinated network management across multiple administrative domains. For example, while a majority of network routing problems can be identified as software and configuration errors, a significant number are classified as “somebody else's problem,” in which each party involved points to another party as the root cause. Such problems are generally quite difficult to solve and require a fair amount of coordination among the various networks.
A second important problem with current network diagnostic systems is communicating the problem and its solutions to the various parties concerned. The true cause of a problem is often distant from its effect. For example, failure to access a web page may result from a problem located anywhere between a user's browser and a remote server. In such cases it is important that, once the cause of the problem has been identified, information about the problem can be conveyed to the affected and interested third parties along with any troubleshooting information, such as the estimated time of repair. Moreover, such troubleshooting and the communication about it need to work even when there is at least some problem in the communication network.