1. Field of the Invention
The present invention relates to methods and apparatus for identifying chronic performance problems on data networks.
2. Description of the Related Art
Packetized data networks are in widespread use, transporting mission-critical data throughout the world. A typical data transmission system includes a plurality of customer (user) sites and a data packet switching network, which resides between the sites to facilitate communication among the sites via paths through the network.
Packetized data networks typically format data into packets for transmission from one site to another. In particular, the data is partitioned into separate packets at a transmission site, wherein the packets usually include headers containing information relating to packet data and routing. The packets are transmitted to a destination site in accordance with any of several conventional data transmission protocols known in the art (e.g., Asynchronous Transfer Mode (ATM), Frame Relay, High Level Data Link Control (HDLC), X.25, IP, Ethernet, etc.), by which the transmitted data is restored from the packets received at the destination site.
The performance of these networks can be effectively measured using performance metrics such as packet latency, jitter (delay variation), and throughput. Chronic network performance problems, that is, problems that persist rather than pass quickly, significantly degrade the performance of the network. Chronic latency, jitter, and throughput problems dramatically impact the transfer of important information and consequently impact business productivity and effectiveness.
Chronic excessive latency is a significant, sustained increase in the time required for data to traverse the network and is one of the major underlying causes of user dissatisfaction with network performance and service levels. Periods of chronic excessive latency may last for hours or days. Users typically experience chronic excessive latency as substantial increases in application response time and/or failed transactions.
Chronic excessive jitter may render voice or video streaming unintelligible. Excessive jitter causes an unacceptable number of packets to be excluded from a reconstructed real-time output signal, resulting in perceptible distortions in an audio or video output signal. Users typically experience chronic excessive jitter as a substantial disruption in their ability to understand the streaming media (e.g., voice or video) that they are receiving.
Chronic throughput problems, i.e., persistently reduced throughput, may prevent critical backup or disaster recovery functionality. Users typically experience chronic throughput problems as a substantial increase in the time required to access a remote resource.
Most packetized data networks exhibit predictable, stable behavior in terms of latency, jitter, and throughput. For example, on a domestic network, a request and reply may typically require 70 ms to traverse the network. This average latency may remain fairly stable over extended periods. Nevertheless, during these periods, networks can experience transient increases in traffic latency. For example, occasionally some data packets on the same network may require 200 ms or more to traverse the network. These transient increases in traffic latency are very different from chronic problems, because they typically do not affect perceived service levels. Transient occurrences of excessive latency, excessive jitter, or reduced throughput are a normal part of network operations and do not warrant dedicated corrective action.
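The distinction drawn above between transient and chronic latency can be illustrated with a minimal sketch. It is offered only to make the distinction concrete and is not a description of any existing tool or of the invention. The 70 ms baseline comes from the example above; the function name classify_latency, the threshold factor of 2.0, and the minimum run length of 5 consecutive samples are hypothetical values assumed purely for illustration.

```python
def classify_latency(samples_ms, baseline_ms=70.0, factor=2.0, min_run=5):
    """Label each latency sample 'normal', 'transient', or 'chronic'.

    Assumptions (illustrative only): a sample is excessive when it exceeds
    baseline_ms * factor, and a run of excessive samples counts as chronic
    only when it spans at least min_run consecutive samples.
    """
    threshold = baseline_ms * factor
    labels = ["normal"] * len(samples_ms)
    i = 0
    while i < len(samples_ms):
        if samples_ms[i] <= threshold:
            i += 1
            continue
        # Find the end of this run of consecutive excessive samples.
        j = i
        while j < len(samples_ms) and samples_ms[j] > threshold:
            j += 1
        kind = "chronic" if (j - i) >= min_run else "transient"
        for k in range(i, j):
            labels[k] = kind
        i = j
    return labels


# One isolated 210 ms spike, then six consecutive excessive samples.
samples = [70, 72, 210, 69, 71, 200, 220, 230, 215, 240, 205, 70]
labels = classify_latency(samples)
```

In this sketch the isolated 210 ms sample is labeled transient, while the later run of six consecutive excessive samples is labeled chronic. A fixed threshold and run length of this kind are a simplification; choosing them poorly would either flag ordinary spikes or delay recognition of a persistent problem.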
However, chronic performance problems and their consequential impact on business productivity and effectiveness do require explicit corrective action. This is particularly important, because the cost of resolving these problems can be a significant proportion of the support and management costs for network service providers.
Currently, there is no reliable mechanism for effectively recognizing or predicting the onset of periods of chronic performance problems. Service providers cannot distinguish between transient and persistent problems. The requirement to avoid expensive and unnecessary responses to transient problems means that the required corrective action for persistent problems is delayed, resulting in reduced business productivity and effectiveness.
Current network management tools allow the detection of excessive latency, excessive jitter, decreased throughput, and the like. Network management tools also provide for the generation of alerts informing network operators of certain problems. These techniques range in sophistication from simple ping echo tests to network monitoring devices performing accurate and precise testing and reporting, as described, for example, in U.S. Pat. Nos. 5,521,907, 6,058,102, and 6,147,998, the disclosures of which are incorporated herein by reference in their entireties. These patents describe how remote monitoring devices accurately and reliably measure performance metrics including point-to-point network latency, round-trip delay, throughput, and data delivery ratio. However, current network management tools cannot distinguish between chronic performance problems and transient performance problems. Nor can these tools predict the onset of chronic performance problems. This deficiency degrades the ability of network managers to maintain high-availability networks with minimal problems.
From a service provider's standpoint, it would be desirable to detect the onset of a persistent problem that requires attention before the client complains about poor performance. Accordingly, there remains a need for the capability to identify chronic performance problems on data networks in an automated manner, while accurately distinguishing such chronic performance problems from transient problems.