This invention relates to the field of network analysis, and in particular to a system and method for analyzing and assessing the effects of parallel delays within an application.
Proper management of a network generally requires assuring that the network is performing satisfactorily for the users of the network, modifying the network to address performance issues or problems, and planning for future improvements to the network as demand increases and as newer technologies and alternatives become available.
A variety of tools have been developed, and continue to be developed, to facilitate the management of communication networks, and in particular for managing networks that provide communications among computer devices. Many of these tools are configured to model the network's performance under a variety of traffic conditions, both real and hypothesized, and in many cases, base this performance on data collected from the actual network.
One of the primary parameters for analyzing or assessing the performance of a network is the time it takes for messages to reliably reach their destination. This time is dependent upon a variety of factors. The message is typically partitioned into transmission elements, herein termed packets for convenience. Each packet must enter the network, and incurs a delay as it gains network access. When it enters the network, it incurs a delay that is dependent upon the bandwidth available at each link along its path to its destination. It may also incur queuing delays as it passes through intermediate nodes, particularly at congested links. Upon arrival at the receiving node, a delay may also be incurred as the proper receipt of the message is verified. Some of these factors are constant, while others vary over time, typically dependent on network loading.
The effectiveness of a network analysis system is based on a number of factors, one of which is the system's ability to distinguish the variety of causes of message delay, and another is the system's ability to assess the effect of potential network modifications on each of these classes of delay. A variety of tools have been developed to distinguish the causes of message delay, including, for example, the techniques disclosed in copending U.S. patent application Ser. No. 11/776,736, “NETWORK CONGESTION DELAY ANALYSIS”, filed 12 Jul. 2007 for Steve Niemczyk, Patrick J. Malloy, Alain J. Cohen, and Russel Mark Elsner, and incorporated by reference herein. In this copending application, the various components of message delays are classified as bandwidth delay, propagation delay, protocol delay, congestion delay, and processing delay. By knowing the cause of the delays that a message incurs, potential solutions to reduce these delays can be determined. For example, if a significant portion of the delay is attributed to congestion delay, the node that is causing the ‘bottleneck’ can be identified, and the routing of messages may be modified to provide a more balanced distribution of traffic, with a corresponding reduction in the amount of traffic through the bottleneck node. In like manner, if a significant portion of the delay is attributed to bandwidth delay, additional channels between the nodes that are causing the bandwidth delay can be provided. In like manner, knowing the delay characteristics of a network provides opportunities for the developers of applications to optimize the applications by avoiding bottleneck paths, avoiding bursty traffic on bandwidth limited paths, and so on.
The use of conventional delay analysis and assessment techniques to identify potential performance improvements, however, has significant limitations. Generally, the performance factors are not independent, such that an improvement in one delay factor is not necessarily reflected in the resultant delay. Conventional delay analysis techniques generally allocate or classify delays to the components in the ‘critical path’ of the message delay; that is, each component delay is determined by its direct effect on the overall message delay. Often, a reduction in one delay component merely reveals that another delay component, which was not previously on the critical path, is another major cause of the overall message delay. For example, if a particular communications link exhibits a significant bandwidth delay, the delays caused by slow processing may be masked, particularly if a slow processor is providing data only slightly faster than the bandwidth-limited link can forward the data. Curing the bandwidth bottleneck will not necessarily have a corresponding effect on the overall delay, because the data continues to be presented slowly, albeit into a wider bandwidth channel.
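The masking effect described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which the delivery time of a block of data is set by the slower of the data source and the link; the function name, the rates, and the data size are hypothetical values chosen for illustration and are not taken from any measured network.

```python
def transfer_time(data_mb, source_rate, link_rate):
    """Seconds to deliver data_mb megabytes (simplified, assumed model).

    The slower of the source (processor) and the link sets the pace,
    so the effective rate is the minimum of the two rates (MB/s).
    """
    return data_mb / min(source_rate, link_rate)


# Hypothetical figures: the processor supplies data only slightly
# faster (11 MB/s) than the bandwidth-limited link forwards it (10 MB/s).
before = transfer_time(100.0, source_rate=11.0, link_rate=10.0)

# Widening the link tenfold moves the bottleneck to the processor,
# so the overall time improves far less than the link upgrade suggests.
after = transfer_time(100.0, source_rate=11.0, link_rate=100.0)
```

Under these assumed figures the delivery time falls only from 10 seconds to roughly 9.1 seconds, because the slow processor, previously hidden behind the link, now sets the pace.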
This lack of independence among delay components is particularly problematic for applications that employ parallelism. Consider, for example, an application that includes two tasks, one task that incurs a substantial processing delay, and another that incurs a substantial bandwidth delay. If these tasks are performed sequentially, the overall delay will correspond to the sum of these delays, and a reduction in either will be reflected in the overall delay. If these tasks are performed in parallel, however, the overall delay will correspond to the longer of the two delays, and a reduction in one of the delays will not necessarily affect the overall delay. Conventional delay analysis techniques that report only the delays that are on the critical path, and thus have a direct effect on the overall delay, provide little guidance as to the effect that a reduction of any delay component will have on the overall delay of an application that employs parallelism.
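The two-task example above can be expressed as a short sketch. The function name and the delay values are hypothetical, and the model assumes the two tasks are the only contributors to the overall delay.

```python
def two_task_delay(processing_delay, bandwidth_delay, parallel):
    """Overall delay of two tasks: the sum if sequential, the longer if parallel."""
    if parallel:
        return max(processing_delay, bandwidth_delay)
    return processing_delay + bandwidth_delay


# Hypothetical delays, in seconds.
processing, bandwidth = 4.0, 3.0

sequential_before = two_task_delay(processing, bandwidth, parallel=False)
parallel_before = two_task_delay(processing, bandwidth, parallel=True)

# Reduce the bandwidth delay by 2 seconds and compare the effect.
sequential_after = two_task_delay(processing, bandwidth - 2.0, parallel=False)
parallel_after = two_task_delay(processing, bandwidth - 2.0, parallel=True)
```

With these assumed values, the 2-second reduction in bandwidth delay is fully reflected in the sequential case, while the parallel case is unchanged because the processing delay remains the longer of the two.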
It should be noted that most network applications are affected by multiple delays on parallel paths, even if the application is not purposely designed to use parallelism. An event at one node may trigger, for example, parallel events on another node. Even though the application on the first node may be purely sequential, its response from the second node will be dependent upon the delays occurring on the parallel paths.
For ease of reference, the term ‘component delay’ is used herein to refer to a delay in an application that can be eliminated by eliminating a single component, or type, of delay, and the term ‘parallel delay’ is used herein to refer to a delay in an application that can only be eliminated by eliminating two or more components of delay.
It would be advantageous to provide a method and system that identify parallel delays. It would also be advantageous to provide a method and system that facilitate the analysis of parallel delays. It would also be advantageous to provide a method and system that facilitate the identification of improvements that can be achieved by reducing one or more delay components within a network or within an application.
These advantages, and others, can be realized by a method and system that facilitate the analysis and assessment of application delays, including parallel delays. A trace file of an application's network events is processed to categorize the causes of delays incurred in the propagation and processing of these events. The system identifies the amount of delay (‘component delay’) that can be eliminated by eliminating each of the components of delay individually, as well as the amount of delay (‘parallel delay’) that can be eliminated by eliminating combinations of the delay components. A user interface displays the amount of reduction that can be achieved by eliminating each component delay individually and the amount of reduction that can be achieved by eliminating combinations of the individual component delays. To facilitate the analysis and assessment of these parallel delays, the interface allows the user to ‘drill down’ to view the individual delay components contained in each combination forming the parallel delays. In this manner, the user is provided a view of each of the delay components that would need to be addressed, either individually or in combination, to improve the overall application delay.
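The distinction between the reduction achievable by eliminating one delay component and the reduction achievable by eliminating a combination can be illustrated with a minimal sketch. It is not the disclosed method: it assumes a simplified model, introduced here only for illustration, in which the application consists of parallel paths, each path's delay is the sum of its categorized delays, and the overall delay is the longest path. The names `overall_delay`, `reductions`, and `paths`, and all delay values, are hypothetical.

```python
from itertools import combinations


def overall_delay(paths, eliminated=frozenset()):
    """Longest path delay after zeroing every component in `eliminated`."""
    return max(
        sum(delay for component, delay in path.items()
            if component not in eliminated)
        for path in paths
    )


def reductions(paths):
    """Map each combination of components to the overall delay it removes."""
    components = sorted({c for path in paths for c in path})
    baseline = overall_delay(paths)
    result = {}
    for size in range(1, len(components) + 1):
        for combo in combinations(components, size):
            result[combo] = baseline - overall_delay(paths, frozenset(combo))
    return result


# Two hypothetical parallel paths, delays in seconds.
paths = [
    {"processing": 5.0, "propagation": 1.0},
    {"bandwidth": 4.0, "propagation": 1.0},
]
table = reductions(paths)
```

In this assumed example, eliminating processing delay alone recovers 1 second and eliminating bandwidth delay alone recovers nothing, yet eliminating both together recovers 5 seconds of the 6-second total. The single-component entries of `table` correspond to what the text calls component delay, and the larger reductions that appear only for combinations indicate parallel delay.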
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.