This invention relates to the field of network analysis, and in particular to a method and system for tracking message flow through nodes that disassociate the network addresses from some or all of the message packets, and/or transform the packet content.
The desire for high-speed communications often exceeds the capabilities of the physical communication channels on a network using conventional communication techniques. To satisfy the users' desire for higher speed, network or service providers often augment their channels with devices that optimize the throughput of the channels, without requiring changes to the manner in which the users' applications communicate.
In a number of network environments, one or more elements are configured to enhance the performance of the network by initiating actions that bypass or otherwise avoid the strict sequences associated with typical communication transactions. For example, some communication protocols call for an acknowledgement of receipt of a prior transmission by the destination node before sending a subsequent transmission, and some intermediate devices, commonly termed ‘proxy’ devices, are configured to avoid this requirement by ‘spoofing’ the transmitting node with an acknowledgement long before the destination node provides the actual acknowledgement. In another example, an intermediate device may be configured to ‘pre-fetch’ data on behalf of a requesting node, in anticipation of a request for that data by the requesting node.
WAN optimization devices, commonly termed WAN accelerators, have been developed to further enhance these delay-avoidance techniques by operating in tandem using specific protocols that are designed for such tandem devices. FIG. 1A illustrates a typical network configuration using a pair of WAN accelerators 20. Each network node 10 on either side of the network is substantially unaware of the presence of the accelerators 20, and each accelerator 20 is configured to operate as a proxy node for the network nodes 10, appearing as conventional destination network nodes 10. That is, the WAN accelerator 20 communicates with the node 10 using a conventional communication protocol common to nodes 10, and communicates with the other WAN accelerator 20 using a communication protocol that is designed to optimize communications between the accelerators 20. For example, each WAN accelerator 20 will generally include a large amount of storage for caching data that has been sent to the nodes 10, and each WAN accelerator 20 keeps track of the data that is stored at the other WAN accelerator 20. When a source node 10 subsequently initiates a transmission of some or all of a prior transmission to a destination node 10, the WAN accelerator 20 at the source side of the network merely sends a command to the WAN accelerator 20 on the destination side of the network to initiate a transmission of this previously stored data from its cache to the destination node, thereby avoiding an actual transmission of the data across the network, in a manner that is transparent to both the source and destination nodes. Other techniques for accelerating traffic flow are common in the art of WAN acceleration.
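The caching behavior described above can be illustrated with a minimal sketch. It is not the protocol of any actual accelerator: the `Accelerator` class, the `("DATA", ...)` and `("REF", ...)` message tuples, and the use of a SHA-256 digest to identify cached data are all assumptions made for illustration.

```python
import hashlib


class Accelerator:
    """Hypothetical WAN accelerator that tracks what its peer has cached."""

    def __init__(self, name):
        self.name = name
        self.cache = {}        # digest -> payload stored locally
        self.peer_has = set()  # digests believed to be cached at the peer

    def encode(self, payload: bytes):
        """Prepare a payload for the accelerator-to-accelerator link."""
        digest = hashlib.sha256(payload).hexdigest()
        self.cache[digest] = payload
        if digest in self.peer_has:
            # The peer already holds this data: send only a short command.
            return ("REF", digest)
        self.peer_has.add(digest)
        return ("DATA", payload)

    def decode(self, message):
        """Recreate the original payload for delivery to the end node."""
        kind, body = message
        if kind == "DATA":
            digest = hashlib.sha256(body).hexdigest()
            self.cache[digest] = body
            self.peer_has.add(digest)  # the sender necessarily holds it too
            return body
        return self.cache[body]  # "REF": serve the payload from the local cache


x, y = Accelerator("X"), Accelerator("Y")
first = x.encode(b"quarterly report")   # full data crosses the link
second = x.encode(b"quarterly report")  # only a reference crosses the link
delivered = [y.decode(first), y.decode(second)]
```

In this sketch the second transmission crosses the link as a short reference, yet the destination side still delivers the full payload, which is why the end nodes cannot tell that the data was never retransmitted.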
Because WAN accelerators and similar optimizing devices are designed to be transparent to the end nodes, their relationship to these end nodes is also generally transparent to network analysis devices. In the example of FIG. 1A, a packet sent from node A to node B is conventionally encoded, identifying A as the source and B as the destination. When this conventionally encoded packet is received at accelerator X, the packet addressing information and content are encoded by accelerator X into an optimized form and transmitted as a message having X as the source and Y as the destination. When this optimized message is received at Y, the original message from A to B is recreated and transmitted as a conventionally encoded message with A as the source and B as the destination. Trace devices placed on the network will see A-to-B messages and X-to-Y messages, with no indication that the X-to-Y messages correspond to the transmission of the A-to-B messages across the X-to-Y link.
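The following sketch illustrates what trace devices might record for the FIG. 1A example. The `Record` structure, the tap names, and the byte sizes are hypothetical; the point is only that grouping records by source and destination address yields two unrelated conversations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    tap: str   # hypothetical location of the trace device
    src: str   # source address seen in the captured message
    dst: str   # destination address seen in the captured message
    size: int  # captured message size in bytes


def trace_path(original_size, optimized_size):
    """Records captured as one A-to-B message crosses accelerators X and Y."""
    return [
        Record("A side", "A", "B", original_size),     # conventional encoding
        Record("X-Y link", "X", "Y", optimized_size),  # re-addressed, optimized
        Record("B side", "A", "B", original_size),     # recreated by Y
    ]


def group_by_address(records):
    """Group records the way an address-based analysis would."""
    groups = {}
    for record in records:
        groups.setdefault((record.src, record.dst), []).append(record)
    return groups


groups = group_by_address(trace_path(original_size=1500, optimized_size=200))
```

Here `groups` contains an `("A", "B")` conversation and an `("X", "Y")` conversation, and nothing in the captured fields ties one to the other.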
The lack of association of related messages significantly limits the effectiveness of network analysis systems. Network performance analysis systems need to be aware of which network traffic is associated with a given transaction, to understand the causalities within the transaction, and estimate how the processing of the transaction might be improved. For example, network analysis systems are often used to diagnose performance problems and to assess the performance of the network under varying conditions and configurations. If a user complains of degraded performance, a network analysis system will generally be used to collect the information from trace devices and track the path of the complaining user's messages to determine where the degradation is being introduced. When the link or device causing the problem is identified, and possible changes are considered for alleviating the problem, the network analysis system can be used to estimate the effect that each proposed change will have on curing or mitigating the reported problem.
In the example of FIG. 1A, because the relationship between messages A-to-B and X-to-Y is transparent to conventional trace devices and analysis systems, the causal relationship between the performance of the X-Y link and the performance of the A-to-B link is absent from the collected trace data, and conventional techniques for tracing, isolating, and diagnosing reported performance problems will be ineffective.
Even if the optimizing device does not modify the addresses, the modification of the conventional traffic flow, and/or the modification of the data content, will often introduce a lack of correspondence between traces in a network analysis system. For example, a conventional ‘spoofing’ technique is for an accelerator to place a message in its buffer, acknowledge receipt of the message on behalf of the recipient, and then send the message on to the recipient. When the recipient receives the message, the recipient will send an acknowledgement back to the original source, and the accelerator will intercept this acknowledgement rather than forward it, because an acknowledgement for that message was already sent to the original source. However, when the trace files at the source and destination are processed, the receipt of the acknowledgement at the source will appear to have occurred before the sending of the acknowledgement from the destination, and the conventional network analysis system will assume that the time bases of the source and destination need to be modified to avoid this apparently impossible receipt-before-sending.
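The receipt-before-sending anomaly can be shown with a small sketch. The timestamps are invented, and `naive_clock_shift` is a hypothetical stand-in for the kind of time-base correction a conventional analyzer might apply; it is not taken from any particular product.

```python
def naive_clock_shift(events):
    """Smallest shift (ms) to add to the receiver's clock so that no receipt
    precedes its paired send, as a conventional analyzer might compute."""
    shift = 0
    for sent_ms, received_ms in events:
        if received_ms < sent_ms:
            shift = max(shift, sent_ms - received_ms)
    return shift


# Hypothetical timeline, in milliseconds, on clocks that are actually in sync.
data_sent_by_source = 0
spoofed_ack_received_at_source = 2  # the accelerator acknowledges locally
real_ack_sent_by_destination = 80   # the destination acknowledges after WAN transit

# The analyzer pairs the acknowledgement seen at the source with the one
# sent by the destination, unaware that they are different messages.
paired = [(real_ack_sent_by_destination, spoofed_ack_received_at_source)]
shift = naive_clock_shift(paired)
```

With these assumed values the analyzer would shift the source's time base by 78 ms even though the clocks agree, corrupting every other delay measurement derived from the two traces.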
In like manner, conventional network analysis systems perform a process of ‘merging’ the trace files, to eliminate redundant records of the same message. Often, the size of a message is used as one of the criteria for determining whether two messages are identical. If the optimizing device changes the content, using compression techniques, for example, the determination that a record of the compressed message corresponds to another record of the uncompressed message may not be made, and both records will be included in subsequent statistical reports, traffic flow diagrams, and the like.
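A minimal sketch of a size-based merge shows how compression defeats the duplicate check. The record layout, the `merge_key` criteria, and the use of `zlib` as the compression technique are assumptions for illustration; actual merge criteria and optimizers vary.

```python
import zlib


def merge_key(record):
    """Hypothetical identity criteria: addresses plus message size."""
    return (record["src"], record["dst"], record["size"])


def merge(records):
    """Keep one record per key, treating matching keys as the same message."""
    merged = {}
    for record in records:
        merged.setdefault(merge_key(record), record)
    return list(merged.values())


payload = b"x" * 1000
compressed = zlib.compress(payload)

# The same message captured before and after an optimizer that compresses
# the content but leaves the addresses unchanged.
before = {"tap": "upstream", "src": "A", "dst": "B", "size": len(payload)}
after = {"tap": "downstream", "src": "A", "dst": "B", "size": len(compressed)}

unoptimized_view = merge([before, dict(before, tap="downstream")])
optimized_view = merge([before, after])
```

Without compression the two captures collapse into one record; with compression the sizes differ, so both records survive the merge and the message is counted twice.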
It would be advantageous to be able to recognize the correspondence among different forms of the same message. It would also be advantageous to use the correspondence among different forms of the same message to facilitate effective network analysis and diagnostics. It would also be advantageous to use the correspondence among different forms of the same message to determine the end-to-end characteristics and dependencies of messages sent between particular nodes. It would also be advantageous to recognize the correspondence among different forms of the same message with minimal user input requirements.
These advantages, and others, can be realized by a method and system that traces a path of messages communicated between nodes even when the messages may undergo transformation of their address and/or content, then includes the transforming devices in subsequent performance determinations and other system analysis tasks. A variety of techniques are presented for determining the path of the messages, depending upon the characteristics of the collected trace data. Upon determining the message path, the traces are synchronized in time, and correlations between the connections along the path are determined, including causal relationships. In a preferred embodiment, a user identifies an application process between or among particular nodes of a network, and the system provides a variety of formats for viewing statistics related to the performance of the application on the network.
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.