The ability to rapidly identify and fix transmission errors in network systems has become increasingly important as more businesses have come to rely on applications and data that span multiple networks. Transmission packets in a computer network can be dropped at multiple points in the network, resulting in applications that do not respond, that perform poorly because of packet retransmissions, or that behave unexpectedly. Dropped packets can be caused by failing router or communication controller hardware, as well as by software defects. Isolating dropped packets requires traces to be gathered at multiple points in a network. Once collected, these traces must be correlated and analyzed by multiple experts, as dictated by the various computing and network environments involved. This correlation and analysis process is manual and time-consuming, which can result in long and costly problem isolation scenarios.
For example, a mainframe application may not be responding because a partner application did not respond, with packets failing to reach their destination. To determine why and where packets were lost, traces would need to be collected on the mainframe, on transactions leaving the mainframe, at retransmission points, and at the target destination. Multiple experts would be required to analyze the packet traces and determine at which point packets were dropped. In another example, a mainframe application using a mainframe operating system (e.g., OS/390) may stop functioning while communicating with an engineering workstation (e.g., RS/6000) application. Here, the packet flow could go from the mainframe to a controller (e.g., 3174) and then over a token ring LAN (local area network) to reach the engineering workstation. The network protocol could be any one known in the art, including SNA (Systems Network Architecture) and IP (Internet Protocol). Experts would be required on the mainframe, on the controller, on the token ring LAN, and on the engineering workstation in order to fully analyze the problem. In addition, multiple recreations of the network failure would be required to diagnose the problem.
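To make the correlation step described above concrete, the following is a minimal illustrative sketch, not a method taken from this document. It assumes that each trace can be reduced to the set of packet identifiers observed at one capture point, and that the capture points are listed in the order packets traverse them. The names `PATH`, `locate_drops`, and the sample trace data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical capture points, ordered along the second example's path.
PATH: List[str] = ["mainframe", "controller", "token_ring_lan", "workstation"]


def locate_drops(
    path: List[str], traces: Dict[str, Set[int]]
) -> Dict[int, Tuple[str, str]]:
    """Map each dropped packet ID to (last point seen, first point missing).

    Assumes every packet of interest appears in the trace for path[0].
    """
    drops: Dict[int, Tuple[str, str]] = {}
    for packet_id in sorted(traces.get(path[0], set())):
        # Walk adjacent pairs of capture points until the packet disappears.
        for upstream, downstream in zip(path, path[1:]):
            if packet_id not in traces.get(downstream, set()):
                drops[packet_id] = (upstream, downstream)
                break
    return drops


# Made-up sample data: packet 3 vanishes after the controller,
# packet 2 vanishes after the token ring LAN.
traces = {
    "mainframe": {1, 2, 3, 4},
    "controller": {1, 2, 3, 4},
    "token_ring_lan": {1, 2, 4},
    "workstation": {1, 4},
}
drops = locate_drops(PATH, traces)
```

In practice, traces from different environments rarely share a common packet identifier or format, which is part of why the manual process described above requires an expert for each environment.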