With the advent of the Internet, modern computer applications not only operate on isolated systems but also communicate with each other on data communication networks. These network applications communicate by sending packets to and receiving packets from each other over the networks. Network applications capture, process, and analyze the packets on the network in order to communicate with each other.
Figure (“FIG.”) 1 is a diagram illustrating conventional communication between two network applications using the logical layers of a typical Transmission Control Protocol/Internet Protocol (TCP/IP) application. As shown in FIG. 1, two network applications, Peer A 102 and Peer B 104, communicate with each other using the layered TCP/IP protocol typically including 7 layers, some of which are omitted for clarity of explanation. Referring to FIG. 1, the layered TCP/IP protocol has an application layer (Layer 7) 106, 106′ for applications such as Hypertext Transfer Protocol (HTTP) and email exchanging logical application transactions, a transport layer (Layer 4) 108, 108′ for TCP or User Datagram Protocol (UDP) exchanging logical TCP streams, a network layer (Layer 3) 110, 110′ for IP packets exchanging logical IP flows, and a data link layer (Layer 2) 112, 112′ for the Ethernet, ATM, POS (“Packet-Over-Sonet”), etc. exchanging data packets. The final layer (not shown) of the TCP/IP protocol is the access method onto the actual wire transmitting the data. In general, the transport layer (Layer 4) may be implemented as a Transmission Control Protocol (TCP), Wireless Application Protocol (WAP), Stream Control Transmission Protocol (SCTP), or any other transport protocol that ensures validity and integrity of the end-to-end data transmission.
When Peer A 102 initiates the communication to Peer B 104, the data to be transmitted is passed down 114 through the TCP/IP layers 106, 108, 110, 112 until it is actually transmitted onto the wire 115. The data is packaged with a different header at each protocol layer. The receiving end, Peer B 104, unpackages the received data, moving it back up the stack 116 through the layers 112′, 110′, 108′, 106′ to the receiving application.
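The per-layer packaging described above can be illustrated with a minimal Python sketch. The bracketed text tags stand in for real binary headers and are purely hypothetical, as are the function names; the sketch only shows that headers are added on the way down the stack and removed in reverse order on the way up.

```python
# Hypothetical text tags standing in for real (binary) protocol headers.
# Listed in the order the sender applies them: Layer 4, Layer 3, Layer 2.
LAYERS = [("transport", b"[TCP]"), ("network", b"[IP]"), ("data link", b"[ETH]")]

def encapsulate(payload: bytes) -> bytes:
    """Sender side: each layer prepends its own header on the way down."""
    frame = payload
    for _name, header in LAYERS:
        frame = header + frame
    return frame

def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip the headers in reverse order on the way up."""
    for _name, header in reversed(LAYERS):
        if not frame.startswith(header):
            raise ValueError(f"expected header {header!r}")
        frame = frame[len(header):]
    return frame

wire = encapsulate(b"GET / HTTP/1.1")   # what travels on the wire
received = decapsulate(wire)            # what the receiving application sees
```

The outermost header on the wire belongs to the lowest layer, which is why a capture system must parse Layer 2 first and work upward before it can see TCP streams or application transactions.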
For successful analysis of the communication between the network applications 102, 104, a specialized non-intrusive packet collection system could be deployed. Such a system should be able to capture, process, and analyze the packets received from other network applications in the correct order in which they were sent by the other applications. There are various commercial and open source applications performing packet analysis currently available for network applications. The success of these packet analysis applications depends upon their ability to non-intrusively capture individual data transmission packets and restore the logical IP flows and TCP/UDP streams. Examples of packet analysis applications utilizing non-intrusive packet analysis include SNORT (The Open Source Network Intrusion Detection System), ETHEREAL (The Open Source Network Analyzer), the Carnivore System (FBI Internet Surveillance System), and the like. However, none of these conventional packet analysis applications is effective in TCP reconstruction for non-intrusive capturing and analysis of packets on a high speed distributed network, such as a full-duplex 100 Mbps network, Gigabit network, or POS network using separate physical channels for transmitting in opposite directions, especially when the packets are captured at different geographical locations of a distributed network.
FIG. 2 is a diagram illustrating an implementation of a conventional packet capturing and analysis system 200 commonly used for tapping a half-duplex Ethernet link. The packet capturing and analysis system 200 includes a Network Interface Card (“NIC”) 208 with an internal memory 210, a main memory (MM) 212, a Central Processing Unit (“CPU”) 214, a storage module 216, and a timer 218. Peer A 202 and peer B 204 are network applications in computers communicating with each other on an Ethernet connection 206.
The packet capturing and analysis system 200 non-intrusively taps the Ethernet connection 206 between the two communication nodes, peer A 202 and peer B 204, by using devices such as a passive Ethernet hub, a passive Ethernet splitter, or a switch port mirroring device, to generate mirrored copies of the packets traveling on the Ethernet connection 206. The mirrored packets are captured by the NIC 208. The captured packets are stored 209 in the NIC internal memory 210 and passed 211 in bulk to the MM 212 using Direct Memory Access (DMA) techniques. Periodically (typically a few hundred times per second), the NIC 208 generates a hardware interrupt 213 to inform the CPU 214 that a new set of packets is ready for processing. Using the internal timer 218, the CPU 214 timestamps received packets and reconstructs Layer 3 (IP-to-IP), Layer 4 (TCP stream) and Layer 7 (Application) transactions using transaction reconstruction techniques well known to one skilled in the art. The results of the CPU-based analysis of the packets are stored in permanent storage 216 for future utilization. The functionalities of the NIC 208 and the sequential nature of the processes 209, 211, 213 of copying the packets ensure that the packets are presented to the CPU 214 in the same order in which the packets were presented to the Ethernet 206 by the communication peers 202, 204, regardless of the packet direction (from peer A to peer B or from peer B to peer A). This is very important for transaction reconstruction, because all conventional transaction reconstruction techniques operate based upon an assumption that the packet processing order correctly represents the inter-link behavior between peer A 202 and peer B 204.
In the reconstruction technique used by the system 200 of FIG. 2, the time stamping procedure is initiated by the NIC interrupt 213 and performed by the CPU 214. Due to the relatively low interrupt rate (a few hundred times a second) and the relatively high packet arrival rate (up to 150,000 packets per second for a half-duplex 100 Mb Ethernet link), a large number (a few thousand) of packets can be associated with one (non-precision) timestamp. This could be improved by using specialized NICs that employ an inter-NIC timer; however, the “buffering” effect cannot be eliminated completely due to the necessity of inter-NIC buffering and link delays. On the other hand, in the case of half-duplex Ethernet, the inaccuracy of the packet timestamping does not create a significant problem for transaction reconstruction because packets arrive at the CPU 214 at least in the same order as presented to the network 206 by applications 202 and 204.
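The coarse, interrupt-driven timestamping described above can be sketched in Python as follows. The packet spacing and interrupt period are toy values chosen only to keep the output small, and the function name is hypothetical. Every packet that arrives between two interrupts receives the same timestamp, yet the relative order on a single link is preserved.

```python
from collections import Counter

def stamp_at_interrupt(arrival_us: int, period_us: int) -> int:
    """Timestamp a packet with the first interrupt tick at or after its arrival."""
    return -(-arrival_us // period_us) * period_us  # integer ceiling division

# Toy numbers (assumptions): one packet every 100 us, one interrupt every 5000 us.
arrivals_us = list(range(100, 20_001, 100))            # 200 packets in wire order
stamps = [stamp_at_interrupt(t, 5_000) for t in arrivals_us]

packets_per_stamp = Counter(stamps)    # how many packets share each timestamp
order_preserved = stamps == sorted(stamps)
```

With these toy values each timestamp is shared by 50 packets, so the timestamp alone cannot order packets within a batch; on a half-duplex link this is tolerable only because the single capture path already delivers them in wire order.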
FIG. 3 is a diagram illustrating a typical implementation of a conventional packet capturing and analysis system 300 for a full-duplex Ethernet link. The packet capturing and analysis system 300 includes a NIC card A 308, a NIC card B 310, a main memory (MM) 312, a Central Processing Unit (CPU) 314, a storage module 316, and a timer 318. Peer A 302 and peer B 304 are network applications in computers communicating with each other on a high-speed network using separate physical channels 305, 306 for transmitting packets in opposite directions. High-speed networks, such as a full-duplex 100 Mbps network, Gigabit network, or Packet-Over-Sonet (POS) network, typically use separate physical channels for transmitting in opposite directions.
The packet capturing and analysis system 300 non-intrusively taps the individual unidirectional links 305, 306 by using fiber-optic splitters (not shown) and provides the captured packets into the NICs 308, 310. The NICs 308, 310 are direction-specific, i.e., NIC 308 is only responsible for handling packets in the unidirectional link 305 for communication of packets from Peer A 302 to Peer B 304, and NIC 310 is only responsible for handling packets in the unidirectional link 306 for communication of packets from Peer B 304 to Peer A 302. The captured packets are stored in the MM 312, are time-stamped in response to hardware interrupts from the NICs 308, 310 and stored in the storage 316 for processing by the transaction reconstruction method.
With this configuration, packets from opposite directions could be presented to the MM 312 (and consequently to the transaction reconstruction method) in an order different from the original order in which they were presented to the links 305, 306, due to internal buffering and unequal delays. Packet time-stamping provided by the timer 318 is not helpful as a criterion for packet reordering, because the timer 318 does not capture the original time of the transmission of the packet onto the links 305, 306, but captures the timing of the NIC (308, 310)-to-MM (312) transmission of the packets. The difference between these timings is small, typically in the range of 10-20 ms, but high-speed networks can transmit tens or hundreds of thousands of packets during this small interval. Some improvement can be achieved by synchronizing the NIC A 308 and the NIC B 310; however, this requires expensive, specialized hardware. Moreover, this approach still does not eliminate the packet-reordering problem altogether.
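A minimal Python sketch of this reordering effect follows. The `Packet` type, the per-direction delay values, and the function name are illustrative assumptions, not part of any described system. Each direction is given its own NIC-to-memory delay, and packets are ordered by the time they reach main memory, which is the only time the capture system can observe.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    label: str
    wire_time_us: int   # true time on the link (not visible to the capture system)
    direction: str      # "A->B" or "B->A", i.e. which NIC captured it

def observed_order(packets, delay_us):
    """Order in which packets reach main memory, given a delay per direction."""
    return sorted(packets, key=lambda p: p.wire_time_us + delay_us[p.direction])

# Wire order is A1, B1, A2.
packets = [
    Packet("A1", 0, "A->B"),
    Packet("B1", 100, "B->A"),
    Packet("A2", 200, "A->B"),
]
# Made-up delays: the A->B capture path buffers longer than the B->A path.
seen = observed_order(packets, {"A->B": 500, "B->A": 50})
labels = [p.label for p in seen]   # B1 now precedes A1
```

Order within each direction is still correct; only the interleaving between the two directions is lost, and an arrival-time timestamp simply records the wrong interleaving.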
FIG. 4 is a diagram illustrating an implementation of a conventional central packet analysis system 400 that operates in cooperation with two additional local packet capturing devices 402, 404 for a full-duplex Ethernet link in which the packets are captured at different locations. In a highly distributed network, it is sometimes impossible to tap opposite transmission links at a single geographical location. In such a case, a distributed packet analysis configuration such as that shown in FIG. 4 is used. In such a distributed configuration, packet capturing is carried out by packet capturing devices deployed as close as possible to the tapping point, such as the local packet capturing devices 402, 404, and the packet information from two or more capturing devices is delivered to the centralized packet analysis system 400.
Peer A 406 and peer B 408 are network applications in computers communicating with each other on a high-speed network using separate physical channels 410, 412 for transmitting packets in opposite directions. The central packet analysis system 400 includes a NIC card 414, a main memory (MM) 418, a CPU 422, a storage module 424, and a timer 420. The local packet capturing device 402 includes a NIC 426, a MM 430, a timer 432, and a CPU 434, and captures packets transmitted on the channel 410 from Peer A 406 to Peer B 408. The local packet capturing device 404 includes a NIC 436, a MM 440, a timer 442, and a CPU 444, and captures packets transmitted on the channel 412 from Peer B 408 to Peer A 406.
The local packet capturing device 402 captures packets transmitted on the channel 410 using NIC 426 and stores them temporarily in the internal memory 428. The packets are then stored in the MM 430 and time-stamped by the timer 432 in response to a hardware interrupt from the NIC 426 to the CPU 434. The local packet capturing device 404 captures the packets transmitted on the channel 412 using NIC 436, at a location distant from the location at which the local packet capturing device 402 captures packets, and stores them temporarily in the internal memory 438. The packets are stored in the MM 440 and time-stamped by the timer 442 in response to a hardware interrupt from the NIC 436 to the CPU 444. The packets are then transmitted from the local packet capturing devices 402, 404 to the central packet analysis system 400. The NIC 414 receives the packets from the local packet capturing devices 402, 404 and stores them in the internal memory 416. The packets are transferred to the MM 418, processed by the CPU 422, and the resulting reports are moved to the storage 424 for future processing.
Similarly to the packet capturing system 300 of FIG. 3, the centralized packet analysis system 400 often has packet-reordering problems caused by delays in packet transmission, jitter, or imperfect synchronization of the local packet capturing devices' timers 432, 442. Further, in a distributed configuration such as that shown in FIG. 4, the packet-reordering problem becomes more serious than that of the packet capturing system 300 in FIG. 3. This is because the timer inconsistency is on the order of 10-20 microseconds in a high-speed full-duplex network, which can lead to a time-stamping discrepancy at the central packet analysis system 400 on the order of a few seconds in the distributed capturing configuration. Considering that modern high-speed optical links can carry millions of packets per second, the out-of-order packet capturing problem becomes a major roadblock for packet analysis in distributed networks.
Timer mis-synchronization can be reduced by using well-known synchronization techniques, such as the Network Time Protocol (NTP), which synchronizes timers with sub-second precision, and an external GPS clock capable of achieving 50 nanosecond precision. However, these techniques are associated with substantial additional cost and specialized equipment, and still cannot completely eliminate the timer mis-synchronization problem. Furthermore, the GPS synchronization technique requires installation of an external GPS antenna. Such requirements make these techniques unacceptable for many deployments and network configurations. In addition, unpredictable delays associated with buffering and interrupt latency cause time-stamping mistakes, even when precision GPS timers are utilized.
An explanation of the practical problems associated with incorrect packet ordering will be provided below with reference to FIGS. 5 and 6.
FIG. 5 is an interaction diagram illustrating the typical packet sequence for an HTTP application communicating packets between a client 502 and a server 504. In FIG. 5, packets with labels starting with “A” are transmitted from the client 502 to the server 504 and packets with labels starting with “B” are transmitted from the server 504 to the client 502.
Referring to FIG. 5, in a typical HTTP application operating under the TCP/IP standard, the client sends a synchronization (SYN) packet (A1) to the server 504. The server 504 sends a synchronization-acknowledgment (SYN-ACK) packet (B1) back to the client 502. The client 502 sends an acknowledgement (ACK) packet (A2) and further sends a GET request (A3) to the server 504. In response, the server 504 sends responses (B2, B3, B4) to the GET request (A3) and a Finish (FIN) packet (B5) to the client 502. The client 502 sends Finish-Acknowledgement (FIN-ACK) (A4) to the server 504 and the server acknowledges receipt by sending ACK (B6) to the client 502.
FIG. 6 is an interaction diagram illustrating the packet sequence as seen by a CPU (not shown) and a transaction analysis application (not shown), when all the “client” side-originated 502 packets arrive at the CPU (not shown) after all the “server” side-originated 504 packets arrive. Since the client side 502 packets are time-stamped with a time later than the server side 504 packets, a packet reordering problem results. All the packets transmitted between the client 502 and the server 504 are identical to those described in FIG. 5, except that the order in which they are seen by the CPU (not shown) is different in that all the server side 504 packets (B1, B2, B3, B4, B5, B6 packets) appear prior to the client side 502 packets (A1, A2, A3, A4 packets). Under such conditions, the CPU (not shown) has no information that would enable it to decide whether the packet sequence represents a single HTTP transaction or is a result of two half-captured transactions.
Starting sequential processing of the packets from packet B1 and so on, the CPU will incorrectly decide that the client side (A packets) communication was not captured due to capturing errors, even though in fact the A packets have been captured at a later time with a later timestamp. This is because conventional transaction reconstruction methods consider packets in the order in which they are time-stamped and assume that missing packets in the time-stamped sequence of packets were lost during the packet capturing process. In other words, conventional transaction reconstruction methods are not capable of deferring the decision that a packet was lost. As such, conventional transaction reconstruction methods typically apply a “packet skip” procedure to handle the missing packets. As a result, the sequence of server-side packets (B1 through B6) will be analyzed as a partial HTTP response transaction and the client side packets (A1 through A4) will be analyzed as another independent client request, even though all the packets were in fact captured. When the client 502 and the server 504 communicate on a super-high speed and distributed optical network (such as an Internet backbone), these types of out-of-order and time synchronization problems become a significant factor in the correctness of the functionality and usefulness of the packet capturing and analysis system.
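The failure mode described above can be illustrated with a deliberately simplified Python sketch. It is not the algorithm of any particular analysis tool: the grouping rule, the `syn_assumed_lost` flag, and the function name are assumptions made for illustration. The sketch processes packets strictly in timestamp order and decides immediately that a missing SYN was lost, which is the “packet skip” behavior.

```python
def naive_reconstruct(packets):
    """Group (label, kind) pairs, already sorted by timestamp, into transactions.

    A SYN always opens a new transaction. If the very first packet seen is not
    a SYN, the SYN is assumed lost at once ("packet skip"), never deferred.
    """
    transactions = []
    for label, kind in packets:
        if kind == "SYN" or not transactions:
            transactions.append({"labels": [], "syn_assumed_lost": kind != "SYN"})
        transactions[-1]["labels"].append(label)
    return transactions

# Packet sequence of FIG. 5 in true wire order.
WIRE_ORDER = [
    ("A1", "SYN"), ("B1", "SYN-ACK"), ("A2", "ACK"), ("A3", "GET"),
    ("B2", "DATA"), ("B3", "DATA"), ("B4", "DATA"), ("B5", "FIN"),
    ("A4", "FIN-ACK"), ("B6", "ACK"),
]
# Sequence of FIG. 6: all server-side packets are time-stamped before the client's.
AS_SEEN = [p for p in WIRE_ORDER if p[0].startswith("B")] + \
          [p for p in WIRE_ORDER if p[0].startswith("A")]

correct = naive_reconstruct(WIRE_ORDER)   # one transaction with all ten packets
broken = naive_reconstruct(AS_SEEN)       # two half transactions
```

In the reordered case the sketch yields one transaction holding B1 through B6 with the SYN marked as lost, and a second holding A1 through A4, even though no packet was actually dropped. A method able to defer the loss decision would wait to see whether the missing packets turn up later before splitting the exchange.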
Therefore, in view of the above and many other shortcomings of the prior art, there is a need for a packet capturing and analysis system that can solve the out-of-order packet and time synchronization problems in a super-high speed distributed network environment. There is also a need for a packet capturing and analysis system that is capable of deferring the decision that a packet was lost until the system can be reasonably certain that the packet was indeed lost.