Network monitoring is commonly used to measure traffic data across links connected to a particular router within a packet-based network. The traffic data can be useful for analyzing protocols, traffic engineering, and network anomaly detection. An interface within the router operates at a link speed indicating how much information can traverse the interface in a specified timeframe. For example, an OC-48 link can transfer data at a rate of up to 2,488 megabits per second. Passive monitoring involves copying data packets off a link in a manner that does not substantially affect the performance of the link. A data packet contains information regarding its source, its destination, its protocol type, its size, and its payload. This information, along with the time when the data packet crossed the link, can be helpful in reconstructing flows of related packets with the same sources and destinations. The packet information captured during the monitoring activity is commonly referred to as a trace.
Passive monitoring involves tapping the link on which data needs to be collected and recording to disk either complete packets or partial packets, such as packet headers and timestamps indicating their arrival time. In the case of fiber-based networks, an optical splitter may split the optical signal, thereby effectively copying all of the data on the link. The copied signal may be received by a packet capture board in a personal computer (PC). Timestamps recorded by the capture board may be synchronized to a global positioning system (GPS) signal. Packets are temporarily stored on the capture board and then sent to the PC's main memory over the PC's PCI bus.
Collecting packet traces at higher than OC-48 link speeds can be difficult for several reasons:
PCI bus throughput is already challenged at OC-48. During passive monitoring, the PCI bus is crossed twice for any data transfer: once from the capture board to the main memory, and a second time from the main memory to the hard disk.
Collecting data at OC-48 results in possibly terabytes of trace information per day in a point of presence (POP). At OC-192, the storage capacity must increase by a factor of four, and the challenge of managing such an enormous data set increases greatly as well.
Memory access speeds have not increased as quickly as the link speed.
Disk array speed has not kept up with link bandwidth. At OC-192 speed, a packet-level trace would require a disk bandwidth of roughly 250 megabytes per second.
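The scale of these figures can be checked with a short calculation. The sketch below is illustrative only: the OC-48 rate and the 250 megabytes per second figure come from the text, while the function names, the assumption of a fully loaded link with full-packet capture, and the assumption that the disk sustains its rate for a whole day are introduced here for the example.

```python
# Back-of-the-envelope check of the link, bus, and storage figures.
OC48_MBPS = 2488             # OC-48 line rate in megabits per second (from the text)
OC192_MBPS = 4 * OC48_MBPS   # OC-192 carries four times the OC-48 rate

SECONDS_PER_DAY = 24 * 60 * 60


def line_rate_megabytes_per_second(mbps: float) -> float:
    """Convert a line rate in megabits per second to megabytes per second."""
    return mbps / 8


def pci_bus_load(capture_megabytes_per_second: float) -> float:
    """Bus traffic when every captured byte crosses the PCI bus twice:
    capture board -> main memory, then main memory -> hard disk."""
    return 2 * capture_megabytes_per_second


def terabytes_per_day(megabytes_per_second: float) -> float:
    """Volume written in one day at a sustained rate (decimal units)."""
    return megabytes_per_second * SECONDS_PER_DAY / 1_000_000


if __name__ == "__main__":
    oc48 = line_rate_megabytes_per_second(OC48_MBPS)
    oc192 = line_rate_megabytes_per_second(OC192_MBPS)
    print(f"OC-48 line rate:  {oc48:.0f} MB/s")
    print(f"OC-192 line rate: {oc192:.0f} MB/s")
    # Assumption: full packets captured on a fully loaded OC-48 link.
    print(f"PCI load at OC-48 (two crossings): {pci_bus_load(oc48):.0f} MB/s")
    # 250 MB/s is the disk bandwidth the text cites for OC-192.
    print(f"One day at 250 MB/s: {terabytes_per_day(250):.1f} TB")
```

Under these assumptions, a single monitored OC-192 link writing at the cited disk rate would fill tens of terabytes per day, which is consistent with the storage concern raised above.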
A passive monitoring infrastructure suitable for deployment on OC-192 links will benefit if it can perform some computation online so as to minimize the amount of data stored locally. But the computation must be simple: at OC-192 (10 Gbps), a new packet arrives every 240 ns on average (assuming 300-byte packets). This allows only 360 instructions per packet on the fastest processor currently available. Such a monitoring system may store only the minimum amount of information necessary, in order to simplify collection and storage. Sampling, such as copying every tenth packet rather than every packet, may be required in addition to compression.
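The per-packet time budget quoted above follows directly from the link rate and the average packet size. The sketch below reproduces that arithmetic and shows one simple form of 1-in-N sampling. The function names are hypothetical, and the rate of 1.5 instructions per nanosecond is not stated in the text; it is inferred here from the quoted figures (360 instructions in 240 ns).

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def packet_interarrival_ns(link_bps: float, packet_bytes: int) -> float:
    """Average time between packet arrivals on a fully loaded link, in ns."""
    return packet_bytes * 8 * 1e9 / link_bps


def instruction_budget(interarrival_ns: float, instructions_per_ns: float) -> float:
    """Instructions a processor can execute before the next packet arrives."""
    return interarrival_ns * instructions_per_ns


def sample_every_nth(packets: Iterable[T], n: int) -> Iterator[T]:
    """Keep the first packet and every n-th packet after it (1-in-n sampling)."""
    return islice(packets, 0, None, n)


if __name__ == "__main__":
    gap = packet_interarrival_ns(link_bps=10e9, packet_bytes=300)
    print(f"Inter-arrival time: {gap:.0f} ns")
    # Assumption: 1.5 instructions per ns, inferred from the quoted figures.
    print(f"Instruction budget: {instruction_budget(gap, 1.5):.0f}")
    # Stand-in packet identifiers; a real monitor would sample captured packets.
    print(list(sample_every_nth(range(30), 10)))
```

Sampling reduces the volume that must be stored, but the decision to keep or drop a packet still has to be made within the same per-packet budget.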
One way to meet these requirements is to store internet protocol (IP) packet data as flow traces instead of packet traces. A flow trace groups together packets that are from the same source and addressed to the same destination during a short time period. By collecting the related packets together, information that is common to all of the packets within a flow can be stored once for each flow, rather than with each packet. Since the common information can be removed from each data packet within the same flow, the resulting flow trace is compressed. With a compressed flow trace, less information is stored and processed, which reduces the resources required to collect data across higher speed links. Unfortunately, because the packets are no longer stored in chronological order, reconstructing the original arrival order of the packets from a flow-based trace requires sequentially reading the compressed flow trace file until the target packet is located.
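The grouping described above, and the cost of undoing it, can be illustrated with a small sketch. This is not the format of any particular flow trace; the `Packet` and `Flow` types, the one-second timeout, and the use of integer microsecond timestamps are all assumptions made for the example. Packets are assumed to arrive in chronological order.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Packet:
    timestamp_us: int  # arrival time in microseconds
    src: str
    dst: str
    size: int


@dataclass
class Flow:
    src: str           # stored once per flow rather than once per packet
    dst: str
    start_us: int
    # Per-packet residue: (offset from flow start in microseconds, size)
    packets: List[Tuple[int, int]] = field(default_factory=list)


def build_flow_trace(packets: Iterable[Packet], timeout_us: int = 1_000_000) -> List[Flow]:
    """Group chronologically ordered packets into flows keyed by (src, dst)."""
    flows: List[Flow] = []
    active: Dict[Tuple[str, str], Flow] = {}
    for pkt in packets:
        key = (pkt.src, pkt.dst)
        flow = active.get(key)
        if flow is not None:
            last_seen = flow.start_us + flow.packets[-1][0]
            if pkt.timestamp_us - last_seen > timeout_us:
                flow = None  # gap too long: start a new flow for this pair
        if flow is None:
            flow = Flow(pkt.src, pkt.dst, pkt.timestamp_us)
            active[key] = flow
            flows.append(flow)
        flow.packets.append((pkt.timestamp_us - flow.start_us, pkt.size))
    return flows


def arrival_order(flows: Iterable[Flow]) -> List[Packet]:
    """Rebuild the chronological packet list; every flow record must be read."""
    restored = [
        Packet(f.start_us + offset, f.src, f.dst, size)
        for f in flows
        for offset, size in f.packets
    ]
    return sorted(restored, key=lambda p: p.timestamp_us)


if __name__ == "__main__":
    trace = [
        Packet(0, "A", "B", 100),
        Packet(10, "C", "D", 200),
        Packet(20, "A", "B", 300),
        Packet(5_000_000, "A", "B", 50),
    ]
    flows = build_flow_trace(trace)
    print(len(flows), "flows")
    print(arrival_order(flows) == trace)
```

In the sample trace, the second and third packets are interleaved across two flows, so the flow records alone do not preserve arrival order. Restoring it means reading every flow and sorting by timestamp, which is the drawback noted above.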