1. Field of the Invention
The present invention is related to sampling and analyzing packets in a network.
2. Brief Discussion of Related Art
Packet sampling is commonly employed in networks to capture a portion of the network traffic for subsequent analysis. For example, packets addressed to a specific destination Internet protocol (IP) address may be sampled to determine how many packets were sent to that IP address in a given period.
One conventional sampling approach samples every Nth packet (e.g., every 100th packet in the network traffic) without regard to the information contained in the packet. The sampling rate can be configurable so that more or fewer packets are sampled. Such content-independent selection of packets can be used to manage the available hardware resources that are used for sampling, and later analyzing, the packets. However, this approach may not provide a sufficient number of packets of interest to perform an accurate analysis of the network traffic represented by the sampled packets. For example, a user may be interested in the content of the packets sent between a source IP address and a destination IP address, where the packets originate at the source IP address and terminate at the destination IP address, or in all packets associated with a flow. Since only every Nth packet is sampled, some packets between the source and destination IP addresses, or within a flow, may not be sampled, which can make the desired analysis impossible.
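By way of illustration only, the every-Nth-packet approach described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular device: the tuple-based packet representation, the function name sample_every_nth, and the example addresses are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical packet representation: (source IP, destination IP, payload).
Packet = Tuple[str, str, bytes]


def sample_every_nth(packets: Iterable[Packet], n: int) -> Iterator[Packet]:
    """Yield every nth packet, without inspecting packet contents."""
    if n < 1:
        raise ValueError("n must be at least 1")
    for count, packet in enumerate(packets, start=1):
        if count % n == 0:
            yield packet


if __name__ == "__main__":
    background = [("10.0.0.1", "10.0.0.2", b"bulk")] * 99
    of_interest = ("192.0.2.7", "198.51.100.9", b"probe")
    # Place the single packet of interest at position 50 of 100.
    traffic = background[:49] + [of_interest] + background[49:]
    sampled = list(sample_every_nth(traffic, 100))
    print(len(sampled))            # one packet sampled out of 100
    print(of_interest in sampled)  # False: the packet of interest was missed
```

In this sketch the sampler selects only the 100th packet, so the single packet of interest at position 50 is never seen by the analysis, which is the shortcoming noted above.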
Another conventional sampling approach samples either all or none of the packets based on each packet's flow key. In this approach, all packets that match a predetermined flow key are sampled, and packets that do not match the predetermined flow key are not sampled. As a result, an analysis can be performed for a given flow, but information from other flows is unavailable for other analyses. Moreover, if the same flow key is associated with many different flows, each having, for example, only a few malicious packets per day, this traffic will likely go undetected.
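The all-or-none behavior of flow-key sampling can likewise be sketched as follows. The sketch assumes, purely for illustration, that the flow key is a 5-tuple of source address, destination address, source port, destination port, and protocol number; the text above does not specify the key's fields, and the names FlowKey, Packet, and sample_matching_flow are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical flow key; a 5-tuple is assumed for illustration."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


class Packet(NamedTuple):
    key: FlowKey
    payload: bytes


def sample_matching_flow(packets: Iterable[Packet], target: FlowKey) -> List[Packet]:
    """Keep every packet whose flow key equals the target; drop all others."""
    return [p for p in packets if p.key == target]


if __name__ == "__main__":
    watched = FlowKey("10.0.0.1", "10.0.0.2", 40000, 443, 6)
    other = FlowKey("192.0.2.7", "198.51.100.9", 51000, 22, 6)
    traffic = [
        Packet(watched, b"a"),
        Packet(other, b"suspicious"),
        Packet(watched, b"b"),
    ]
    sampled = sample_matching_flow(traffic, watched)
    print(len(sampled))                          # both packets of the watched flow
    print(any(p.key == other for p in sampled))  # False: the other flow is invisible
```

Every packet of the watched flow is captured, while nothing from any other flow reaches the analysis, regardless of its content.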