The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, the approaches described in this section may not be prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
There are several definitions of the term “flow” being used by the Internet community. Within the context of the IETF's IP Flow Information Export (IPFIX) Working Group, a flow is defined as a set of IP packets passing an observation point in the network during a certain time interval. All packets belonging to a particular flow share a set of common properties. Each property is defined as the result of applying a function to the values of: (1) one or more packet header fields (e.g., destination IP address), transport header fields (e.g., destination port number), or application header fields (e.g., RTP header fields); (2) one or more characteristics of the packet itself (e.g., number of MPLS labels); or (3) one or more fields derived from packet treatment (e.g., next hop IP address or output interface). A packet belongs to a flow if the packet completely satisfies all the defined properties of the flow. This definition covers the range from a flow containing all packets observed at a network interface to a flow consisting of just a single packet between two applications. It includes packets selected by a sampling mechanism.
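The flow definition above can be illustrated with a minimal sketch. The `Packet` fields, the `belongs_to` helper, and the example addresses are hypothetical names chosen for illustration only; they are not taken from IPFIX or from any particular implementation. Each property is modeled as a function applied to the packet together with the value that function must yield.

```python
from typing import Callable, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical packet view; field names are illustrative assumptions."""
    dst_ip: str        # packet header field
    dst_port: int      # transport header field
    mpls_labels: int   # characteristic of the packet itself
    next_hop: str      # field derived from packet treatment


# A property pairs a function of the packet with the value it must produce.
Property = Tuple[Callable[[Packet], object], object]


def belongs_to(pkt: Packet, properties: List[Property]) -> bool:
    """A packet belongs to a flow only if it satisfies every property."""
    return all(fn(pkt) == expected for fn, expected in properties)


# A flow of packets destined for one server port.
to_server = [
    (lambda p: p.dst_ip, "192.0.2.10"),
    (lambda p: p.dst_port, 443),
]

# A flow with no properties matches every packet at the observation point.
all_packets: List[Property] = []
```

With an empty property list, `belongs_to` returns true for every packet, which corresponds to the widest flow the definition allows; adding properties narrows the flow.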
A variety of flow monitoring tools exist to monitor the flow of packets in networks. Flow monitoring tools provide valuable information that can be used in a variety of ways. For example, flow monitoring tools may be used to perform network traffic engineering and to provide network security services, e.g., to detect and address denial of service attacks. As another example, flow monitoring tools can be used to support usage-based network billing services.
Flow monitoring tools are conventionally implemented as flow monitoring processes executing on a network element, such as a router. The flow monitoring processes are configured to examine and classify packets passing through a particular observation point in a network. The flow monitoring processes are also configured to generate flow statistical data that indicates, for example, the number of packets in each flow, the number of bytes in each flow and the protocol of each flow.
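As a rough sketch of such a flow monitoring process, the following classifies packets by a five-field key and accumulates per-flow statistical data. The `Packet` fields, the choice of key, and the `collect_flow_stats` name are assumptions made for illustration; an actual network element may classify on different fields.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical packet view; field names are illustrative assumptions."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int
    length: int  # packet size in bytes


FlowKey = Tuple[str, str, int, int, int]


def collect_flow_stats(packets: Iterable[Packet]) -> Dict[FlowKey, dict]:
    """Classify each packet into a flow and update that flow's counters."""
    stats: Dict[FlowKey, dict] = {}
    for pkt in packets:
        key = (pkt.src_ip, pkt.dst_ip, pkt.protocol, pkt.src_port, pkt.dst_port)
        # Create the flow record the first time the flow is seen.
        record = stats.setdefault(
            key, {"packets": 0, "bytes": 0, "protocol": pkt.protocol}
        )
        record["packets"] += 1
        record["bytes"] += pkt.length
    return stats
```

Each distinct key costs one stored record and each packet costs one lookup and update, which is where the storage and processing consumption discussed below originates.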
One of the issues with flow monitoring tools is how to manage the consumption of resources attributable to generating and maintaining flow statistical data. Generating flow statistical data consumes processing resources, and storing flow statistical data consumes storage resources. In networks with high traffic volume, the amount of resources consumed can be considerable, which can adversely affect other processes executing on the network element. Furthermore, the amount of resources consumed can fluctuate dramatically as network traffic patterns change.
One solution to this problem has been to use sampling to collect flow statistical data for less than all of the packets that pass through an observation point. For example, a fixed percentage of packets is sampled, e.g., every nth packet, and the exported flow statistical data is later adjusted to account for the percentage of packets that was sampled. As another example, a fixed probability may be used to determine whether to sample each packet. One problem with these approaches is that they do not take into consideration the characteristics of traffic flow. Because of this, it is difficult to select a sampling percentage or probability that works well for both large and small flows. For example, a small sampling probability may work well for large flows but may not be effective for small flows, because a small flow may contain too few packets for any of them to be sampled.
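The limitation of fixed-rate sampling can be seen in a small sketch of every-nth-packet sampling followed by scaling. The function names and the synthetic traffic below are hypothetical and chosen only to illustrate the point; they do not describe any specific monitoring tool.

```python
from typing import Dict, Iterable


def sample_every_nth(flow_ids: Iterable[str], n: int) -> Dict[str, int]:
    """Count only every nth packet, keyed by the flow it belongs to."""
    counts: Dict[str, int] = {}
    for index, flow_id in enumerate(flow_ids, start=1):
        if index % n == 0:
            counts[flow_id] = counts.get(flow_id, 0) + 1
    return counts


def scale_up(counts: Dict[str, int], n: int) -> Dict[str, int]:
    """Adjust sampled counts to estimate the true per-flow packet counts."""
    return {flow_id: count * n for flow_id, count in counts.items()}


# Synthetic traffic: 1000 packets, three of which belong to a small flow.
small_positions = {5, 333, 777}
traffic = ["small" if i in small_positions else "large" for i in range(1, 1001)]

estimates = scale_up(sample_every_nth(traffic, n=100), n=100)
```

In this synthetic trace the large flow, which actually has 997 packets, is estimated at 1000 packets, while the three-packet small flow is never sampled and does not appear in the estimates at all.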
A conventional scheme to control storage consumption is to place a limit on the amount of memory used for storing flow statistical data. The limit is typically expressed as a percentage of available resources or as an absolute amount.
One problem with this scheme is that it can have significant unintended consequences for processing resource consumption. For example, when new flows arrive at an extremely high rate, the memory usage limit forces existing flow statistical records to be exported and removed at a very high rate in order to free up memory space for the new flows. Because export consumes processing resources, this causes processing consumption to surge, which is undesirable. Therefore, this scheme does not address the trade-off between memory and processing resource consumption.
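The interaction between a memory limit and export load can be sketched as follows. The `CappedFlowCache` class, its oldest-record eviction policy, and the use of an export counter as a stand-in for processing cost are all assumptions made for illustration; conventional implementations may select records for removal differently.

```python
from collections import OrderedDict


class CappedFlowCache:
    """Flow record store limited to a fixed number of records (hypothetical)."""

    def __init__(self, max_records: int) -> None:
        self.max_records = max_records
        self.records: "OrderedDict[str, list]" = OrderedDict()
        self.exports = 0  # stand-in for processing spent on export

    def observe(self, flow_id: str, length: int) -> None:
        if flow_id not in self.records:
            if len(self.records) >= self.max_records:
                # Limit reached: export and remove the oldest-created record.
                self.records.popitem(last=False)
                self.exports += 1
            self.records[flow_id] = [0, 0]  # [packets, bytes]
        record = self.records[flow_id]
        record[0] += 1
        record[1] += length


# Steady traffic: 10,000 packets spread over 50 long-lived flows.
steady = CappedFlowCache(max_records=100)
for i in range(10_000):
    steady.observe(f"flow-{i % 50}", 100)

# Burst of new flows: 10,000 packets, each starting a new flow.
burst = CappedFlowCache(max_records=100)
for i in range(10_000):
    burst.observe(f"flow-{i}", 100)
```

Both caches stay within the same memory limit, but the steady case performs no exports while the burst case performs 9,900, one for nearly every arriving packet. The memory limit alone therefore bounds storage at the cost of an unbounded export rate.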
Based on the foregoing, there is a need for an approach for managing the consumption of resources that does not suffer from limitations of prior approaches.