Network usage data is useful for many important business functions, such as subscriber billing, marketing and customer care, product development, network operations management, network and systems capacity planning, and security. Network usage data does not include the actual information exchanged in a communications session between parties, but rather includes numerous usage detail records, known as “flow records,” containing one or more types of metadata (i.e., “data about data”). Known network flow record protocols include Netflow®, sFlow®, jFlow®, cFlow® and Netstream®. As used herein, a flow record is defined as a small unit of measure of unidirectional network usage by a stream of IP packets that share common source and destination parameters during a time interval.
The types of metadata included within each flow record vary based on the type of service and network involved and, in some cases, based on the particular network device providing the flow records. In general, a flow record provides detailed usage information about a particular event or communications connection between parties, such as the connection start time and stop time, source (or originator) of the data being transported, the destination or receiver of the data, and the amount of data transferred. A flow record summarizes usage information for very short periods of time (from milliseconds to seconds, occasionally minutes). Depending on the type of service and network involved, a flow record may also include information about the transfer protocol, the type of data transferred, the type of service (ToS) provided, etc. In telephony networks, the flow records that make up the usage information are referred to as call detail records (CDRs).
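For illustration only, the metadata described above can be pictured as a small record type. The following is a minimal sketch, not the layout of any particular flow record protocol; the class name `FlowRecord` and all field names are assumptions introduced here, and the optional fields stand in for metadata that only some services or devices provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow record; field names are illustrative assumptions."""
    start_ms: int                   # connection start time, in milliseconds
    end_ms: int                     # connection stop time, in milliseconds
    src_addr: str                   # source (originator) of the data
    dst_addr: str                   # destination (receiver) of the data
    byte_count: int                 # amount of data transferred
    protocol: Optional[int] = None  # transfer protocol, if reported
    tos: Optional[int] = None       # type of service (ToS), if reported

    @property
    def duration_ms(self) -> int:
        """Length of the interval summarized by this record."""
        return self.end_ms - self.start_ms


# One unidirectional flow lasting a quarter of a second.
record = FlowRecord(
    start_ms=1_000,
    end_ms=1_250,
    src_addr="10.0.0.5",
    dst_addr="192.0.2.10",
    byte_count=4_800,
    protocol=6,  # 6 is the IANA-assigned protocol number for TCP
)
print(record.duration_ms)  # 250
```

Because a flow record is unidirectional, a two-way exchange between the same parties would be represented in this sketch by two such records with the source and destination swapped.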
In network monitoring, the network flow records are collected, stored and analyzed to produce meaningful results. Network usage analysis systems process these flow records and generate reports or summarized data files that support various business functions. Network usage analysis systems provide information about how network services are being used and by whom. Network usage analysis systems can also be used to identify (or predict) customer satisfaction-related issues, such as those caused by network congestion and network security abuse. In one example, network utilization and performance, as a function of subscriber usage behavior, may be monitored to track a user's experience, to forecast future network capacity, or to identify usage behavior indicative of network abuse, fraud and theft.
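As a rough illustration of how flow records might be summarized into a report of "how network services are being used and by whom," the sketch below totals transferred bytes per source. It assumes each record is reduced to a `(source, destination, bytes)` tuple; the function names `bytes_by_source` and `top_talkers` and the sample addresses are hypothetical and are not taken from any specific analysis system.

```python
from collections import defaultdict


def bytes_by_source(records):
    """Sum transferred bytes per source across (src, dst, nbytes) tuples."""
    totals = defaultdict(int)
    for src, _dst, nbytes in records:
        totals[src] += nbytes
    return dict(totals)


def top_talkers(records, n=3):
    """Return the n sources with the highest byte totals, largest first."""
    totals = bytes_by_source(records)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Sample records: (source, destination, bytes transferred).
records = [
    ("10.0.0.5", "192.0.2.10", 4800),
    ("10.0.0.7", "192.0.2.10", 1200),
    ("10.0.0.5", "198.51.100.3", 700),
]

print(bytes_by_source(records))   # {'10.0.0.5': 5500, '10.0.0.7': 1200}
print(top_talkers(records, n=1))  # [('10.0.0.5', 5500)]
```

A summary of this kind could feed billing or capacity planning, while an unusually large total for one source might be a starting point for investigating abuse.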
As networks become larger and as more tasks are performed within the networks, such as transferring conventional telephone communications to Voice over IP (VoIP), the network flow information on the data transactions can become voluminous and can quickly exceed storage and processing capacities.
In response to this problem of the large volume of collected network flow information, one known solution uses sampling techniques to decrease data flow volume. Different sampling methods can be used by the network device to collect the information. Sampling can be done at the packet level or the flow level, and can be random or deterministic. Depending on which sampling method is used, sampling affects CPU and memory utilization on the network device and/or the bandwidth used to export flow information to the collector. While sampling may reduce the overall volume of collected network flow information, the total amount of data is often still voluminous. Moreover, sampling techniques may not provide a proper picture of the network traffic because some data is ignored in the process. Furthermore, sampling does not address other problems within current network monitoring methodologies.
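The difference between deterministic and random packet-level sampling, and the way sampling can distort the picture of the traffic, can be sketched as follows. This is a simplified model under stated assumptions: packets are represented only by their sizes in bytes, the sampling rate is 1-in-`n`, and the function names are hypothetical rather than drawn from any network device's implementation.

```python
import random


def deterministic_sample(packets, n):
    """Keep every n-th packet, starting with the first (1-in-n sampling)."""
    return [p for i, p in enumerate(packets) if i % n == 0]


def random_sample(packets, n, seed=None):
    """Keep each packet independently with probability 1/n."""
    rng = random.Random(seed)
    return [p for p in packets if rng.randrange(n) == 0]


def estimate_total_bytes(sampled_sizes, n):
    """Scale the sampled byte count back up by the sampling rate."""
    return sum(sampled_sizes) * n


# 1000 packets of 100 bytes each, plus one large 9000-byte packet at index 7.
sizes = [100] * 1000
sizes[7] = 9000

kept = deterministic_sample(sizes, 10)
print(len(kept))                       # 100
print(estimate_total_bytes(kept, 10))  # 100000
print(sum(sizes))                      # 108900
```

In this example the deterministic sampler reduces the data tenfold, but because the large packet at index 7 is never sampled, the scaled estimate (100000 bytes) understates the true total (108900 bytes). A random sampler may or may not capture that packet on a given run, so its estimate varies instead.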
For example, another problem with current network monitoring methodologies is contention for storage resources when stored network flow information is accessed while additional network flow information is regularly being added. Typically, while network flow data is being accessed for analysis, new network flow information cannot be stored. Likewise, while new network flow information is being stored, the existing network flow data typically cannot be accessed.
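The contention described above can be modeled, in a deliberately simplified way, as a store guarded by a single exclusive lock. The sketch below is an assumption made for illustration and does not describe any particular product; the class `SingleLockFlowStore` and its methods are hypothetical.

```python
import threading


class SingleLockFlowStore:
    """Toy store in which analysis and storage share one exclusive lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = []

    def try_store(self, record):
        """Store a record unless the lock is held; report whether it succeeded."""
        if not self._lock.acquire(blocking=False):
            return False  # storage is busy, so the new record cannot be stored
        try:
            self._records.append(record)
            return True
        finally:
            self._lock.release()

    def begin_analysis(self):
        """Take the lock for an analysis pass and return a copy of the records."""
        self._lock.acquire()
        return list(self._records)

    def end_analysis(self):
        """Release the lock once the analysis pass is finished."""
        self._lock.release()


store = SingleLockFlowStore()
stored_first = store.try_store("flow-1")          # True: nothing else holds the lock
snapshot = store.begin_analysis()                 # analysis now holds the lock
stored_during = store.try_store("flow-2")         # False: blocked by the analysis
store.end_analysis()
stored_after = store.try_store("flow-2")          # True: the lock is free again
print(stored_first, stored_during, stored_after)  # True False True
```

While the analysis pass holds the lock, the attempt to store `"flow-2"` fails; in a real collector this would correspond to incoming flow information being delayed or dropped until the analysis completes.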